mm: replace vm_lock and detached flag with a reference countrw_semaphore is a sizable structure of 40 bytes and consumes considerablespace for each vm_area_struct. However vma_lock has two import
mm: replace vm_lock and detached flag with a reference countrw_semaphore is a sizable structure of 40 bytes and consumes considerablespace for each vm_area_struct. However vma_lock has two importantspecifics which can be used to replace rw_semaphore with a simplerstructure:1. Readers never wait. They try to take the vma_lock and fall back to mmap_lock if that fails.2. Only one writer at a time will ever try to write-lock a vma_lock because writers first take mmap_lock in write mode. Because of these requirements, full rw_semaphore functionality is not needed and we can replace rw_semaphore and the vma->detached flag with a refcount (vm_refcnt).When vma is in detached state, vm_refcnt is 0 and only a call tovma_mark_attached() can take it out of this state. Note that unlikebefore, now we enforce both vma_mark_attached() and vma_mark_detached() tobe done only after vma has been write-locked. vma_mark_attached() changesvm_refcnt to 1 to indicate that it has been attached to the vma tree. When a reader takes read lock, it increments vm_refcnt, unless the topusable bit of vm_refcnt (0x40000000) is set, indicating presence of awriter. When writer takes write lock, it sets the top usable bit toindicate its presence. If there are readers, writer will wait using newlyintroduced mm->vma_writer_wait. Since all writers take mmap_lock in writemode first, there can be only one writer at a time. The last reader torelease the lock will signal the writer to wake up. refcount mightoverflow if there are many competing readers, in which case read-lockingwill fail. Readers are expected to handle such failures.In summary:1. all readers increment the vm_refcnt;2. writer sets top usable (writer) bit of vm_refcnt;3. readers cannot increment the vm_refcnt if the writer bit is set;4. in the presence of readers, writer must wait for the vm_refcnt to dropto 1 (plus the VMA_LOCK_OFFSET writer bit), indicating an attached vmawith no readers;5. vm_refcnt overflow is handled by the readers.While this vm_lock replacement does not yet result in a smallervm_area_struct (it stays at 256 bytes due to cacheline alignment), itallows for further size optimization by structure member regrouping tobring the size of vm_area_struct below 192 bytes.[[email protected]: fix a crash due to vma_end_read() that should have been removed] Link: https://lkml.kernel.org/r/[email protected]Link: https://lkml.kernel.org/r/[email protected]Signed-off-by: Suren Baghdasaryan <[email protected]>Suggested-by: Peter Zijlstra <[email protected]>Suggested-by: Matthew Wilcox <[email protected]>Tested-by: Shivank Garg <[email protected]> Link: https://lkml.kernel.org/r/[email protected]Reviewed-by: Vlastimil Babka <[email protected]>Cc: Christian Brauner <[email protected]>Cc: David Hildenbrand <[email protected]>Cc: David Howells <[email protected]>Cc: Davidlohr Bueso <[email protected]>Cc: Hugh Dickins <[email protected]>Cc: Jann Horn <[email protected]>Cc: Johannes Weiner <[email protected]>Cc: Jonathan Corbet <[email protected]>Cc: Klara Modin <[email protected]>Cc: Liam R. Howlett <[email protected]>Cc: Lokesh Gidra <[email protected]>Cc: Lorenzo Stoakes <[email protected]>Cc: Mateusz Guzik <[email protected]>Cc: Mel Gorman <[email protected]>Cc: Michal Hocko <[email protected]>Cc: Minchan Kim <[email protected]>Cc: Oleg Nesterov <[email protected]>Cc: Pasha Tatashin <[email protected]>Cc: "Paul E . McKenney" <[email protected]>Cc: Peter Xu <[email protected]>Cc: Shakeel Butt <[email protected]>Cc: Sourav Panda <[email protected]>Cc: Wei Yang <[email protected]>Cc: Will Deacon <[email protected]>Cc: Heiko Carstens <[email protected]>Cc: Stephen Rothwell <[email protected]>Signed-off-by: Andrew Morton <[email protected]>
show more ...
tools: fix atomic_set() definition to set the value correctlyCurrently vma test is failing because of the new vma_assert_attached()assertion. The check is failing because previous refcount_set()
tools: fix atomic_set() definition to set the value correctlyCurrently vma test is failing because of the new vma_assert_attached()assertion. The check is failing because previous refcount_set() insidevma_mark_attached() is a NoOp. Fix the definition of atomic_set() tocorrectly set the value of the atomic.Link: https://lkml.kernel.org/r/[email protected]Fixes: 9325b8b5a1cb ("tools: add skeleton code for userland testing of VMA logic")Signed-off-by: Suren Baghdasaryan <[email protected]>Reviewed-by: Lorenzo Stoakes <[email protected]>Cc: Jann Horn <[email protected]>Cc: Liam R. Howlett <[email protected]>Cc: Vlastimil Babka <[email protected]>Cc: <[email protected]>Signed-off-by: Andrew Morton <[email protected]>
tools: add skeleton code for userland testing of VMA logicEstablish a new userland VMA unit testing implementation undertools/testing which utilises existing logic providing maple tree supportin
tools: add skeleton code for userland testing of VMA logicEstablish a new userland VMA unit testing implementation undertools/testing which utilises existing logic providing maple tree supportin userland utilising the now-shared code previously exclusive to radixtree testing.This provides fundamental VMA operations whose API is defined in mm/vma.h,while stubbing out superfluous functionality.This exists as a proof-of-concept, with the test implementation functionaland sufficient to allow userland compilation of vma.c, but containing onlycursory tests to demonstrate basic functionality.Link: https://lkml.kernel.org/r/533ffa2eec771cbe6b387dd049a7f128a53eb616.1722251717.git.lorenzo.stoakes@oracle.comSigned-off-by: Lorenzo Stoakes <[email protected]>Tested-by: SeongJae Park <[email protected]>Acked-by: Vlastimil Babka <[email protected]>Reviewed-by: Liam R. Howlett <[email protected]>Cc: Alexander Viro <[email protected]>Cc: Brendan Higgins <[email protected]>Cc: Christian Brauner <[email protected]>Cc: David Gow <[email protected]>Cc: Eric W. Biederman <[email protected]>Cc: Jan Kara <[email protected]>Cc: Kees Cook <[email protected]>Cc: Matthew Wilcox (Oracle) <[email protected]>Cc: Rae Moar <[email protected]>Cc: Shuah Khan <[email protected]>Cc: Suren Baghdasaryan <[email protected]>Cc: Pengfei Xu <[email protected]>Signed-off-by: Andrew Morton <[email protected]>