History log of /linux-6.15/mm/migrate_device.c (Results 1 – 25 of 47)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5
# 82ba975e 28-Feb-2025 Alistair Popple <[email protected]>

mm: allow compound zone device pages

Zone device pages are used to represent various types of device memory
managed by device drivers. Currently compound zone device pages are not
supported. This is because MEMORY_DEVICE_FS_DAX pages are the only user
of higher order zone device pages and have their own page reference
counting.

A future change will unify FS DAX reference counting with normal page
reference counting rules and remove the special FS DAX reference counting.
Supporting that requires compound zone device pages.

Supporting compound zone device pages requires compound_head() to
distinguish between head and tail pages whilst still preserving the
special struct page fields that are specific to zone device pages.

A tail page is distinguished by having bit zero being set in
page->compound_head, with the remaining bits pointing to the head page.
For zone device pages page->compound_head is shared with page->pgmap.

The page->pgmap field must be common to all pages within a folio, even if
the folio spans memory sections. Therefore pgmap is the same for both
head and tail pages; it can be moved into the folio, and the standard
scheme can be used to find compound_head from a tail page.

Link: https://lkml.kernel.org/r/67055d772e6102accf85161d0b57b0b3944292bf.1740713401.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <[email protected]>
Signed-off-by: Balbir Singh <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Tested-by: Alison Schofield <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Asahi Lina <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chunyan Zhang <[email protected]>
Cc: "Darrick J. Wong" <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: linmiaohe <[email protected]>
Cc: Logan Gunthorpe <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michael "Camp Drill Sergeant" Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Ted Ts'o <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: WANG Xuerui <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
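
To make the tail-page encoding above concrete, here is a minimal userspace C
sketch (a hedged illustration only; struct fake_page and fake_compound_head()
are hypothetical names, not kernel code): bit zero of compound_head marks a
tail page, and clearing it recovers the head pointer.

#include <stdio.h>

struct fake_page {
	unsigned long compound_head;	/* bit 0 set => tail; remaining bits => head pointer */
};

static struct fake_page *fake_compound_head(struct fake_page *p)
{
	unsigned long head = p->compound_head;

	if (head & 1)			/* tail page: strip the marker bit */
		return (struct fake_page *)(head - 1);
	return p;			/* head page (or non-compound) */
}

int main(void)
{
	static struct fake_page pages[4];	/* pages[0] plays the head page */
	int i;

	for (i = 1; i < 4; i++)
		pages[i].compound_head = (unsigned long)&pages[0] | 1;

	printf("head of tail 2: %p (expected %p)\n",
	       (void *)fake_compound_head(&pages[2]), (void *)&pages[0]);
	return 0;
}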


# 1afaeb82 06-Mar-2025 Matthew Brost <[email protected]>

mm/migrate: Trylock device page in do_swap_page

Avoid multiple CPU page faults to the same device page racing by trying
to lock the page in do_swap_page before taking an extra reference to the
page. This prevents scenarios where multiple CPU page faults each take
an extra reference to a device page, which could abort migration in
folio_migrate_mapping. With the device page being locked in
do_swap_page, the migrate_vma_* functions need to be updated to avoid
locking the fault_page argument.

Prior to this change, a livelock scenario could occur in Xe's (Intel GPU
DRM driver) SVM implementation if enough threads faulted the same device
page.

v3:
- Put page after unlocking page (Alistair)
- Warn on splitting a THP which is the fault page (Alistair)
- Warn on dst page == fault page (Alistair)
v6:
- Add more verbose comment around THP (Alistair)
v7:
- Fix migrate_device_finalize alignment (Checkpatch)

Cc: Alistair Popple <[email protected]>
Cc: Philip Yang <[email protected]>
Cc: Felix Kuehling <[email protected]>
Cc: Christian König <[email protected]>
Cc: Andrew Morton <[email protected]>
Suggested-by: Simona Vetter <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Tested-by: Alistair Popple <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
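
A minimal userspace sketch of the serialization idea in the commit above,
using a pthread trylock as a stand-in for the page lock (an illustration
under that assumption, not the kernel's fault path): a racer that fails the
trylock backs off instead of taking an extra reference, so only one thread
at a time drives the migration.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static pthread_mutex_t page_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int extra_refs;
static atomic_int backoffs;

static void *fault_handler(void *arg)
{
	(void)arg;

	/* Trylock first; only the winner takes the extra reference and
	 * drives the migration, the others back off and refault later. */
	if (pthread_mutex_trylock(&page_lock) != 0) {
		atomic_fetch_add(&backoffs, 1);
		return NULL;
	}

	atomic_fetch_add(&extra_refs, 1);
	/* ... migration work would happen here ... */
	atomic_fetch_sub(&extra_refs, 1);
	pthread_mutex_unlock(&page_lock);
	return NULL;
}

int main(void)
{
	pthread_t t[8];
	int i;

	for (i = 0; i < 8; i++)
		pthread_create(&t[i], NULL, fault_handler, NULL);
	for (i = 0; i < 8; i++)
		pthread_join(t[i], NULL);

	printf("threads that backed off instead of pinning the page: %d\n",
	       atomic_load(&backoffs));
	return 0;
}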


# a14fa8ec 06-Mar-2025 Matthew Brost <[email protected]>

mm/migrate: Add migrate_device_pfns

Add migrate_device_pfns which prepares an array of pre-populated device
pages for migration. This is needed to evict a known set of
non-contiguous device pages to CPU pages, which is a common case for SVM
in DRM drivers using TTM.

v2:
- s/migrate_device_vma_range/migrate_device_prepopulated_range
- Drop extra mmu invalidation (Vetter)
v3:
- s/migrate_device_prepopulated_range/migrate_device_pfns (Alistair)
- Use helper to lock device pages (Alistair)
- Update commit message with why this is required (Alistair)

Cc: Andrew Morton <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Reviewed-by: Gwan-gyeong Mun <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
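
As a hedged illustration of what "an array of pre-populated device pages"
might look like from the caller's side, here is a userspace sketch; the
FAKE_PFN_* encoding and fake_migrate_pfn() are made-up stand-ins for the
kernel's MIGRATE_PFN_* scheme, and the pfn values are arbitrary.

#include <stdio.h>

#define FAKE_PFN_VALID		(1UL << 0)
#define FAKE_PFN_MIGRATE	(1UL << 1)
#define FAKE_PFN_SHIFT		2

/* Encode a pfn plus "please migrate" state into one array slot. */
static unsigned long fake_migrate_pfn(unsigned long pfn)
{
	return (pfn << FAKE_PFN_SHIFT) | FAKE_PFN_VALID | FAKE_PFN_MIGRATE;
}

int main(void)
{
	/* An arbitrary, non-contiguous set of device pfns chosen for eviction. */
	unsigned long device_pfns[] = { 0x1000, 0x1042, 0x2f00 };
	unsigned long src[3];
	int i;

	for (i = 0; i < 3; i++)
		src[i] = fake_migrate_pfn(device_pfns[i]);

	/* src[] is what would then be handed to the preparation helper. */
	for (i = 0; i < 3; i++)
		printf("src[%d] = 0x%lx\n", i, src[i]);
	return 0;
}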


Revision tags: v6.14-rc4, v6.14-rc3
# 41cddf83 10-Feb-2025 David Hildenbrand <[email protected]>

mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()

If migration succeeded, we called
folio_migrate_flags()->mem_cgroup_migrate() to migrate the memcg from the
old to the new folio. This will set memcg_data of the old folio to 0.

Similarly, if migration failed, memcg_data of the dst folio is left unset.

If we call folio_putback_lru() on such folios (memcg_data == 0), we will
add the folio to be freed to the LRU, making memcg code unhappy. Running
the hmm selftests:

# ./hmm-tests
...
# RUN hmm.hmm_device_private.migrate ...
[ 102.078007][T14893] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x7ff27d200 pfn:0x13cc00
[ 102.079974][T14893] anon flags: 0x17ff00000020018(uptodate|dirty|swapbacked|node=0|zone=2|lastcpupid=0x7ff)
[ 102.082037][T14893] raw: 017ff00000020018 dead000000000100 dead000000000122 ffff8881353896c9
[ 102.083687][T14893] raw: 00000007ff27d200 0000000000000000 00000001ffffffff 0000000000000000
[ 102.085331][T14893] page dumped because: VM_WARN_ON_ONCE_FOLIO(!memcg && !mem_cgroup_disabled())
[ 102.087230][T14893] ------------[ cut here ]------------
[ 102.088279][T14893] WARNING: CPU: 0 PID: 14893 at ./include/linux/memcontrol.h:726 folio_lruvec_lock_irqsave+0x10e/0x170
[ 102.090478][T14893] Modules linked in:
[ 102.091244][T14893] CPU: 0 UID: 0 PID: 14893 Comm: hmm-tests Not tainted 6.13.0-09623-g6c216bc522fd #151
[ 102.093089][T14893] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
[ 102.094848][T14893] RIP: 0010:folio_lruvec_lock_irqsave+0x10e/0x170
[ 102.096104][T14893] Code: ...
[ 102.099908][T14893] RSP: 0018:ffffc900236c37b0 EFLAGS: 00010293
[ 102.101152][T14893] RAX: 0000000000000000 RBX: ffffea0004f30000 RCX: ffffffff8183f426
[ 102.102684][T14893] RDX: ffff8881063cb880 RSI: ffffffff81b8117f RDI: ffff8881063cb880
[ 102.104227][T14893] RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
[ 102.105757][T14893] R10: 0000000000000001 R11: 0000000000000002 R12: ffffc900236c37d8
[ 102.107296][T14893] R13: ffff888277a2bcb0 R14: 000000000000001f R15: 0000000000000000
[ 102.108830][T14893] FS: 00007ff27dbdd740(0000) GS:ffff888277a00000(0000) knlGS:0000000000000000
[ 102.110643][T14893] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 102.111924][T14893] CR2: 00007ff27d400000 CR3: 000000010866e000 CR4: 0000000000750ef0
[ 102.113478][T14893] PKRU: 55555554
[ 102.114172][T14893] Call Trace:
[ 102.114805][T14893] <TASK>
[ 102.115397][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170
[ 102.116547][T14893] ? __warn.cold+0x110/0x210
[ 102.117461][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170
[ 102.118667][T14893] ? report_bug+0x1b9/0x320
[ 102.119571][T14893] ? handle_bug+0x54/0x90
[ 102.120494][T14893] ? exc_invalid_op+0x17/0x50
[ 102.121433][T14893] ? asm_exc_invalid_op+0x1a/0x20
[ 102.122435][T14893] ? __wake_up_klogd.part.0+0x76/0xd0
[ 102.123506][T14893] ? dump_page+0x4f/0x60
[ 102.124352][T14893] ? folio_lruvec_lock_irqsave+0x10e/0x170
[ 102.125500][T14893] folio_batch_move_lru+0xd4/0x200
[ 102.126577][T14893] ? __pfx_lru_add+0x10/0x10
[ 102.127505][T14893] __folio_batch_add_and_move+0x391/0x720
[ 102.128633][T14893] ? __pfx_lru_add+0x10/0x10
[ 102.129550][T14893] folio_putback_lru+0x16/0x80
[ 102.130564][T14893] migrate_device_finalize+0x9b/0x530
[ 102.131640][T14893] dmirror_migrate_to_device.constprop.0+0x7c5/0xad0
[ 102.133047][T14893] dmirror_fops_unlocked_ioctl+0x89b/0xc80

Likely, nothing else goes wrong: putting the last folio reference will
remove the folio from the LRU again. So besides memcg complaining, adding
the folio to be freed to the LRU is just an unnecessary step.

The new flow resembles what we have in migrate_folio_move(): add the dst
to the lru, remove migration ptes, unlock and unref dst.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8763cb45ab96 ("mm/migrate: new memory migration helper for use with device memory")
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6
# b1f20206 30-Aug-2024 Yu Zhao <[email protected]>

mm: remap unused subpages to shared zeropage when splitting isolated thp

Patch series "mm: split underused THPs", v5.

The current upstream default policy for THP is always. However, Meta uses
madvise in production as the current THP=always policy vastly
overprovisions THPs in sparsely accessed memory areas, resulting in
excessive memory pressure and premature OOM killing. Using madvise +
relying on khugepaged has certain drawbacks over THP=always. Using
madvise hints means THPs aren't "transparent" and requires userspace
changes. Waiting for khugepaged to scan memory and collapse pages into
THP can be slow and unpredictable in terms of performance (i.e. you don't
know when the collapse will happen), while production environments require
predictable performance. If there is enough memory available, it's better
for both performance and predictability to have a THP from fault time,
i.e. THP=always rather than wait for khugepaged to collapse it, and deal
with sparsely populated THPs when the system is running out of memory.

This patch series is an attempt to mitigate the issue of running out of
memory when THP is always enabled. During runtime whenever a THP is being
faulted in or collapsed by khugepaged, the THP is added to a list.
Whenever memory reclaim happens, the kernel runs the deferred_split
shrinker which goes through the list and checks if the THP was underused,
i.e. how many of the base 4K pages of the entire THP were zero-filled.
If this number goes above a certain threshold, the shrinker will attempt
to split that THP. Then at remap time, the pages that were zero-filled
are mapped to the shared zeropage, hence saving memory. This method
avoids the downside of wasting memory in areas where THP is sparsely
filled when THP is always enabled, while still providing the upsides of
THPs, like reduced TLB misses, without having to use madvise.

Meta production workloads that were CPU bound (>99% CPU utilization) were
tested with THP shrinker. The results after 2 hours are as follows:

                        | THP=madvise | THP=always    | THP=always
                        |             |               | + shrinker series
                        |             |               | + max_ptes_none=409
-------------------------------------------------------------------------
Performance improvement |      -      | +1.8%         | +1.7%
(over THP=madvise)      |             |               |
-------------------------------------------------------------------------
Memory usage            | 54.6G       | 58.8G (+7.7%) | 55.9G (+2.4%)
-------------------------------------------------------------------------
max_ptes_none=409 means that any THP that has more than 409 out of 512
(80%) zero-filled pages will be split.

To test out the patches, the below commands without the shrinker will
invoke OOM killer immediately and kill stress, but will not fail with the
shrinker:

echo 450 > /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
mkdir /sys/fs/cgroup/test
echo $$ > /sys/fs/cgroup/test/cgroup.procs
echo 20M > /sys/fs/cgroup/test/memory.max
echo 0 > /sys/fs/cgroup/test/memory.swap.max
# allocate twice memory.max for each stress worker and touch 40/512 of
# each THP, i.e. vm-stride 50K.
# With the shrinker, max_ptes_none of 470 and below won't invoke OOM
# killer.
# Without the shrinker, OOM killer is invoked immediately irrespective
# of max_ptes_none value and kills stress.
stress --vm 1 --vm-bytes 40M --vm-stride 50K


This patch (of 5):

Here being unused means containing only zeros and inaccessible to
userspace. When splitting an isolated thp under reclaim or migration, the
unused subpages can be mapped to the shared zeropage, hence saving memory.
This is particularly helpful when the internal fragmentation of a thp is
high, i.e. it has many untouched subpages.

This is also a prerequisite for THP low utilization shrinker which will be
introduced in later patches, where underutilized THPs are split, and the
zero-filled pages are freed saving memory.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Yu Zhao <[email protected]>
Signed-off-by: Usama Arif <[email protected]>
Tested-by: Shuang Zhai <[email protected]>
Cc: Alexander Zhu <[email protected]>
Cc: Barry Song <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Domenico Cerasuolo <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kairui Song <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nico Pache <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Shuang Zhai <[email protected]>
Cc: Hugh Dickins <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
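
A minimal userspace sketch of the underused-THP check described above (an
illustration only; SUBPAGE_SIZE, NR_SUBPAGES and the threshold constant
mirror the commit's 409/512 example rather than any kernel definition):
count the 4K subpages of a 2M region that are entirely zero and compare the
count against the threshold.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SUBPAGE_SIZE	4096
#define NR_SUBPAGES	512		/* a 2M THP is 512 x 4K subpages */
#define THRESHOLD	409		/* ~80%, as in the commit message */

static int subpage_is_zero(const unsigned char *p)
{
	size_t i;

	for (i = 0; i < SUBPAGE_SIZE; i++)
		if (p[i])
			return 0;
	return 1;
}

int main(void)
{
	unsigned char *thp = calloc(NR_SUBPAGES, SUBPAGE_SIZE);
	int zero_filled = 0;
	int i;

	if (!thp)
		return 1;

	/* Touch only 40 of the 512 subpages, like the stress example above. */
	for (i = 0; i < 40; i++)
		memset(thp + (size_t)(i * 13) * SUBPAGE_SIZE, 0xaa, 64);

	for (i = 0; i < NR_SUBPAGES; i++)
		if (subpage_is_zero(thp + (size_t)i * SUBPAGE_SIZE))
			zero_filled++;

	printf("%d/%d subpages zero-filled -> %s\n", zero_filled, NR_SUBPAGES,
	       zero_filled > THRESHOLD ?
	       "split, remap zero subpages to the shared zeropage" : "keep the THP");
	free(thp);
	return 0;
}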


# 775d28fd 26-Aug-2024 Kefeng Wang <[email protected]>

mm: remove isolate_lru_page()

There are no more callers of isolate_lru_page(), remove it.

[[email protected]: convert page to folio in comment and document, per Matthew]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 58bf8c2b 26-Aug-2024 Kefeng Wang <[email protected]>

mm: migrate_device: use more folio in migrate_device_finalize()

Saves a couple of calls to compound_head() and removes the last two callers of
putback_lru_page().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 39e618d9 26-Aug-2024 Kefeng Wang <[email protected]>

mm: migrate_device: use more folio in migrate_device_unmap()

The page for migrate_device_unmap() already has a reference, so it is safe
to convert the page to a folio to save a few calls to compound_head(), which
removes the last isolate_lru_page() call.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 53456b7b 26-Aug-2024 Kefeng Wang <[email protected]>

mm: migrate_device: use a folio in migrate_device_range()

Save two calls to compound_head() and use folio throughout.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 5c8525a3 26-Aug-2024 Kefeng Wang <[email protected]>

mm: migrate_device: convert to migrate_device_coherent_folio()

Patch series "mm: finish isolate/putback_lru_page()".

Convert to use more folios in migrate_device.c so that we can remove
isolate_lru_page() and putback_lru_page().


This patch (of 6):

Save a few calls to compound_head() and use folio throughout.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5
# 15bde4ab 17-Jun-2024 Barry Song <[email protected]>

mm: extend rmap flags arguments for folio_add_new_anon_rmap

Patch series "mm: clarify folio_add_new_anon_rmap() and
__folio_add_anon_rmap()", v2.

This patchset is preparatory work for mTHP swapin.

folio_add_new_anon_rmap() assumes that new anon rmaps are always
exclusive. However, this assumption doesn’t hold true for cases like
do_swap_page(), where a new anon might be added to the swapcache and is
not necessarily exclusive.

The patchset extends the rmap flags to allow folio_add_new_anon_rmap() to
handle both exclusive and non-exclusive new anon folios. The
do_swap_page() function is updated to use this extended API with rmap
flags. Consequently, all new anon folios now consistently use
folio_add_new_anon_rmap(). The special case for !folio_test_anon() in
__folio_add_anon_rmap() can be safely removed.

In conclusion, new anon folios always use folio_add_new_anon_rmap(),
regardless of exclusivity. Old anon folios continue to use
__folio_add_anon_rmap() via folio_add_anon_rmap_pmd() and
folio_add_anon_rmap_ptes().


This patch (of 3):

In the case of a swap-in, a new anonymous folio is not necessarily
exclusive. This patch updates the rmap flags to allow a new anonymous
folio to be treated as either exclusive or non-exclusive. To maintain the
existing behavior, we always use EXCLUSIVE as the default setting.

[[email protected]: cleanup and constifications per David and akpm]
[[email protected]: fix missing doc for flags of folio_add_new_anon_rmap()]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: enhance doc for extend rmap flags arguments for folio_add_new_anon_rmap]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Barry Song <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Tested-by: Shuai Yuan <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Chris Li <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yosry Ahmed <[email protected]>
Cc: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
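
A small userspace sketch of the flags-extension pattern the series describes
(hypothetical names such as fake_add_new_anon_rmap() and FAKE_RMAP_EXCLUSIVE,
not the kernel API): the caller now states whether the new anon mapping is
exclusive instead of the helper assuming it.

#include <stdbool.h>
#include <stdio.h>

#define FAKE_RMAP_NONE		0u
#define FAKE_RMAP_EXCLUSIVE	(1u << 0)

struct fake_folio {
	bool anon;
	bool exclusive;
};

static void fake_add_new_anon_rmap(struct fake_folio *folio, unsigned int flags)
{
	folio->anon = true;
	/* Previously unconditionally exclusive; now the caller decides. */
	folio->exclusive = !!(flags & FAKE_RMAP_EXCLUSIVE);
}

int main(void)
{
	struct fake_folio faulted = { 0 }, swapcache = { 0 };

	fake_add_new_anon_rmap(&faulted, FAKE_RMAP_EXCLUSIVE);	/* plain anonymous fault */
	fake_add_new_anon_rmap(&swapcache, FAKE_RMAP_NONE);	/* swapcache-backed swap-in */

	printf("faulted exclusive=%d, swapcache exclusive=%d\n",
	       faulted.exclusive, swapcache.exclusive);
	return 0;
}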


Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1
# 15b0c79c 24-May-2024 Kefeng Wang <[email protected]>

mm: migrate_device: unify migrate folio for MIGRATE_SYNC_NO_COPY

__migrate_device_pages() won't copy the page, so MIGRATE_SYNC_NO_COPY is
passed into migrate_folio()/migrate_folio_extra(). An easier way is to
just call folio_migrate_mapping()/folio_migrate_flags() directly, which
unifies and simplifies device page migration and also removes the only
caller of MIGRATE_SYNC_NO_COPY.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Jane Chu <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Benjamin LaHaise <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Jiaqi Yan <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vishal Moola (Oracle) <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 6aaaef5b 24-May-2024 Kefeng Wang <[email protected]>

mm: migrate_device: use a newfolio in __migrate_device_pages()

Use a newfolio instead of newpage and convert to more of the folio API in
__migrate_device_pages().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Reviewed-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Vishal Moola (Oracle) <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Benjamin LaHaise <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Jiaqi Yan <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.9, v6.9-rc7, v6.9-rc6
# e18a9faf 23-Apr-2024 Matthew Wilcox (Oracle) <[email protected]>

migrate: expand the use of folio in __migrate_device_pages()

Removes a few calls to compound_head() and a call to page_mapping().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: Sidhartha Kumar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.9-rc5, v6.9-rc4
# f2f8a7a0 09-Apr-2024 David Hildenbrand <[email protected]>

mm/migrate_device: use folio_mapcount() in migrate_vma_check_page()

We want to limit the use of page_mapcount() to the places where it is
absolutely necessary. Let's convert migrate_vma_check_page() to work on a
folio internally so we can remove the page_mapcount() usage.

Note that we reject any large folios.

There is a lot more folio conversion to be had, but that has to wait for
another day. No functional change intended.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Richard Chang <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Yin Fengwei <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.9-rc3, v6.9-rc2
# b002a7b0 26-Mar-2024 Matthew Wilcox (Oracle) <[email protected]>

mm: convert migrate_vma_collect_pmd to use a folio

Convert the pmd directly to a folio and use it. Turns four calls to
compound_head() into one.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# f7842747 05-Apr-2024 Paolo Bonzini <[email protected]>

mm: replace set_pte_at_notify() with just set_pte_at()

With the demise of the .change_pte() MMU notifier callback, there is no
notification happening in set_pte_at_notify(). It is a synonym of
set_pte_at() and can be replaced with it.

Signed-off-by: Paolo Bonzini <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
Message-ID: <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>


Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7
# e3b4b137 20-Dec-2023 David Hildenbrand <[email protected]>

mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]()

Let's convert it like we converted all the other rmap functions. Don't
introduce folio_try_share_anon_rmap_ptes() for now, as we don't have a
user that wants rmap batching in sight. Pretty easy to add later.

All users are easy to convert -- only ksm.c doesn't use folios yet but
that is left for future work -- so let's just do it in a single shot.

While at it, turn the BUG_ON into a WARN_ON_ONCE.

Note that page_try_share_anon_rmap() so far didn't care about pte/pmd
mappings (no compound parameter). We're changing that so we can perform
better sanity checks and make the code actually more readable/consistent.
For example, __folio_rmap_sanity_checks() will make sure that a PMD range
actually falls completely into the folio.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Yin Fengwei <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 5b205c7f 20-Dec-2023 David Hildenbrand <[email protected]>

mm/migrate_device: page_remove_rmap() -> folio_remove_rmap_pte()

Let's convert migrate_vma_collect_pmd(). While at it, perform more folio
conversion.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Yin Fengwei <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.7-rc6
# d3b08273 11-Dec-2023 Matthew Wilcox (Oracle) <[email protected]>

mm: convert migrate_vma_insert_page() to use a folio

Replaces five calls to compound_head() with one.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5
# 49b06385 04-Aug-2023 Suren Baghdasaryan <[email protected]>

mm: enable page walking API to lock vmas during the walk

walk_page_range() and friends often operate under write-locked mmap_lock.
With introduction of vma locks, the vmas have to be locked as well during
such walks to prevent concurrent page faults in these areas. Add an
additional member to mm_walk_ops to indicate locking requirements for the
walk.

The change ensures that page walks which prevent concurrent page faults
by write-locking mmap_lock, operate correctly after introduction of
per-vma locks. With per-vma locks page faults can be handled under vma
lock without taking mmap_lock at all, so write locking mmap_lock would
not stop them. The change ensures vmas are properly locked during such
walks.

A sample issue this solves is do_mbind() performing queue_pages_range()
to queue pages for migration. Without this change, a page can be
concurrently faulted into the area and be left out of the migration.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Suren Baghdasaryan <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Jann Horn <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
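
A userspace sketch of the idea, under the assumption that a toy walker and
ops struct (fake_walk_page_range(), struct fake_walk_ops) stand in for
walk_page_range() and mm_walk_ops: the ops declare their locking requirement
and the walker locks each vma accordingly before invoking the callback.

#include <stdio.h>

enum fake_walk_lock { WALK_LOCK_NONE, WALK_LOCK_READ, WALK_LOCK_WRITE };

struct fake_vma {
	unsigned long start, end;
	int write_locked;
};

struct fake_walk_ops {
	int (*vma_entry)(struct fake_vma *vma);
	enum fake_walk_lock walk_lock;	/* new member: the walk's locking requirement */
};

static void fake_walk_page_range(struct fake_vma *vmas, int n,
				 const struct fake_walk_ops *ops)
{
	int i;

	for (i = 0; i < n; i++) {
		if (ops->walk_lock == WALK_LOCK_WRITE)
			vmas[i].write_locked = 1;	/* stand-in for write-locking the vma */
		ops->vma_entry(&vmas[i]);
	}
}

static int print_vma(struct fake_vma *vma)
{
	printf("vma [%lx, %lx) write_locked=%d\n",
	       vma->start, vma->end, vma->write_locked);
	return 0;
}

int main(void)
{
	struct fake_vma vmas[] = { { 0x1000, 0x2000, 0 }, { 0x3000, 0x5000, 0 } };
	const struct fake_walk_ops ops = {
		.vma_entry = print_vma,
		.walk_lock = WALK_LOCK_WRITE,	/* faults must not race this walk */
	};

	fake_walk_page_range(vmas, 2, &ops);
	return 0;
}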


Revision tags: v6.5-rc4
# ec8832d0 25-Jul-2023 Alistair Popple <[email protected]>

mmu_notifiers: don't invalidate secondary TLBs as part of mmu_notifier_invalidate_range_end()

Secondary TLBs are now invalidated from the architecture specific TLB
invalidation functions. Therefore there is no need to explicitly notify
or invalidate as part of the range end functions. This means we can
remove mmu_notifier_invalidate_range_end_only() and some of the
ptep_*_notify() functions.

Link: https://lkml.kernel.org/r/90d749d03cbab256ca0edeb5287069599566d783.1690292440.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Cc: Andrew Donnellan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chaitanya Kumar Borah <[email protected]>
Cc: Frederic Barrat <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Kevin Tian <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Nicolin Chen <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: SeongJae Park <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Zhi Wang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6
# df263d9a 07-Jun-2023 Mika Penttilä <[email protected]>

mm/migrate_device: try to handle swapcache pages

Migrating file pages and swapcache pages into device memory is not
supported. Try to get rid of the swap cache, and if successful, go ahead
as with other anonymous pages.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mika Penttilä <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: Alistair Popple <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Ralph Campbell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 161e393c 13-Jun-2023 Rick Edgecombe <[email protected]>

mm: Make pte_mkwrite() take a VMA

The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which require
some core mm changes to function properly.

One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable or shadow stack mappings.

But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented. Also, some
pte_mkwrite() callers operate on kernel memory without a VMA.

So this can be done in a three step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.

Previous work renamed pte_mkwrite() to pte_mkwrite_novma() and converted
callers that don't have a VMA to use pte_mkwrite_novma(). So now
change pte_mkwrite() to take a VMA and change the remaining callers to
pass a VMA. Apply the same changes for pmd_mkwrite().

No functional change.

Suggested-by: David Hildenbrand <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-4-rick.p.edgecombe%40intel.com
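
A minimal userspace sketch of the final step of that three-step process
(hedged: pte_t, the PTE_* bits and struct fake_vma here are made-up stand-ins,
not any architecture's definitions): pte_mkwrite() takes a vma and picks
between the regular writable encoding and a shadow-stack encoding, while
pte_mkwrite_novma() keeps the old VMA-less behaviour.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long pte_t;

#define PTE_WRITE		(1UL << 1)
#define PTE_SHADOW_STACK	(1UL << 2)

struct fake_vma {
	bool shadow_stack;
};

/* Old, VMA-less behaviour kept for callers that have no VMA at hand. */
static pte_t pte_mkwrite_novma(pte_t pte)
{
	return pte | PTE_WRITE;
}

/* New signature: the VMA tells the arch which writable encoding to use. */
static pte_t pte_mkwrite(pte_t pte, const struct fake_vma *vma)
{
	if (vma->shadow_stack)
		return pte | PTE_SHADOW_STACK;
	return pte_mkwrite_novma(pte);
}

int main(void)
{
	struct fake_vma normal = { .shadow_stack = false };
	struct fake_vma ss = { .shadow_stack = true };

	printf("normal: %#lx, shadow stack: %#lx\n",
	       pte_mkwrite(0, &normal), pte_mkwrite(0, &ss));
	return 0;
}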


# 1fec6890 21-Jun-2023 Matthew Wilcox (Oracle) <[email protected]>

mm: remove references to pagevec

Most of these should just refer to the LRU cache rather than the data
structure used to implement the LRU cache.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
