Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14

# 9342bc13 | 24-Mar-2025 | Jinjiang Tu <[email protected]>

mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range
We triggered the below BUG:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x2 pfn:0x240402
head: order:9 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x1ffffe0000000040(head|node=1|zone=3|lastcpupid=0x1ffff)
page_type: f4(hugetlb)
page dumped because: VM_BUG_ON_PAGE(page->compound_head & 1)
------------[ cut here ]------------
kernel BUG at ./include/linux/page-flags.h:310!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 7 UID: 0 PID: 166 Comm: sh Not tainted 6.14.0-rc7-dirty #374
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : const_folio_flags+0x3c/0x58
lr : const_folio_flags+0x3c/0x58
Call trace:
 const_folio_flags+0x3c/0x58 (P)
 do_migrate_range+0x164/0x720
 offline_pages+0x63c/0x6fc
 memory_subsys_offline+0x190/0x1f4
 device_offline+0xc0/0x13c
 state_store+0x90/0xd8
 dev_attr_store+0x18/0x2c
 sysfs_kf_write+0x44/0x54
 kernfs_fop_write_iter+0x120/0x1cc
 vfs_write+0x240/0x378
 ksys_write+0x70/0x108
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x48/0x10c
 el0_svc_common.constprop.0+0x40/0xe0
When allocating a hugetlb folio, there is a window between the folio being taken from the buddy allocator and prep_compound_page() being called, during which start_isolate_page_range() and do_migrate_range() can run. When do_migrate_range() scans the head page of the hugetlb folio, its compound_head field is not yet set, so the scan moves on to the tail page. By then the compound_head field of the tail page has been set, so folio_test_large() is called on a tail page and triggers the VM_BUG_ON().
To fix it, get the folio refcount before calling folio_test_large().
Link: https://lkml.kernel.org/r/[email protected] Fixes: 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp migration") Fixes: b62b51d2d159 ("mm: memory_hotplug: remove head variable in do_migrate_range()") Signed-off-by: Jinjiang Tu <[email protected]> Acked-by: Oscar Salvador <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Nanyong Sun <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
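
A minimal sketch of the resulting pattern in do_migrate_range() (illustrative, not the exact upstream diff): the reference is taken before any compound-state check, so a half-initialized hugetlb folio with refcount 0 is skipped instead of inspected.

        struct folio *folio = page_folio(page);

        /* Hold a reference first; an in-flight hugetlb allocation whose
         * compound_head fields are not fully set up has refcount 0 and
         * is skipped here. */
        if (!folio_try_get(folio))
                continue;

        /* Only now is it safe to test compound state. */
        if (folio_test_large(folio))
                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;

        /* ... isolate/migrate ... */
        folio_put(folio);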

# 5f5ee52d | 18-Mar-2025 | Jinjiang Tu <[email protected]>

mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper
Patch series "mm/vmscan: don't try to reclaim hwpoison folio".
Fix a bug during memory reclaim if folio is hwpoisoned.
This patch (of 2):
Introduce the helper folio_contain_hwpoisoned_page() to check whether the entire folio is hwpoisoned or whether it contains hwpoisoned pages.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jinjiang Tu <[email protected]> Acked-by: Miaohe Lin <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Nanyong Sun <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
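
A plausible shape for the helper, assuming the existing per-folio and per-subpage hwpoison flags; treat this as a sketch rather than the exact upstream definition.

static inline bool folio_contain_hwpoisoned_page(struct folio *folio)
{
        return folio_test_hwpoison(folio) ||
               (folio_test_large(folio) && folio_test_has_hwpoisoned(folio));
}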

Revision tags: v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4

# af288a42 | 17-Feb-2025 | Ma Wupeng <[email protected]>

hwpoison, memory_hotplug: lock folio before unmap hwpoisoned folio
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined") added page poison checks in do_migrate_range() in order to make offlining hwpoisoned pages possible, by introducing isolate_lru_page() and try_to_unmap() for hwpoisoned pages. However, the folio lock must be held before calling try_to_unmap(). Add it to fix this problem.
A warning is produced if the folio is not locked during unmap:
------------[ cut here ]------------
kernel BUG at ./include/linux/swapops.h:400!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 UID: 0 PID: 411 Comm: bash Tainted: G W 6.13.0-rc1-00016-g3c434c7ee82a-dirty #41
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0xb08/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
 try_to_unmap_one+0xb08/0xd3c (P)
 try_to_unmap_one+0x3dc/0xd3c (L)
 rmap_walk_anon+0xdc/0x1f8
 rmap_walk+0x3c/0x58
 try_to_unmap+0x88/0x90
 unmap_poisoned_folio+0x30/0xa8
 do_migrate_range+0x4a0/0x568
 offline_pages+0x5a4/0x670
 memory_block_action+0x17c/0x374
 memory_subsys_offline+0x3c/0x78
 device_offline+0xa4/0xd0
 state_store+0x8c/0xf0
 dev_attr_store+0x18/0x2c
 sysfs_kf_write+0x44/0x54
 kernfs_fop_write_iter+0x118/0x1a8
 vfs_write+0x3a8/0x4bc
 ksys_write+0x6c/0xf8
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x44/0x100
 el0_svc_common.constprop.0+0x40/0xe0
 do_el0_svc+0x1c/0x28
 el0_svc+0x30/0xd0
 el0t_64_sync_handler+0xc8/0xcc
 el0t_64_sync+0x198/0x19c
Code: f9407be0 b5fff320 d4210000 17ffff97 (d4210000)
---[ end trace 0000000000000000 ]---
Link: https://lkml.kernel.org/r/[email protected] Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined") Signed-off-by: Ma Wupeng <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Miaohe Lin <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
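
The shape of the fix, sketched (the unmap_poisoned_folio() arguments are approximated): take the folio lock around the unmap, since try_to_unmap() may only be called on a locked folio.

        if (folio_mapped(folio)) {
                folio_lock(folio);      /* try_to_unmap() needs the lock */
                unmap_poisoned_folio(folio, pfn, false);
                folio_unlock(folio);
        }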

# 773b9a6a | 17-Feb-2025 | Ma Wupeng <[email protected]>

mm: memory-hotplug: check folio ref count first in do_migrate_range
If a folio has an increased reference count, folio_try_get() will acquire it, perform necessary operations, and then release it. In the case of a poisoned folio without an elevated reference count (which is unlikely for memory-failure), folio_try_get() will simply bypass it.
Therefore, move the folio_try_get() call, which checks and acquires this reference count, to the front.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ma Wupeng <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Miaohe Lin <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# b81679b1 | 17-Feb-2025 | Ma Wupeng <[email protected]>

mm: memory-failure: update ttu flag inside unmap_poisoned_folio
Patch series "mm: memory_failure: unmap poisoned folio during migrate properly", v3.
Fix two bugs during folio migration if the folio is poisoned.
This patch (of 3):
Commit 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON") introduced TTU_HWPOISON to replace TTU_IGNORE_HWPOISON, in order to stop sending SIGBUS when an error page is accessed after a memory error on a clean folio. However, during page migration, an anon folio must be unmapped with TTU_HWPOISON set in unmap_*(). For pagecache we need a policy, just like the one in hwpoison_user_mappings(), to decide whether to set this flag. So move this policy from hwpoison_user_mappings() to unmap_poisoned_folio() to handle this warning properly.
A warning is produced when unmapping a poisoned folio, with the following log:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 365 at mm/rmap.c:1847 try_to_unmap_one+0x8fc/0xd3c
Modules linked in:
CPU: 1 UID: 0 PID: 365 Comm: bash Tainted: G W 6.13.0-rc1-00018-gacdb4bbda7ab #42
Tainted: [W]=WARN
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : try_to_unmap_one+0x8fc/0xd3c
lr : try_to_unmap_one+0x3dc/0xd3c
Call trace:
 try_to_unmap_one+0x8fc/0xd3c (P)
 try_to_unmap_one+0x3dc/0xd3c (L)
 rmap_walk_anon+0xdc/0x1f8
 rmap_walk+0x3c/0x58
 try_to_unmap+0x88/0x90
 unmap_poisoned_folio+0x30/0xa8
 do_migrate_range+0x4a0/0x568
 offline_pages+0x5a4/0x670
 memory_block_action+0x17c/0x374
 memory_subsys_offline+0x3c/0x78
 device_offline+0xa4/0xd0
 state_store+0x8c/0xf0
 dev_attr_store+0x18/0x2c
 sysfs_kf_write+0x44/0x54
 kernfs_fop_write_iter+0x118/0x1a8
 vfs_write+0x3a8/0x4bc
 ksys_write+0x6c/0xf8
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x44/0x100
 el0_svc_common.constprop.0+0x40/0xe0
 do_el0_svc+0x1c/0x28
 el0_svc+0x30/0xd0
 el0t_64_sync_handler+0xc8/0xcc
 el0t_64_sync+0x198/0x19c
---[ end trace 0000000000000000 ]---
[[email protected]: unmap_poisoned_folio(): remove shadowed local `mapping', per Miaohe] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 6da6b1d4a7df ("mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON") Signed-off-by: Ma Wupeng <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Miaohe Lin <[email protected]> Cc: Ma Wupeng <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
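
Roughly the policy being moved into unmap_poisoned_folio(), reconstructed from the description above (names and conditions approximated): keep TTU_HWPOISON for anon and dirty folios, clear it for clean pagecache so no SIGBUS is raised on later access.

        enum ttu_flags ttu = TTU_IGNORE_MLOCK | TTU_SYNC | TTU_HWPOISON;
        struct address_space *mapping = folio_mapping(folio);

        if (!must_kill && !folio_test_dirty(folio) && mapping &&
            mapping_can_writeback(mapping))
                ttu &= ~TTU_HWPOISON;   /* clean pagecache: no SIGBUS */

        try_to_unmap(folio, ttu);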

Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4

# 44d46b76 | 20-Dec-2024 | Gregory Price <[email protected]>

mm: add build-time option for hotplug memory default online type
Memory hotplug presently auto-onlines memory into a zone the kernel deems appropriate if CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y.
The memhp_default_state boot param enables runtime config, but it's not possible to do this at build-time.
Remove CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE, and replace it with CONFIG_MHP_DEFAULT_ONLINE_TYPE_* choices that sync with the boot param.
Selections:

CONFIG_MHP_DEFAULT_ONLINE_TYPE_OFFLINE => mhp_default_online_type = "offline"
    Memory will not be onlined automatically.

CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_AUTO => mhp_default_online_type = "online"
    Memory will be onlined automatically in a zone deemed appropriate by the kernel.

CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_KERNEL => mhp_default_online_type = "online_kernel"
    Memory will be onlined automatically. The zone may allow kernel data (e.g. ZONE_NORMAL).

CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE => mhp_default_online_type = "online_movable"
    Memory will be onlined automatically. The zone will be ZONE_MOVABLE.
Default to CONFIG_MHP_DEFAULT_ONLINE_TYPE_OFFLINE to match the existing default CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n behavior.
Existing users of CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=y should use CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_AUTO.
[[email protected]: update KConfig comments] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Gregory Price <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: WANG Xuerui <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
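
A sketch of the mapping the new Kconfig choice establishes, using the existing MMOP_* online types; the CONFIG_* symbol names are taken from the message above.

#if defined(CONFIG_MHP_DEFAULT_ONLINE_TYPE_OFFLINE)
static int mhp_default_online_type = MMOP_OFFLINE;
#elif defined(CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_AUTO)
static int mhp_default_online_type = MMOP_ONLINE;
#elif defined(CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_KERNEL)
static int mhp_default_online_type = MMOP_ONLINE_KERNEL;
#elif defined(CONFIG_MHP_DEFAULT_ONLINE_TYPE_ONLINE_MOVABLE)
static int mhp_default_online_type = MMOP_ONLINE_MOVABLE;
#endif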

Revision tags: v6.13-rc3, v6.13-rc2

# a684d59a | 05-Dec-2024 | David Hildenbrand <[email protected]>

mm/memory_hotplug: don't use __GFP_HARDWALL when migrating pages via memory offlining
We'll migrate pages allocated by other contexts; respecting the cpuset of the memory offlining context when allocating a migration target does not make sense.
Drop the __GFP_HARDWALL by using GFP_KERNEL.
Note that in an ideal world, migration code could figure out the cpuset of the original context and take that into consideration.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Suggested-by: Vlastimil Babka <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: Oscar Salvador <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
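
Sketch of the change in the migration target control (flags approximated): GFP_USER is GFP_KERNEL plus __GFP_HARDWALL, so switching the base mask drops the cpuset hardwalling.

        struct migration_target_control mtc = {
                .nmask    = &nmask,
                /* was GFP_USER | ...: GFP_USER implies __GFP_HARDWALL */
                .gfp_mask = GFP_KERNEL | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
        };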

# b9e40605 | 03-Dec-2024 | David Hildenbrand <[email protected]>

mm/page_isolation: don't pass gfp flags to start_isolate_page_range()
The parameter is unused, so let's stop passing it.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Zi Yan <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Reviewed-by: Vishal Moola (Oracle) <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N Rao <[email protected]> Cc: Nicholas Piggin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
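
The signature change, sketched as a diff (parameter list approximated):

-int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-                             int migratetype, int flags, gfp_t gfp_flags);
+int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
+                             int migratetype, int flags);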

# dd467f92 | 03-Dec-2024 | David Hildenbrand <[email protected]>

mm/memory_hotplug: move debug_pagealloc_map_pages() into online_pages_range()
In the near future, we want to have a single way to hand over PageOffline pages to the buddy, whereby they could have:
(a) Never been exposed to the buddy before: kept PageOffline when onlining the memory block.
(b) Been allocated from the buddy, for example using alloc_contig_range(), to then be set PageOffline.
Let's start by making generic_online_page()->__free_pages_core() less special compared to ordinary page freeing (e.g., free_contig_range()), and perform the debug_pagealloc_map_pages() call unconditionally, even when the online callback might decide to keep the pages offline.
All pages are already initialized with PageOffline, so nobody touches them either way.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3

# afe789b7 | 09-Oct-2024 | John Hubbard <[email protected]>

kaslr: rename physmem_end and PHYSMEM_END to direct_map_physmem_end
For clarity. It's increasingly hard to reason about the code when KASLR is moving around the boundaries. In this case, where KASLR is randomizing the location of the kernel image within physical memory, the maximum number of address bits for physical memory has not changed.
What has changed is the ending address of memory that is allowed to be directly mapped by the kernel.
Let's name the variable, and the associated macro accordingly.
Also, enhance the comment above the direct_map_physmem_end definition, to further clarify how this all works.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: John Hubbard <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Will Deacon <[email protected]> Reviewed-by: Mike Rapoport (Microsoft) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Jordan Niethe <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6

# 6f1833b8 | 27-Aug-2024 | Kefeng Wang <[email protected]>

mm: memory_hotplug: unify Huge/LRU/non-LRU movable folio isolation
Use isolate_folio_to_list() to unify hugetlb/LRU/non-LRU folio isolation, which cleans up the code a bit and saves a few calls to compound_head().
[[email protected]: various fixes] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
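
A sketch of what the unified helper folds together, assuming its bool-returning form; the branch details are approximated.

bool isolate_folio_to_list(struct folio *folio, struct list_head *list)
{
        bool isolated;

        if (folio_test_hugetlb(folio))
                return isolate_hugetlb(folio, list);

        if (folio_test_lru(folio))
                isolated = folio_isolate_lru(folio);
        else
                isolated = isolate_movable_page(&folio->page,
                                                ISOLATE_UNEVICTABLE);
        if (isolated)
                list_add_tail(&folio->lru, list);
        return isolated;
}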

# e8a796fa | 27-Aug-2024 | Kefeng Wang <[email protected]>

mm: memory_hotplug: check hwpoisoned page firstly in do_migrate_range()
Commit b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined") didn't handle hugetlb pages, so the endless loop could still occur when offlining a hwpoisoned hugetlb page. Luckily, after commit e591ef7d96d6 ("mm, hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage"), the HPageMigratable flag of the hugetlb page is cleared, and the hwpoisoned hugetlb page is skipped in scan_movable_pages(), so the endless loop issue is fixed.
However, even if the HPageMigratable() check passes (taken without a reference or lock), the hugetlb page may be hwpoisoned. This won't cause an issue, since the hwpoisoned page will be handled correctly in the next movable-pages scan loop: it will be isolated in do_migrate_range() but fail to migrate. In order to avoid the unnecessary isolation and to unify all hwpoisoned page handling, unconditionally check for hwpoison first, and if the page is a hwpoisoned hugetlb page, try to unmap it as the catch-all safety net, like a normal page.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# b62b51d2 | 27-Aug-2024 | Kefeng Wang <[email protected]>

mm: memory_hotplug: remove head variable in do_migrate_range()
Patch series "mm: memory_hotplug: improve do_migrate_range()", v3.
Unify hwpoisoned page handling and isolation of HugeTLB/LRU/non-LRU movable page, also convert to use folios in do_migrate_range().
This patch (of 5):
Directly use a folio for HugeTLB and THP when calculating the next pfn, then remove the unused head variable.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
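
Sketch of the folio-based pfn advance that replaces the head-page computation (illustrative, not the exact diff):

        struct folio *folio = page_folio(page);

        /* For HugeTLB and THP, jump straight past the tail pages. */
        if (folio_test_large(folio))
                pfn = folio_pfn(folio) + folio_nr_pages(folio) - 1;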

Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1

# f732e242 | 26-Jul-2024 | Wei Yang <[email protected]>

mm/memory_hotplug: get rid of __ref
After commit 73db3abdca58 ("init/modpost: conditionally check section mismatch to __meminit*"), we can get rid of __ref annotations.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# ea72ce5d | 13-Aug-2024 | Thomas Gleixner <[email protected]>

x86/kaslr: Expose and use the end of the physical memory address space
iounmap() on x86 occasionally fails to unmap because the provided valid ioremap address is not below high_memory. It turned out that this happens due to KASLR.
KASLR uses the full address space between PAGE_OFFSET and vaddr_end to randomize the starting points of the direct map, vmalloc and vmemmap regions. It thereby limits the size of the direct map by using the installed memory size plus an extra configurable margin for hot-plug memory. This limitation is done to gain more randomization space because otherwise only the holes between the direct map, vmalloc, vmemmap and vaddr_end would be usable for randomizing.
The limited direct map size is not exposed to the rest of the kernel, so the memory hot-plug and resource management related code paths still operate under the assumption that the available address space can be determined with MAX_PHYSMEM_BITS.
request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1 downwards. That means the first allocation happens past the end of the direct map and if unlucky this address is in the vmalloc space, which causes high_memory to become greater than VMALLOC_START and consequently causes iounmap() to fail for valid ioremap addresses.
MAX_PHYSMEM_BITS cannot be changed for that because the randomization does not align with address bit boundaries and there are other places which actually require to know the maximum number of address bits. All remaining usage sites of MAX_PHYSMEM_BITS have been analyzed and found to be correct.
Cure this by exposing the end of the direct map via PHYSMEM_END and use that for the memory hot-plug and resource management related places instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END maps to a variable which is initialized by the KASLR initialization and otherwise it is based on MAX_PHYSMEM_BITS as before.
To prevent future hiccups, add a check into add_pages() to catch callers trying to add memory above PHYSMEM_END.
Fixes: 0483e1fa6e09 ("x86/mm: Implement ASLR for kernel memory regions") Reported-by: Max Ramanouski <[email protected]> Reported-by: Alistair Popple <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-By: Max Ramanouski <[email protected]> Tested-by: Alistair Popple <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Reviewed-by: Kees Cook <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/87ed6soy3z.ffs@tglx
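
A sketch of the resulting bound, per the description above: in the KASLR case PHYSMEM_END maps to a variable set during KASLR initialization, otherwise it keeps the MAX_PHYSMEM_BITS-based limit (exact definitions approximated).

#ifdef CONFIG_RANDOMIZE_MEMORY
extern unsigned long physmem_end;
# define PHYSMEM_END    physmem_end
#else
# define PHYSMEM_END    (((phys_addr_t)1 << MAX_PHYSMEM_BITS) - 1)
#endif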

Revision tags: v6.10, v6.10-rc7

# f6953e22 | 06-Jul-2024 | Wei Yang <[email protected]>

mm/page_alloc: put __free_pages_core() in __meminit section
__free_pages_core() is only used in the bootmem init and hot-add memory init paths. Let's put it in the __meminit section.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wei Yang <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3

# 50625744 | 07-Jun-2024 | David Hildenbrand <[email protected]>

mm/memory_hotplug: skip adjust_managed_page_count() for PageOffline() pages when offlining
We currently have a hack for virtio-mem in place to handle memory offlining with PageOffline pages for which we already adjusted the managed page count.
Let's enlighten memory offlining code so we can get rid of that hack, and document the situation.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dexuan Cui <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eugenio Pérez <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Juergen Gross <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Marco Elver <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Oleksandr Tyshchenko <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Wei Liu <[email protected]> Cc: Xuan Zhuo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

# 503b158f | 07-Jun-2024 | David Hildenbrand <[email protected]>

mm/memory_hotplug: initialize memmap of !ZONE_DEVICE with PageOffline() instead of PageReserved()
We currently initialize the memmap such that PG_reserved is set and the refcount of the page is 1. In virtio-mem code, we have to manually clear that PG_reserved flag to make memory offlining with partially hotplugged memory blocks possible: has_unmovable_pages() would otherwise bail out on such pages.
We want to avoid PG_reserved where possible and move to typed pages instead. Further, we want to further enlighten memory offlining code about PG_offline: offline pages in an online memory section. One example is handling managed page count adjustments in a cleaner way during memory offlining.
So let's initialize the pages with PG_offline instead of PG_reserved. generic_online_page()->__free_pages_core() will now clear that flag before handing that memory to the buddy.
Note that the page refcount is still 1 and would forbid offlining of such memory except when special care is taken during GOING_OFFLINE, as currently only implemented by virtio-mem.
With this change, we can now get non-PageReserved() pages in the XEN balloon list. From what I can tell, that can already happen via decrease_reservation(), so that should be fine.
HV-balloon should not really observe a change: partial online memory blocks still cannot get surprise-offlined, because the refcount of these PageOffline() pages is 1.
Update virtio-mem, HV-balloon and XEN-balloon code to be aware that hotplugged pages are now PageOffline() instead of PageReserved() before they are handed over to the buddy.
We'll leave the ZONE_DEVICE case alone for now.
Note that self-hosted vmemmap pages will no longer be marked as reserved. This matches ordinary vmemmap pages allocated from the buddy during memory hotplug. Now, really only vmemmap pages allocated from memblock during early boot will be marked reserved. Existing PageReserved() checks seem to be handling all relevant cases correctly even after this change.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> [generic memory-hotplug bits] Cc: Alexander Potapenko <[email protected]> Cc: Dexuan Cui <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eugenio Pérez <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Juergen Gross <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Marco Elver <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Oleksandr Tyshchenko <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Wei Liu <[email protected]> Cc: Xuan Zhuo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
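
The onlining flow described above, reduced to a sketch:

        /* memmap init when onlining a memory section: */
        __SetPageOffline(page);         /* instead of SetPageReserved(page) */

        /* later, generic_online_page() -> __free_pages_core(): */
        __ClearPageOffline(page);       /* only now expose it to the buddy */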

# 13c52654 | 07-Jun-2024 | David Hildenbrand <[email protected]>

mm: pass meminit_context to __free_pages_core()
Patch series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE".
This can be considered a long-overdue follow-up to some parts of [1]. The patches are based on [2], but [2] is not strictly required -- it just makes it clearer why we can use adjust_managed_page_count() for memory hotplug without going into details about highmem.
We stop initializing pages with PageReserved() in memory hotplug code -- except when dealing with ZONE_DEVICE for now. Instead, we use PageOffline(): all pages are initialized to PageOffline() when onlining a memory section, and only the ones actually getting exposed to the system/page allocator will get PageOffline cleared.
This way, we enlighten memory hotplug more about PageOffline() pages and can cleanup some hacks we have in virtio-mem code.
What about ZONE_DEVICE? PageOffline() is wrong, but we might just stop using PageReserved() for them later by simply checking for is_zone_device_page() at suitable places. That will be a separate patch set / proposal.
This primarily affects virtio-mem, HV-balloon and XEN balloon. I only briefly tested with virtio-mem, which benefits most from these cleanups.
[1] https://lore.kernel.org/all/[email protected]/ [2] https://lkml.kernel.org/r/[email protected]
This patch (of 3):
In preparation for further changes, let's teach __free_pages_core() about the differences of memory hotplug handling.
Move the memory hotplug specific handling from generic_online_page() to __free_pages_core(), use adjust_managed_page_count() on the memory hotplug path, and spell out why memory freed via memblock cannot currently use adjust_managed_page_count().
[[email protected]: add missed CONFIG_DEFERRED_STRUCT_PAGE_INIT] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix up the memblock comment, per Oscar] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: add the parameter name also in the declaration] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dexuan Cui <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eugenio Pérez <[email protected]> Cc: Haiyang Zhang <[email protected]> Cc: Jason Wang <[email protected]> Cc: Juergen Gross <[email protected]> Cc: "K. Y. Srinivasan" <[email protected]> Cc: Marco Elver <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Oleksandr Tyshchenko <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Wei Liu <[email protected]> Cc: Xuan Zhuo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
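
The signature change this patch prepares, sketched with the existing meminit_context enum (MEMINIT_EARLY, MEMINIT_HOTPLUG, MEMINIT_LATE):

-void __free_pages_core(struct page *page, unsigned int order);
+void __free_pages_core(struct page *page, unsigned int order,
+                       enum meminit_context context);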

# 5958d359 | 06-Jun-2024 | Anastasia Belova <[email protected]>

mm/memory_hotplug: prevent accessing by index=-1
nid may be equal to NUMA_NO_NODE (-1). Prevent accessing the node_data array with an invalid index by adding a check for nid.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Link: https://lkml.kernel.org/r/[email protected] Fixes: e83a437faa62 ("mm/memory_hotplug: introduce "auto-movable" online policy") Signed-off-by: Anastasia Belova <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
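
The guard, sketched:

        if (nid != NUMA_NO_NODE) {
                /* only index node_data[] with a valid node id */
                pgdat = NODE_DATA(nid);
                /* ... use pgdat ... */
        }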

# 7b09fa7e | 05-Jun-2024 | Jonathan Cameron <[email protected]>

mm/memory_hotplug: drop memblock_phys_free() call in try_remove_memory()
The call for memblock_phys_free() in try_remove_memory() does not balance any call to memblock_alloc() (or memblock_reserve() for that matter).
There are no memblock_reserve() calls in mm/memory_hotplug.c, no memblock allocations are possible after mm_core_init(), and even if memblock_add_node(), called from add_memory_resource(), needed to allocate memory, that memory would be allocated from slab.
The patch f9126ab9241f ("memory-hotplug: fix wrong edge when hot add a new node") that introduced that call to memblock_free() does not adequately explain why it was required, and tinkering with memblock in the context of memory hotplug on x86 seems bogus, because x86 never kept memblock after boot anyway.
Drop memblock_phys_free() call in try_remove_memory().
[[email protected]: rewrite the commit message] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Mike Rapoport (IBM) <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.10-rc2

# 16540dae | 30-May-2024 | Sidhartha Kumar <[email protected]>

mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()
By using a folio in scan_movable_pages() we convert the last user of the page-based hugetlb information macro functions to the folio version. After this conversion, we can safely remove the page-based definitions from include/linux/hugetlb.h.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Muchun Song <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4

# d199483c | 12-Apr-2024 | Sidhartha Kumar <[email protected]>

mm/hugetlb: rename dissolve_free_huge_pages() to dissolve_free_hugetlb_folios()
dissolve_free_huge_pages() only uses folios internally, rename it to dissolve_free_hugetlb_folios() and change the comments which reference it.
[[email protected]: remove unneeded `extern'] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sidhartha Kumar <[email protected]> Reviewed-by: Vishal Moola (Oracle) <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Cc: Jane Chu <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Muchun Song <[email protected]> Cc: Oscar Salvador <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Revision tags: v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8

# e42dfe4e | 06-Mar-2024 | Baolin Wang <[email protected]>

mm: record the migration reason for struct migration_target_control
Patch series "make the hugetlb migration strategy consistent", v2.
As discussed in the previous thread [1], there is an inconsistency in the handling of hugetlb migration. When handling the migration of a freed hugetlb page, alloc_and_dissolve_hugetlb_folio() prevents fallback to other NUMA nodes. However, when dealing with an in-use hugetlb page, alloc_hugetlb_folio_nodemask() allows fallback to other NUMA nodes, which can break the per-node hugetlb pool and might result in unexpected failures when node-bound workloads don't get what is assumed to be available.
This patchset tries to make the hugetlb migration strategy more clear and consistent. Please find details in each patch.
[1] https://lore.kernel.org/all/6f26ce22d2fcd523418a085f2c588fe0776d46e7.1706794035.git.baolin.wang@linux.alibaba.com/
This patch (of 2):
To support different hugetlb allocation strategies during hugetlb migration based on various migration reasons, record the migration reason in the migration_target_control structure as a preparation.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/7b95d4981e07211f57139fc5b1f7ce91b920cee4.1709719720.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
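
Sketch of the structure after the change; the new field follows the commit title, and the other fields are as commonly defined in mm/internal.h (approximated).

struct migration_target_control {
        int nid;                        /* preferred node id */
        nodemask_t *nmask;
        gfp_t gfp_mask;
        enum migrate_reason reason;     /* new: recorded migration reason */
};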

Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2

# 42d93582 | 24-Jan-2024 | Vishal Verma <[email protected]>

mm/memory_hotplug: export mhp_supports_memmap_on_memory()
In preparation for adding sysfs ABI to toggle memmap_on_memory semantics for drivers adding memory, export the mhp_supports_memmap_on_memory() helper. This allows drivers to check whether memmap_on_memory support is available before trying to request it, and to display an appropriate message if it isn't. As part of this, remove the size argument from the helper: with recent updates allowing memmap_on_memory for larger ranges, and the internal splitting of altmaps into respective memory blocks, the size argument is meaningless.
[[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vishal Verma <[email protected]> Acked-by: David Hildenbrand <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Li Zhijian <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Huang Ying <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
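
Hypothetical driver-side use after the export; the zero-argument form reflects the size-argument removal described above, and the error handling is illustrative only.

        if (!mhp_supports_memmap_on_memory()) {
                dev_info(dev, "memmap_on_memory is not available\n");
                return -EOPNOTSUPP;     /* hypothetical fallback */
        }
        mhp_flags |= MHP_MEMMAP_ON_MEMORY;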