History log of /linux-6.15/include/linux/memblock.h (Results 1 – 25 of 169)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7
# 8afa901c 13-Mar-2025 Mike Rapoport (Microsoft) <[email protected]>

arch, mm: make releasing of memory to page allocator more explicit

The point where the memory is released from memblock to the buddy
allocator is hidden inside arch-specific mem_init()s, and the call to
memblock_free_all() is needlessly duplicated in every architecture.
After the introduction of the arch_mm_preinit() hook, the mem_init()
implementation on many architectures only contains the call to
memblock_free_all().

Pull the memblock_free_all() call into mm_core_init() and drop mem_init()
on the relevant architectures to make it more explicit where the free
memory is released from memblock to the buddy allocator, and to reduce
code duplication in architecture-specific code.
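
A hedged sketch of the consolidated flow described above; the exact
ordering inside mm_core_init() is an assumption:

void __init mm_core_init(void)
{
	/* per-arch early setup that used to live in mem_init() */
	arch_mm_preinit();
	/* ... other core-MM initialization ... */
	/* single, explicit release of memory from memblock to the buddy */
	memblock_free_all();
}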

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: Dave Hansen <[email protected]> [x86]
Acked-by: Geert Uytterhoeven <[email protected]> [m68k]
Tested-by: Mark Brown <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Guo Ren (csky) <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6
# c6f23979 02-Jan-2025 Guo Weikang <[email protected]>

mm/memblock: add memblock_alloc_or_panic interface

Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory. In most cases, when memory allocation fails, an
immediate panic is required. To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`. This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.
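
As a hedged sketch of the pattern this helper replaces (the exact
signature and panic message are assumptions here):

/* open-coded pattern before this change */
ptr = memblock_alloc(size, SMP_CACHE_BYTES);
if (!ptr)
	panic("%s: Failed to allocate %zu bytes\n", __func__, size);

/* with the helper, the failure check and panic are implicit */
ptr = memblock_alloc_or_panic(size, SMP_CACHE_BYTES);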

[[email protected]: arch/s390: save_area_alloc default failure behavior changed to panic]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Guo Weikang <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]> [m68k]
Reviewed-by: Alexander Gordeev <[email protected]> [s390]
Acked-by: Mike Rapoport (Microsoft) <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# b2aad24b 06-Jan-2025 Guo Weikang <[email protected]>

mm/memmap: prevent double scanning of memmap by kmemleak

kmemleak explicitly scans the mem_map through the valid struct page
objects. However, memmap_alloc() was also adding this memory to the gray
object list, causing it to be scanned twice. Remove memmap_alloc() from
the scan list and add a comment to clarify the behavior.

Link: https://lore.kernel.org/lkml/CAOm6qn=FVeTpH54wGDFMHuCOeYtvoTx30ktnv9-w3Nh8RMofEA@mail.gmail.com/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Guo Weikang <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Cc: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3
# d0f8a897 08-Aug-2024 Wei Yang <[email protected]>

mm/memblock: introduce a new helper memblock_estimated_nr_free_pages()

During bootup, the system may need the number of free pages in the
whole system to do some calculation before all pages are freed to the
buddy system. Usually this number is obtained from totalram_pages().
Since we plan to move the free pages accounting into
__free_pages_core(), this value may not represent the total free pages
at the early stage, especially when CONFIG_DEFERRED_STRUCT_PAGE_INIT
is enabled.

Instead of using the raw memblock API, let's introduce a new helper for
users to get the estimated number of free pages from memblock's point
of view.
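
A hedged usage sketch; the helper name comes from this commit, while
the return type is an assumption:

/* estimate of pages memblock still considers free, early in boot */
unsigned long free = memblock_estimated_nr_free_pages();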

Signed-off-by: Wei Yang <[email protected]>
CC: David Hildenbrand <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>


Revision tags: v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1
# 188f87f2 22-May-2024 Eric Chanudet <[email protected]>

mm/mm_init: use node's number of cpus in deferred_page_init_max_threads

x86_64 already uses the node's number of CPUs as the maximum thread
count. Make that the default for all architectures setting
DEFERRED_STRUCT_PAGE_INIT.

This returns to the behavior prior to making the function arch-specific
with commit ecd096506922 ("mm: make deferred init's max threads
arch-specific").

Setting DEFERRED_STRUCT_PAGE_INIT and testing on a few arm64 platforms
shows faster deferred_init_memmap completions:

| | x13s | SA8775p-ride | Ampere R137-P31 | Ampere HR330 |
| | Metal, 32GB | VM, 36GB | VM, 58GB | Metal, 128GB |
| | 8cpus | 8cpus | 8cpus | 32cpus |
|---------|-------------|--------------|-----------------|--------------|
| threads | ms (%) | ms (%) | ms (%) | ms (%) |
|---------|-------------|--------------|-----------------|--------------|
| 1 | 108 (0%) | 72 (0%) | 224 (0%) | 324 (0%) |
| cpus | 24 (-77%) | 36 (-50%) | 40 (-82%) | 56 (-82%) |

Michael Ellerman reported:

: On a machine here (1TB, 40 cores, 4KB pages) the existing code gives:
:
: [ 0.500124] node 2 deferred pages initialised in 210ms
: [ 0.515790] node 3 deferred pages initialised in 230ms
: [ 0.516061] node 0 deferred pages initialised in 230ms
: [ 0.516522] node 7 deferred pages initialised in 230ms
: [ 0.516672] node 4 deferred pages initialised in 230ms
: [ 0.516798] node 6 deferred pages initialised in 230ms
: [ 0.517051] node 5 deferred pages initialised in 230ms
: [ 0.523887] node 1 deferred pages initialised in 240ms
:
: vs with the patch:
:
: [ 0.379613] node 0 deferred pages initialised in 90ms
: [ 0.380388] node 1 deferred pages initialised in 90ms
: [ 0.380540] node 4 deferred pages initialised in 100ms
: [ 0.390239] node 6 deferred pages initialised in 100ms
: [ 0.390249] node 2 deferred pages initialised in 100ms
: [ 0.390786] node 3 deferred pages initialised in 110ms
: [ 0.396721] node 5 deferred pages initialised in 110ms
: [ 0.397095] node 7 deferred pages initialised in 110ms
:
: Which is a nice speedup.

[[email protected]: v3]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Eric Chanudet <[email protected]>
Tested-by: Michael Ellerman <[email protected]> (powerpc)
Reviewed-by: Baoquan He <[email protected]>
Acked-by: Alexander Gordeev <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# f1180fd2 05-Jun-2024 Wei Yang <[email protected]>

mm/mm_init.c: not always search next deferred_init_pfn from very beginning

In function deferred_init_memmap(), we call
deferred_init_mem_pfn_range_in_zone() to get the next deferred_init_pfn.
But we always search for it from the very beginning.

Since we save the index in i, we can leverage this to start the search
from i next time.

[rppt: refine the comment]

Signed-off-by: Wei Yang <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Mike Rapoport (IBM) <[email protected]>


# 93bbbcb1 25-May-2024 Wei Yang <[email protected]>

mm/memblock: fix a typo in description of for_each_mem_region()

No functional change.

Signed-off-by: Wei Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (IBM) <[email protected]>


Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1
# 9b99c17f 12-Jan-2024 Alison Schofield <[email protected]>

x86/numa: Fix the address overlap check in numa_fill_memblks()

numa_fill_memblks() fills in the gaps in numa_meminfo memblks over a
physical address range. To do so, it first creates a list of existing
memblks that overlap that address range. The issue is that it is off
by one when comparing to the end of the address range, so memblks
that do not overlap are selected.

The impact of selecting a memblk that does not actually overlap is
that an existing memblk may be filled when the expected action is to
do nothing and return NUMA_NO_MEMBLK to the caller. The caller can
then add a new NUMA node and memblk.

Replace the broken open-coded search for address overlap with the
memblock helper memblock_addrs_overlap(). Update the kernel doc
and in-code comments.
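
For reference, a hedged sketch of the helper's overlap test (two ranges
overlap iff each starts below the other's end; the exact prototype is
an assumption):

static inline bool memblock_addrs_overlap(phys_addr_t base1, phys_addr_t size1,
					  phys_addr_t base2, phys_addr_t size2)
{
	return base1 < (base2 + size2) && base2 < (base1 + size1);
}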

Suggested-by: "Huang, Ying" <[email protected]>

Fixes: 8f012db27c95 ("x86/numa: Introduce numa_fill_memblks()")
Signed-off-by: Alison Schofield <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/r/10a3e6109c34c21a8dd4c513cf63df63481a2b07.1705085543.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <[email protected]>


Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6
# ff6c3d81 26-Oct-2023 Liam Ni <[email protected]>

NUMA: optimize detection of memory with no node id assigned by firmware

The sanity check that makes sure the nodes cover all memory loops over
numa_meminfo to count the pages that have a node id assigned by the
firmware, then loops again over memblock.memory to find the total amount
of memory, and in the end checks that the difference between the total
memory and the memory covered by nodes is less than some threshold.
Worse, the loop over numa_meminfo calls __absent_pages_in_range(), which
also partially traverses memblock.memory.

It's much simpler and more efficient to have a single traversal of
memblock.memory that verifies that the amount of memory not covered by
nodes is less than a threshold.

Introduce memblock_validate_numa_coverage() that does exactly that and use
it instead of numa_meminfo_cover_memory().
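
A hedged usage sketch; the SZ_1M threshold is illustrative, not taken
from this changelog:

/* give up on NUMA init if too much memory has no node id assigned */
if (!memblock_validate_numa_coverage(SZ_1M))
	return -EINVAL;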

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam Ni <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Bibo Mao <[email protected]>
Cc: Binbin Zhou <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Feiyang Chen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: WANG Xuerui <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2
# 77e6c43e 13-Sep-2023 Usama Arif <[email protected]>

memblock: introduce MEMBLOCK_RSRV_NOINIT flag

For reserved memory regions marked with this flag, reserve_bootmem_region
is not called during memmap_init_reserved_pages. This can be used to
avoid struct page initialization for regions that won't need them,
e.g. hugepages with Hugepage Vmemmap Optimization enabled.
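
A hedged sketch of how a region would opt out, assuming the marking
helper introduced alongside the flag:

memblock_reserve(base, size);
/* skip struct page initialization for this reserved range */
memblock_reserved_mark_noinit(base, size);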

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Usama Arif <[email protected]>
Acked-by: Muchun Song <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Fam Zheng <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Punit Agrawal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6
# 3f32c49e 08-Aug-2023 Kefeng Wang <[email protected]>

mm: memtest: convert to memtest_report_meminfo()

It is better not to expose too many internal variables of memtest;
add a helper, memtest_report_meminfo(), to show the memtest results.
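
A hedged sketch of the declaration pattern this implies (the
seq_file-based prototype is an assumption):

#ifdef CONFIG_MEMTEST
void memtest_report_meminfo(struct seq_file *m);
#else
static inline void memtest_report_meminfo(struct seq_file *m) { }
#endif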

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Tomas Mudrunka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4
# 3fade62b 25-Jun-2023 Miaohe Lin <[email protected]>

mm/mm_init.c: remove obsolete macro HASH_SMALL

HASH_SMALL only works when the parameter numentries is 0, but the sole
caller, futex_init(), never calls alloc_large_system_hash() with
numentries set to 0. So HASH_SMALL is obsolete; remove it.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.4-rc7, v6.4-rc6
# a668968f 07-Jun-2023 Haifeng Xu <[email protected]>

mm/memory_hotplug: remove reset_node_managed_pages() in hotadd_init_pgdat()

Managed pages have already been set to 0 in free_area_init_core_hotplug(),
via zone_init_internals() on each zone. It's pointless to reset them
again.

Furthermore, reset_node_managed_pages() no longer needs to be exposed
outside of mm/memblock.c. Remove the declaration in
include/linux/memblock.h and define it as static.

In addition to this, the only caller of reset_node_managed_pages() is
reset_all_zones_managed_pages(), which is annotated with __init, so it
should be safe to also mark reset_node_managed_pages() as __init.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Haifeng Xu <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Oscar Salvador <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4
# bd23024b 21-Mar-2023 Tomas Mudrunka <[email protected]>

mm/memtest: add results of early memtest to /proc/meminfo

Currently the memtest results are only presented in dmesg.

When running a large fleet of devices without ECC RAM it's currently not
easy to do bulk monitoring for memory corruption. You have to parse
dmesg, but that's a ring buffer so the error might disappear after some
time. In general I do not consider dmesg to be a great API to query RAM
status.

In several companies I've seen such errors remain undetected and cause
issues for way too long. So I think it makes sense to provide a
monitoring API, so that we can safely detect and act upon them.

This adds a /proc/meminfo entry which can be easily used by scripts.
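
As a hedged illustration, a script could poll the entry roughly like
this (the field name and formatting are assumptions here):

$ grep EarlyMemtest /proc/meminfo
EarlyMemtestBad:       0 kB

A nonzero value would indicate RAM that failed the early memtest.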

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Tomas Mudrunka <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1
# a59466ee 11-Jan-2022 Karolina Drobnik <[email protected]>

memblock: Remove #ifdef __KERNEL__ from memblock.h

memblock.h is not a uAPI header, so the __KERNEL__ guard can be deleted.
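
A hedged sketch of the shape of the change (not the literal diff):

-#ifdef __KERNEL__
 /* ... entire header body: the declarations stay, the guard goes ... */
-#endif /* __KERNEL__ */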

Signed-off-by: Karolina Drobnik <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6
# d7f55471 17-Dec-2021 Jackie Liu <[email protected]>

memblock: fix memblock_phys_alloc() section mismatch error

Fix modpost Section mismatch error in memblock_phys_alloc()

[...]
WARNING: modpost: vmlinux.o(.text.unlikely+0x1dcc): Section mismatch in reference
from the function memblock_phys_alloc() to the function .init.text:memblock_phys_alloc_range()
The function memblock_phys_alloc() references
the function __init memblock_phys_alloc_range().
This is often because memblock_phys_alloc lacks a __init
annotation or the annotation of memblock_phys_alloc_range is wrong.

ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
[...]

memblock_phys_alloc() is a one-line wrapper; make it __always_inline to
avoid these section mismatches.
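
A hedged sketch of the resulting wrapper (the parameters and the range
call are assumptions based on the changelog):

static __always_inline phys_addr_t memblock_phys_alloc(phys_addr_t size,
						       phys_addr_t align)
{
	return memblock_phys_alloc_range(size, align, 0,
					 MEMBLOCK_ALLOC_ACCESSIBLE);
}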

Reported-by: k2ci <[email protected]>
Suggested-by: Mike Rapoport <[email protected]>
Signed-off-by: Jackie Liu <[email protected]>
[rppt: slightly massaged changelog]
Signed-off-by: Mike Rapoport <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1
# c6975d7c 05-Nov-2021 Qian Cai <[email protected]>

arm64: Track no early_pgtable_alloc() for kmemleak

After switching the page size from 64KB to 4KB on several arm64 servers
here, kmemleak starts to run out of its early memory pool due to a huge
number of early_pgtable_alloc() calls:

kmemleak_alloc_phys()
memblock_alloc_range_nid()
memblock_phys_alloc_range()
early_pgtable_alloc()
init_pmd()
alloc_init_pud()
__create_pgd_mapping()
__map_memblock()
paging_init()
setup_arch()
start_kernel()

Increasing the default value of DEBUG_KMEMLEAK_MEM_POOL_SIZE by 4 times
won't be enough for a server with 200GB+ memory. There isn't much
interest in checking memory leaks for those early page tables, and those
early memory mappings should not reference other memory. Hence, there
are no kmemleak false positives, and we can safely skip tracking those
early allocations from kmemleak, as we did in commit fed84c785270
("mm/memblock.c: skip kmemleak for kasan_init()"), without needing to
introduce complications to automatically scale the value depending on
the runtime memory size, etc. After the patch, the default value of
DEBUG_KMEMLEAK_MEM_POOL_SIZE becomes sufficient again.
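
A hedged sketch of an early allocation opting out of kmemleak tracking
via the flag this change adds to memblock.h (the call site shown is
illustrative):

/* arm64 early page-table allocation, skipped by kmemleak */
phys_addr_t pa = memblock_phys_alloc_range(PAGE_SIZE, PAGE_SIZE, 0,
					   MEMBLOCK_ALLOC_NOLEAKTRACE);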

Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>


# f7892d8e 05-Nov-2021 David Hildenbrand <[email protected]>

memblock: add MEMBLOCK_DRIVER_MANAGED to mimic IORESOURCE_SYSRAM_DRIVER_MANAGED

Let's add a flag that corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED,
indicating that we're dealing with a memory region that is never
indicated in the firmware-provided memory map, but always detected and
added by a driver.

Similar to MEMBLOCK_HOTPLUG, most infrastructure has to treat such
memory regions like ordinary MEMBLOCK_NONE memory regions -- for
example, when selecting memory regions to add to the vmcore for dumping
in the crashkernel via for_each_mem_range().

However, especially kexec_file is not supposed to select such memblocks
via for_each_free_mem_range() / for_each_free_mem_range_reverse() to
place kexec images, similar to how we handle
IORESOURCE_SYSRAM_DRIVER_MANAGED without CONFIG_ARCH_KEEP_MEMBLOCK.

We'll make sure that memory hotplug code sets the flag where applicable
(IORESOURCE_SYSRAM_DRIVER_MANAGED) next. This prepares architectures
that need CONFIG_ARCH_KEEP_MEMBLOCK, such as arm64, for virtio-mem
support.

Note that kexec *must not* indicate this memory to the second kernel and
*must not* place kexec-images on this memory. Let's add a comment to
kexec_walk_memblock(), documenting how we handle MEMBLOCK_DRIVER_MANAGED
now just like using IORESOURCE_SYSRAM_DRIVER_MANAGED in
locate_mem_hole_callback() for kexec_walk_resources().

Also note that MEMBLOCK_HOTPLUG cannot be reused due to different
semantics:
MEMBLOCK_HOTPLUG: memory is indicated as "System RAM" in the
firmware-provided memory map and added to the system early during
boot; kexec *has to* indicate this memory to the second kernel and
can place kexec-images on this memory. After memory hotunplug,
kexec has to be re-armed. We mostly ignore this flag when
"movable_node" is not set on the kernel command line, because
then we're told to not care about hotunpluggability of such
memory regions.

MEMBLOCK_DRIVER_MANAGED: memory is not indicated as "System RAM" in
the firmware-provided memory map; this memory is always detected
and added to the system by a driver; memory might not actually be
physically hotunpluggable. kexec *must not* indicate this memory to
the second kernel and *must not* place kexec-images on this memory.
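
A hedged usage sketch, anticipating the companion change that lets
callers pass flags when adding memory (the call site is illustrative):

/* driver-detected memory, never part of the firmware memory map */
memblock_add_node(start, size, nid, MEMBLOCK_DRIVER_MANAGED);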

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Jianyong Wu <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# 952eea9b 05-Nov-2021 David Hildenbrand <[email protected]>

memblock: allow to specify flags with memblock_add_node()

We want to specify flags when hotplugging memory. Let's prepare to pass
flags to memblock_add_node() by adjusting all existing users.

Note that when hotplugging memory the system is already up and running
and we might have concurrent memblock users: for example, while we're
hotplugging memory, kexec_file code might search for suitable memory
regions to place kexec images. It's important to add the memory
directly to memblock via a single call with the right flags, instead of
adding the memory first and applying flags later: otherwise, concurrent
memblock users might temporarily stumble over memblocks with wrong
flags, which will be important in a follow-up patch that introduces a
new flag to properly handle add_memory_driver_managed().
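
A hedged sketch of the adjusted call convention; existing callers keep
their old behavior by passing MEMBLOCK_NONE:

/* before: memblock_add_node(base, size, nid); */
memblock_add_node(base, size, nid, MEMBLOCK_NONE);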

Link: https://lkml.kernel.org/r/[email protected]
Acked-by: Geert Uytterhoeven <[email protected]>
Acked-by: Heiko Carstens <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Acked-by: Shahab Vahedi <[email protected]> [arch/arc]
Reviewed-by: Mike Rapoport <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Jianyong Wu <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# e14b4155 05-Nov-2021 David Hildenbrand <[email protected]>

memblock: improve MEMBLOCK_HOTPLUG documentation

The description of MEMBLOCK_HOTPLUG is currently short and consequently
misleading: we're actually dealing with a memory region that might get
hotunplugged later (i.e., the platform+firmware supports it), yet it is
indicated in the firmware-provided memory map as system RAM that will
just get used by the system for any purpose when not taking special
care. The firmware marked this memory region as hot(un)plugged (e.g.,
hotplugged before reboot), implying that it might get hotunplugged again
later.

Whether we consider this information depends on the "movable_node"
kernel command line parameter: only with "movable_node" set will we try
keeping this memory hotunpluggable, for example, by not serving early
allocations from this memory region and by letting the buddy allocator
manage it using ZONE_MOVABLE.

Let's make this clearer by extending the documentation.

Note: kexec *has to* indicate this memory to the second kernel. With
"movable_node" set, we don't want to place kexec-images on this memory.
Without "movable_node" set, we don't care and can place kexec-images on
this memory. In both cases, after successful memory hotunplug, kexec
has to be re-armed to update the memory map for the second kernel and to
place the kexec-images somewhere else.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Mike Rapoport <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Jianyong Wu <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# 4421cca0 05-Nov-2021 Mike Rapoport <[email protected]>

memblock: use memblock_free for freeing virtual pointers

Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer, so that memblock_free() will be a
counterpart of memblock_alloc().

The callers are updated with the below semantic patch and manual
addition of (void *) casts to pointers that are represented by
unsigned long variables.

@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)
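
A hedged sketch of the resulting symmetric pairing:

void *p = memblock_alloc(size, align);	/* returns a virtual pointer */
/* ... use p ... */
memblock_free(p, size);			/* frees by that same pointer */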

[[email protected]: fixup]
Link: https://lkml.kernel.org/r/[email protected]

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# 3ecc6834 05-Nov-2021 Mike Rapoport <[email protected]>

memblock: rename memblock_free to memblock_phys_free

Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().

The callers are updated with the below semantic patch:

@@
expression addr;
expression size;
@@
- memblock_free(addr, size);
+ memblock_phys_free(addr, size);
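
A hedged sketch of the physical-range counterpart pairing:

phys_addr_t pa = memblock_phys_alloc(size, align);
/* ... use the range ... */
memblock_phys_free(pa, size);		/* frees by physical address */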

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# 621d9739 05-Nov-2021 Mike Rapoport <[email protected]>

memblock: stop aliasing __memblock_free_late with memblock_free_late

memblock_free_late() is a NOP wrapper for __memblock_free_late(); there
is no point in keeping this indirection.

Drop the wrapper and rename __memblock_free_late() to
memblock_free_late().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


# fa277171 05-Nov-2021 Mike Rapoport <[email protected]>

memblock: drop memblock_free_early_nid() and memblock_free_early()

memblock_free_early_nid() is unused and memblock_free_early() is an
alias for memblock_free().

Replace calls to memblock_free_early() with calls to memblock_free() and
remove memblock_free_early() and memblock_free_early_nid().

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>


Revision tags: v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2
# 77e02cf5 14-Sep-2021 Linus Torvalds <[email protected]>

memblock: introduce saner 'memblock_free_ptr()' interface

The boot-time allocation interface for memblock is a mess, with
'memblock_alloc()' returning a virtual pointer, but then you are
supposed to free it with 'memblock_free()' that takes a _physical_
address.

Not only is that all kinds of strange and illogical, but it actually
causes bugs, when people then use it like a normal allocation function,
and it fails spectacularly on a NULL pointer:

https://lore.kernel.org/all/20210912140820.GD25450@xsang-OptiPlex-9020/

or just random memory corruption if the debug checks don't catch it:

https://lore.kernel.org/all/[email protected]/

I really don't want to apply patches that treat the symptoms, when the
fundamental cause is this horribly confusing interface.

I started out looking at just automating a sane replacement sequence,
but because of this mix of virtual and physical addresses, and because
people have used the "__pa()" macro that can take either a regular
kernel pointer, or just the raw "unsigned long" address, it's all quite
messy.

So this just introduces a new saner interface for freeing a virtual
address that was allocated using 'memblock_alloc()', and that was kept
as a regular kernel pointer. And then it converts a couple of users
that are obvious and easy to test, including the 'xbc_nodes' case in
lib/bootconfig.c that caused problems.
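
A hedged sketch of the asymmetry and the new interface:

void *p = memblock_alloc(size, align);	/* virtual pointer */
memblock_free(__pa(p), size);		/* old: free wants a physical address */
memblock_free_ptr(p, size);		/* new: free takes the pointer itself */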

Reported-by: kernel test robot <[email protected]>
Fixes: 40caa127f3c7 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Steven Rostedt <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
