History log of /linux-6.15/arch/x86/kernel/setup.c (Results 1 – 25 of 529)
Revision  Date  Author  Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7
# e120d1bc 13-Mar-2025 Mike Rapoport (Microsoft) <[email protected]>

arch, mm: set high_memory in free_area_init()

high_memory defines the upper bound on the directly mapped memory. This bound
is defined by the beginning of ZONE_HIGHMEM when a system has high memory
and by the end of memory otherwise.

All this is known to generic memory management initialization code that
can set high_memory while initializing core mm structures.

Add a generic calculation of high_memory to free_area_init() and remove
per-architecture calculation except for the architectures that set and use
high_memory earlier than that.
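
A minimal sketch of what the generic calculation can look like, assuming it
lives next to free_area_init() in mm/mm_init.c (the helper name and the
exact HIGHMEM clamp are illustrative, not the verbatim patch):

    static void __init set_high_memory(void)
    {
    	phys_addr_t highmem = memblock_end_of_DRAM();

    	/* Respect architectures that set and use high_memory earlier. */
    	if (high_memory)
    		return;

    #ifdef CONFIG_HIGHMEM
    	/* With high memory, the direct map ends where ZONE_HIGHMEM begins. */
    	if (highmem > PFN_PHYS(arch_zone_lowest_possible_pfn[ZONE_HIGHMEM]))
    		highmem = PFN_PHYS(arch_zone_lowest_possible_pfn[ZONE_HIGHMEM]);
    #endif

    	high_memory = phys_to_virt(highmem - 1) + 1;
    }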

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Acked-by: Dave Hansen <[email protected]> [x86]
Tested-by: Mark Brown <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Guo Ren (csky) <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1
# 7b54a96f 31-Jan-2025 Sourabh Jain <[email protected]>

crash: remove an unused argument from reserve_crashkernel_generic()

The cmdline argument is not used in reserve_crashkernel_generic(), so remove
it. Correspondingly, all the callers have been updated as well.

No functional change intended.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Sourabh Jain <[email protected]>
Acked-by: Hari Bathini <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mahesh Salgaonkar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 665eaf31 28-Feb-2025 Frank van der Linden <[email protected]>

x86/setup: call hugetlb_bootmem_alloc early

Call hugetlb_bootmem_alloc in an earlier spot in setup, after
hugetlb_cma_reserve. This will make vmemmap preinit of the sections
covered by the allocated hugetlb pages possible.
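
A hedged sketch of the resulting ordering in setup_arch(); surrounding code
is elided, and the hugetlb_cma_reserve() argument is the one setup.c already
passes:

    	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);

    	/*
    	 * Moved earlier: allocate boot-time hugetlb pages here so that
    	 * the vmemmap sections covering them can be preinitialized.
    	 */
    	hugetlb_bootmem_alloc();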

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Frank van der Linden <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Dan Carpenter <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Roman Gushchin (Cruise) <[email protected]>
Cc: Usama Arif <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Yu Zhao <[email protected]>
Cc: Zi Yan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# efe659ac 14-Feb-2025 Mike Rapoport (Microsoft) <[email protected]>

x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code

E820_TYPE_RESERVED_KERN is a relic from ancient history that was used
to early reserve setup_data, see:

28bb22379513 ("x86: move reserve_setup_data to setup.c")

Nowadays setup_data is anyway reserved in memblock and there is no point in
carrying E820_TYPE_RESERVED_KERN that behaves exactly like E820_TYPE_RAM
but only complicates the code.

A bonus for removing E820_TYPE_RESERVED_KERN is a small but measurable
speedup of 20 microseconds in init_mem_mapping() on a VM with 32GB of RAM.

Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Kees Cook <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# d45dd0a9 14-Feb-2025 Mike Rapoport (Microsoft) <[email protected]>

x86/boot: Split parsing of boot_params into the parse_boot_params() helper function

Makes setup_arch() a bit easier to comprehend.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 297fb82e 14-Feb-2025 Mike Rapoport (Microsoft) <[email protected]>

x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function

Makes setup_arch() a bit easier to comprehend.

No functional changes.

Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 2d6bff31 14-Feb-2025 Mike Rapoport (Microsoft) <[email protected]>

x86/boot: Move setting of memblock parameters to e820__memblock_setup()

Changing memblock parameters, namely bottom_up and the allocation upper
limit, has no effect before memblock initialization in
e820__memblock_setup().

Move the calls to memblock_set_bottom_up() and memblock_set_current_limit()
to e820__memblock_setup() to group all the memblock initial setup and make
setup_arch() more readable.

No functional changes.
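
A rough sketch of the grouping, with knob values mirroring what setup_arch()
used to do (not the verbatim patch):

    void __init e820__memblock_setup(void)
    {
    	/* These knobs only take effect once memblock is populated. */
    	memblock_set_current_limit(ISA_END_ADDRESS);
    	if (movable_node_is_enabled())
    		memblock_set_bottom_up(true);

    	/* ... feed the E820 RAM ranges into memblock as before ... */
    }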

Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# c305a4e9 18-Feb-2025 Joel Granados <[email protected]>

x86: Move sysctls into arch/x86

Move the following sysctl tables into arch/x86/kernel/setup.c:

panic_on_{unrecovered_nmi,io_nmi}
bootloader_{type,version}
io_delay_type
unknown_nmi_panic
acpi_realmode_flags

Variables moved from include/linux/ to arch/x86/include/asm/ because there
is no longer need for them outside arch/x86/kernel:

acpi_realmode_flags
panic_on_{unrecovered_nmi,io_nmi}

Include <asm/nmi.h> in arch/x86/kernel/setup.h in order to bring in
panic_on_{io_nmi,unrecovered_nmi}.

This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kernel/sysctl.c.
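
A hedged sketch of the destination shape in arch/x86/kernel/setup.c; one
entry is shown, and registration via register_sysctl_init() is the usual
pattern for such moves:

    static const struct ctl_table x86_sysctl_table[] = {
    	{
    		.procname	= "unknown_nmi_panic",
    		.data		= &unknown_nmi_panic,
    		.maxlen		= sizeof(int),
    		.mode		= 0644,
    		.proc_handler	= proc_dointvec,
    	},
    	/* bootloader_type, bootloader_version, io_delay_type, ... */
    };

    static int __init init_x86_sysctl(void)
    {
    	register_sysctl_init("kernel", x86_sysctl_table);
    	return 0;
    }
    arch_initcall(init_x86_sysctl);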

Signed-off-by: Joel Granados <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3
# ccd58205 12-Dec-2024 Guo Weikang <[email protected]>

mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference

The early_ioremap interface can fail and return NULL in certain cases. To
prevent NULL-pointer dereference crashes, fix issues in the acpi_extlog
and copy_early_mem interfaces, improving robustness when handling early
memory.
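
The shape of the fix, sketched for an early-copy path (the variable names
and error handling here are illustrative):

    	void *p = early_memremap(phys, size);

    	if (!p) {
    		pr_warn("failed to map early memory at %#llx\n",
    			(unsigned long long)phys);
    		return;		/* instead of dereferencing NULL below */
    	}
    	memcpy(dest, p, size);
    	early_memunmap(p, size);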

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Guo Weikang <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>
Cc: Julian Stecklina <[email protected]>
Cc: Kevin Loughlin <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Xin Li (Intel) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


Revision tags: v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10
# a97756cb 09-Jul-2024 Xin Li (Intel) <[email protected]>

x86/fred: Enable FRED right after init_mem_mapping()

On 64-bit init_mem_mapping() relies on the minimal page fault handler
provided by the early IDT mechanism. The real page fault handler is
installed right afterwards into the IDT.

This is problematic on CPUs which have X86_FEATURE_FRED set because the
real page fault handler retrieves the faulting address from the FRED
exception stack frame and not from CR2, but that obviously does not work
when FRED is not yet enabled in the CPU.

To prevent this, enable FRED right after init_mem_mapping() without
interrupt stacks. Those are enabled later in trap_init() after the CPU
entry area is set up.
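
A hedged sketch of the resulting ordering in setup_arch(); the real code
hides the details behind a helper per the tglx note below, so the open-coded
call site here is illustrative:

    	init_mem_mapping();

    	if (cpu_feature_enabled(X86_FEATURE_FRED))
    		/* FRED on, but without interrupt stacks; those come in
    		 * trap_init() once the CPU entry area is set up. */
    		cpu_init_fred_exceptions();
    	else
    		idt_setup_early_pf();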

[ tglx: Encapsulate the FRED details ]

Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
Reported-by: Hou Wenlong <[email protected]>
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Xin Li (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]


# bf514327 30-Jul-2024 Borislav Petkov (AMD) <[email protected]>

x86/setup: Parse the builtin command line before merging

Commit in Fixes was added as a catch-all for cases where the cmdline is
parsed before being merged with the builtin one.

And promptly one issue appeared, see Link below. The microcode loader
really needs to parse it that early, but the merging happens later.

Reshuffling the early boot nightmare^W code to handle that properly would
be a painful exercise for another day so do the chicken thing and parse the
builtin cmdline too before it has been merged.

Fixes: 0c40b1c7a897 ("x86/setup: Warn when option parsing is done too early")
Reported-by: Mike Lothian <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/20240730152108.GAZqkE5Dfi9AuKllRw@fat_crate.local
Link: https://lore.kernel.org/r/20240722152330.GCZp55ck8E_FT4kPnC@fat_crate.local


Revision tags: v6.10-rc7, v6.10-rc6, v6.10-rc5
# 37aee82c 20-Jun-2024 Ard Biesheuvel <[email protected]>

x86/efi: Drop support for fake EFI memory maps

Between kexec and confidential VM support, handling the EFI memory maps
correctly on x86 is already proving to be rather difficult (as opposed
to other EFI architectures which manage to never modify the EFI memory
map to begin with).

EFI fake memory map support is essentially a development hack (for
testing new support for the 'special purpose' and 'more reliable' EFI
memory attributes) that leaked into production code. The regions marked
in this manner are not actually recognized as such by the firmware
itself or the EFI stub (and never have been), and marking memory as 'more
reliable' seems rather futile if the underlying memory is just ordinary
RAM.

Marking memory as 'special purpose' in this way is also dubious, but may
be in use in production code nonetheless. However, the same should be
achievable by using the memmap= command line option with the ! operator.

EFI fake memmap support is not enabled by any of the major distros
(Debian, Fedora, SUSE, Ubuntu) and does not exist on other
architectures, so let's drop support for it.

Acked-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>


Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4
# 0c40b1c7 08-Apr-2024 Borislav Petkov (AMD) <[email protected]>

x86/setup: Warn when option parsing is done too early

Commit

4faa0e5d6d79 ("x86/boot: Move kernel cmdline setup earlier in the boot process (again)")

fixed an issue where cmdline parsing would happen before the final
boot_command_line string has been built from the builtin and boot
cmdlines and thus cmdline arguments would get lost.

Add a check to catch any future wrong use ordering so that such issues
can be caught in time.

Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/20240409152541.GCZhVd9XIPXyTNd9vc@fat_crate.local


Revision tags: v6.9-rc3, v6.9-rc2
# fdb022f6 25-Mar-2024 Baoquan He <[email protected]>

x86: remove unneeded memblock_find_dma_reserve()

Patch series "mm/mm_init.c: refactor free_area_init_core()".

In function free_area_init_core(), the code calculating
zone->managed_pages and subtracting dma_reserve from the DMA zone looks
very confusing.

From the git history, the code calculating zone->managed_pages was
originally for zone->present_pages. The early rough assignment was meant
to optimize the zone's pcp and watermark setting. Later, managed_pages was
introduced into zone to represent the number of pages managed by buddy.
Now, zone->managed_pages is zeroed out and reset in mem_init() when
calling memblock_free_all(). The zone's pcp and wmark setting, which rely
on the actual zone->managed_pages, are done later than the mem_init()
invocation. So there is no need to rush to calculate and set
zone->managed_pages early; just set it to zone->present_pages and adjust
it in mem_init().

Also add a new function, calc_nr_kernel_pages(), to count the free but
not reserved pages in memblock, then assign the result to nr_all_pages and
nr_kernel_pages after memmap pages are allocated.
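
A hedged sketch of the core loop of calc_nr_kernel_pages(); the real helper
also handles the low-memory split, this shows only the counting:

    void __init calc_nr_kernel_pages(void)
    {
    	unsigned long start_pfn, end_pfn;
    	phys_addr_t start, end;
    	u64 i;

    	/* Count pages that are free and not reserved in memblock. */
    	for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
    				&start, &end, NULL) {
    		start_pfn = PFN_UP(start);
    		end_pfn = PFN_DOWN(end);
    		if (start_pfn < end_pfn)
    			nr_all_pages += end_pfn - start_pfn;
    	}
    	nr_kernel_pages = nr_all_pages;
    }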


This patch (of 6):

The variable dma_reserve and its usage were introduced in commit
0e0b864e069c ("[PATCH] Account for memmap and optionally the kernel image
as holes"). Its original purpose was to account for the reserved pages in
the DMA zone to make the DMA zone's watermark calculation more accurate
on x86.

However, there is currently zone->managed_pages to account for all pages
available to buddy, and zone->present_pages to account for all present
physical pages in a zone. More importantly, on x86, calculating and
setting zone->managed_pages early is a temporary move: every zone's
managed_pages will be zeroed out and reset to the actual value according
to how many pages are added to the buddy allocator in mem_init(). Before
mem_init(), no buddy allocation is requested. And the zone's pcp and
watermark setting are all done after mem_init(). So there is no need to
worry about the DMA zone's setting accuracy during free_area_init().

Hence, remove memblock_find_dma_reserve() to stop calculating and
setting dma_reserve.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Baoquan He <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 99485c4c 26-Mar-2024 Jason A. Donenfeld <[email protected]>

x86/coco: Require seeding RNG with RDRAND on CoCo systems

There are few uses of CoCo that don't rely on working cryptography and
hence a working RNG. Unfortunately, the CoCo threat model means that the
VM host cannot be trusted and may actively work against guests to
extract secrets or manipulate computation. Since a malicious host can
modify or observe nearly all inputs to guests, the only remaining source
of entropy for CoCo guests is RDRAND.

If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole
is meant to gracefully continue on gathering entropy from other sources,
but since there aren't other sources on CoCo, this is catastrophic.
This is mostly a concern at boot time when initially seeding the RNG, as
after that the consequences of a broken RDRAND are much more
theoretical.

So, try at boot to seed the RNG using 256 bits of RDRAND output. If this
fails, panic(). This will also trigger if the system is booted without
RDRAND, as RDRAND is essential for a safe CoCo boot.

Add this deliberately to be "just a CoCo x86 driver feature" and not
part of the RNG itself. Many device drivers and platforms have some
desire to contribute something to the RNG, and add_device_randomness()
is specifically meant for this purpose.

Any driver can call it with seed data of any quality, or even garbage
quality, and it can only possibly make the quality of the RNG better or
have no effect, but can never make it worse.

Rather than trying to build something into the core of the RNG, consider
the particular CoCo issue just a CoCo issue, and therefore separate it
all out into driver (well, arch/platform) code.
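
A hedged sketch of the boot-time seeding; the helper name is illustrative,
while rdrand_long() and add_device_randomness() are the primitives the
commit message points at:

    static void __init coco_seed_rng(void)
    {
    	unsigned long seed[32 / sizeof(long)];	/* 256 bits */
    	int i;

    	for (i = 0; i < ARRAY_SIZE(seed); i++) {
    		/* No host-independent fallback exists under CoCo. */
    		if (!rdrand_long(&seed[i]))
    			panic("RDRAND is defective.");
    	}
    	add_device_randomness(seed, sizeof(seed));
    	memzero_explicit(seed, sizeof(seed));
    }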

[ bp: Massage commit message. ]

Signed-off-by: Jason A. Donenfeld <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Elena Reshetova <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Theodore Ts'o <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]


# 4faa0e5d 28-Mar-2024 Julian Stecklina <[email protected]>

x86/boot: Move kernel cmdline setup earlier in the boot process (again)

When split_lock_detect=off (or similar) is specified in
CONFIG_CMDLINE, its effect is lost. The flow is currently this:

setup_arch():
-> early_cpu_init()
-> early_identify_cpu()
-> sld_setup()
-> sld_state_setup()
-> Looks for split_lock_detect in boot_command_line

-> e820__memory_setup()

-> Assemble final command line:
boot_command_line = builtin_cmdline + boot_cmdline

-> parse_early_param()

There were earlier attempts at fixing this in:

8d48bf8206f7 ("x86/boot: Pull up cmdline preparation and early param parsing")

later reverted in:

fbe618399854 ("Revert "x86/boot: Pull up cmdline preparation and early param parsing"")

... because parse_early_param() can't be called before
e820__memory_setup().

In this patch, we just move the command line concatenation to the
beginning of early_cpu_init(). This should fix sld_state_setup(), while
not running into the same issues as the earlier attempt.

The order is now:

setup_arch():
-> Assemble final command line:
boot_command_line = builtin_cmdline + boot_cmdline

-> early_cpu_init()
-> early_identify_cpu()
-> sld_setup()
-> sld_state_setup()
-> Looks for split_lock_detect in boot_command_line

-> e820__memory_setup()

-> parse_early_param()
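
For reference, the concatenation step itself is the long-standing sequence
in setup_arch(); a sketch (the string helpers vary slightly by kernel
version):

    #ifdef CONFIG_CMDLINE_BOOL
    #ifdef CONFIG_CMDLINE_OVERRIDE
    	strscpy(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
    #else
    	if (builtin_cmdline[0]) {
    		/* append boot loader cmdline to builtin */
    		strlcat(builtin_cmdline, " ", COMMAND_LINE_SIZE);
    		strlcat(builtin_cmdline, boot_command_line, COMMAND_LINE_SIZE);
    		strscpy(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
    	}
    #endif
    #endif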

Signed-off-by: Julian Stecklina <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]


Revision tags: v6.9-rc1
# 0f4a1e80 13-Mar-2024 Kevin Loughlin <[email protected]>

x86/sev: Skip ROM range scans and validation for SEV-SNP guests

SEV-SNP requires encrypted memory to be validated before access.
Because the ROM memory range is not part of the e820 table, it is not
pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes
to access this range, the guest must first validate the range.

The current SEV-SNP code does indeed scan the ROM range during early
boot and thus attempts to validate the ROM range in probe_roms().
However, this behavior is neither sufficient nor necessary for the
following reasons:

* With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and
CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will
attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which
falls in the ROM range) prior to validation.

For example, Project Oak Stage 0 provides a minimal guest firmware
that currently meets these configuration conditions, meaning guests
booting atop Oak Stage 0 firmware encounter a problematic call chain
during dmi_setup() -> dmi_scan_machine() that results in a crash
during boot if SEV-SNP is enabled.

* With regards to necessity, SEV-SNP guests generally read garbage
(which changes across boots) from the ROM range, meaning these scans
are unnecessary. The guest reads garbage because the legacy ROM range
is unencrypted data but is accessed via an encrypted PMD during early
boot (where the PMD is marked as encrypted due to potentially mapping
actually-encrypted data in other PMD-contained ranges).

In one exceptional case, EISA probing treats the ROM range as
unencrypted data, which is inconsistent with other probing.

Continuing to allow SEV-SNP guests to use garbage and to inconsistently
classify ROM range encryption status can trigger undesirable behavior.
For instance, if garbage bytes appear to be a valid signature, memory
may be unnecessarily reserved for the ROM range. Future code or other
use cases may result in more problematic (arbitrary) behavior that
should be avoided.

While one solution would be to overhaul the early PMD mapping to always
treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not
currently rely on data from the ROM region during early boot (and even
if they did, they would be mostly relying on garbage data anyways).

As a simpler solution, skip the ROM range scans (and the otherwise-
necessary range validation) during SEV-SNP guest early boot. The
potential SEV-SNP guest crash due to lack of ROM range validation is
thus avoided by simply not accessing the ROM range.

In most cases, skip the scans by overriding problematic x86_init
functions during sme_early_init() to SNP-safe variants, which can be
likened to x86_init overrides done for other platforms (ex: Xen); such
overrides also avoid the spread of cc_platform_has() checks throughout
the tree.

In the exceptional EISA case, still use cc_platform_has() for the
simplest change, given (1) checks for guest type (ex: Xen domain status)
are already performed here, and (2) these checks occur in a subsys
initcall instead of an x86_init function.
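
A hedged sketch of the x86_init override pattern described above, placed in
sme_early_init(); the hook and the no-op helper exist, the guard shown is
illustrative:

    void __init sme_early_init(void)
    {
    	/* ... existing SME/SEV early setup ... */

    	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
    		/* Skip ROM range scans, and thus their validation. */
    		x86_init.resources.probe_roms = x86_init_noop;
    }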

[ bp: Massage commit message, remove "we"s. ]

Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active")
Signed-off-by: Kevin Loughlin <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# c90399fb 22-Mar-2024 Thomas Gleixner <[email protected]>

x86/cpu: Ensure that CPU info updates are propagated on UP

The boot sequence evaluates CPUID information twice:

1) During early boot

2) When finalizing the early setup right before
mitigations are selected and alternatives are patched.

In both cases the evaluation is stored in boot_cpu_data, but on UP the
copying of boot_cpu_data to the per CPU info of the boot CPU happens
between #1 and #2. So any update which happens in #2 is never propagated to
the per CPU info instance.

Consolidate the whole logic and copy boot_cpu_data right before applying
alternatives as that's the point where boot_cpu_data is in its final
state and not supposed to change anymore.

This also removes the voodoo mb() from smp_prepare_cpus_common() which
had absolutely no purpose.

Fixes: 71eb4893cfaf ("x86/percpu: Cure per CPU madness on UP")
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# edf66a3c 18-Mar-2024 Rafael J. Wysocki <[email protected]>

x86/cpu: Move leftover contents of topology.c to setup.c

The only useful piece of arch/x86/kernel/topology.c is the definition
of arch_cpu_is_hotpluggable() that can be moved elsewhere (other
architectures tend to put it into setup.c), so do that and delete
the rest of the file.
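
The moved definition is small; a hedged sketch of its likely shape in
setup.c:

    #ifdef CONFIG_HOTPLUG_CPU
    bool arch_cpu_is_hotpluggable(int cpu)
    {
    	/* Every CPU except the boot CPU is hotpluggable on x86. */
    	return cpu > 0;
    }
    #endif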

Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/12422874.O9o76ZdvQC@kreacher


Revision tags: v6.8
# 71eb4893 04-Mar-2024 Thomas Gleixner <[email protected]>

x86/percpu: Cure per CPU madness on UP

On UP builds Sparse complains rightfully about accesses to cpu_info with
per CPU accessors:

cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
cacheinfo.c:282:30: sparse: expected void const [noderef] __percpu *__vpp_verify
cacheinfo.c:282:30: sparse: got unsigned int *

The reason is that on UP builds cpu_info, which is a per CPU variable on
SMP, is mapped to boot_cpu_data, which is a regular variable. There is a
hideous accessor cpu_data() which tries to hide this, but it's not
sufficient as some places require raw accessors and it generates worse
code than the regular per CPU accessors.

Waste sizeof(struct cpuinfo_x86) of memory on UP and provide the per CPU
cpu_info unconditionally. This requires updating the CPU info on the boot
CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function
and implement exactly that.

This allows using regular per CPU accessors unconditionally and paves the
way to remove the cpu_data() hackery.
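
A hedged sketch of the two pieces described above (declarations
abbreviated):

    /* cpu_info becomes an unconditional per CPU variable ... */
    DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
    EXPORT_PER_CPU_SYMBOL(cpu_info);

    #ifndef CONFIG_SMP
    /* ... and the UP boot path syncs it the way SMP already does. */
    void __init smp_prepare_boot_cpu(void)
    {
    	per_cpu(cpu_info, 0) = boot_cpu_data;
    }
    #endif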

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2
# a4eeb217 24-Jan-2024 Baoquan He <[email protected]>

x86, crash: wrap crash dumping code into crash related ifdefs

Now that the crash code under the kernel/ folder has been split out from
the kexec code, crash dumping can be separated from kexec reboot in config
items on x86 with some adjustments.

Here, also change some ifdefs or IS_ENABLED() checks to more appropriate
ones, e.g.:
- #ifdef CONFIG_KEXEC_CORE -> #ifdef CONFIG_CRASH_DUMP
- (!IS_ENABLED(CONFIG_KEXEC_CORE)) -> (!IS_ENABLED(CONFIG_CRASH_RESERVE))

[[email protected]: don't nest CONFIG_CRASH_DUMP ifdef inside CONFIG_KEXEC_CORE ifdef scope]
Link: https://lore.kernel.org/all/SN6PR02MB4157931105FA68D72E3D3DB8D47B2@SN6PR02MB4157.namprd02.prod.outlook.com/T/#u
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Baoquan He <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Hari Bathini <[email protected]>
Cc: Pingfan Liu <[email protected]>
Cc: Klara Modin <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Yang Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# 7c0edad3 13-Feb-2024 Thomas Gleixner <[email protected]>

x86/cpu/topology: Rework possible CPU management

Managing possible CPUs is an unreadable and incomprehensible maze. Aside
from that, it's backwards because it applies command line limits after
registering all APICs.

Rewrite it so that it:

- Applies the command line limits upfront so that only the allowed number
of APIC IDs can be registered.

- Applies eventual late restrictions in an understandable way

- Uses simple min_t() calculations which are trivial to follow.

- Provides a separate function for resetting to UP mode late in the
bringup process.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# de6aec24 13-Feb-2024 Thomas Gleixner <[email protected]>

x86/mm/numa: Move early mptable evaluation into common code

There is no reason to have the early mptable evaluation conditionally
invoked only from the AMD numa topology code.

Make it explicit and invoke it from setup_arch() right after the
corresponding ACPI init call. Remove the pointless wrapper and invoke
x86_init::mpparse::early_parse_smp_config() directly.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# dcb76008 13-Feb-2024 Thomas Gleixner <[email protected]>

x86/mpparse: Switch to new init callbacks

Now that all platforms have the new split SMP configuration callbacks set
up, flip the switch and remove the old callback pointer and mop up the
platform code.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 5faf8ec7 13-Feb-2024 Thomas Gleixner <[email protected]>

x86/dtb: Rename x86_dtb_init()

x86_dtb_init() is a misnomer and it really should be used as an SMP
configuration parser which is selected by the platform via
x86_init::mpparse::parse_smp_config().

Rename it to x86_dtb_parse_smp_config() in preparation for that.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
