|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7 |
|
| #
e120d1bc |
| 13-Mar-2025 |
Mike Rapoport (Microsoft) <[email protected]> |
arch, mm: set high_memory in free_area_init()
high_memory defines the upper bound on directly mapped memory. This bound is defined by the beginning of ZONE_HIGHMEM when a system has high memory, and by the end of memory otherwise.
All of this is known to the generic memory management initialization code, which can set high_memory while initializing core mm structures.
Add a generic calculation of high_memory to free_area_init() and remove the per-architecture calculations, except for the architectures that set and use high_memory earlier than that.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Acked-by: Dave Hansen <[email protected]> [x86] Tested-by: Mark Brown <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Andreas Larsson <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: David S. Miller <[email protected]> Cc: Dinh Nguyen <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Guo Ren (csky) <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiaxun Yang <[email protected]> Cc: Johannes Berg <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
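A minimal sketch of the generic calculation described above, assuming the helper is named set_high_memory() and can reuse free_area_init()'s arch_zone_lowest_possible_pfn[] array (illustrative, not the literal diff):

  /* Sketch: called from free_area_init() once zone boundaries are known. */
  static void __init set_high_memory(void)
  {
      phys_addr_t highmem = memblock_end_of_DRAM();

      /* Architectures that set and use high_memory earlier keep theirs. */
      if (high_memory)
          return;

  #ifdef CONFIG_HIGHMEM
      /* With high memory, the direct map ends where ZONE_HIGHMEM begins. */
      highmem = min_t(phys_addr_t, highmem,
                      PFN_PHYS(arch_zone_lowest_possible_pfn[ZONE_HIGHMEM]));
  #endif
      high_memory = phys_to_virt(highmem - 1) + 1;
  }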
|
|
Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1 |
|
| #
7b54a96f |
| 31-Jan-2025 |
Sourabh Jain <[email protected]> |
crash: remove an unused argument from reserve_crashkernel_generic()
The cmdline argument is not used in reserve_crashkernel_generic(), so remove it. Correspondingly, all the callers have been updated as well.
No functional change intended.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sourabh Jain <[email protected]> Acked-by: Hari Bathini <[email protected]> Acked-by: Baoquan He <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mahesh Salgaonkar <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
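The shape of the change, as a hedged sketch (the prototype below is paraphrased from the generic crash reservation code; exact types may differ):

  /* Before: cmdline was accepted but never read. */
  void __init reserve_crashkernel_generic(char *cmdline,
                                          unsigned long long crash_size,
                                          unsigned long long crash_base,
                                          unsigned long long crash_low_size,
                                          bool high);

  /* After: the unused parameter is gone and callers drop the argument. */
  void __init reserve_crashkernel_generic(unsigned long long crash_size,
                                          unsigned long long crash_base,
                                          unsigned long long crash_low_size,
                                          bool high);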
|
| #
665eaf31 |
| 28-Feb-2025 |
Frank van der Linden <[email protected]> |
x86/setup: call hugetlb_bootmem_alloc early
Call hugetlb_bootmem_alloc() at an earlier spot in setup, after hugetlb_cma_reserve(). This makes vmemmap preinit of the sections covered by the allocated hugetlb pages possible.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Frank van der Linden <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Joao Martins <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Muchun Song <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Roman Gushchin (Cruise) <[email protected]> Cc: Usama Arif <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Yu Zhao <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
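Roughly, the resulting ordering in setup_arch() (a sketch based on the message; the surrounding call and its argument are assumptions):

  /* Reserve gigantic-page CMA first ... */
  hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
  /*
   * ... then allocate boot-time hugetlb pages early enough that vmemmap
   * preinit can cover the sections they occupy.
   */
  hugetlb_bootmem_alloc();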
|
| #
efe659ac |
| 14-Feb-2025 |
Mike Rapoport (Microsoft) <[email protected]> |
x86/e820: Drop obsolete E820_TYPE_RESERVED_KERN and related code
E820_TYPE_RESERVED_KERN is a relic from ancient history that was used to reserve setup_data early, see:
28bb22379513 ("x86: move reserve_setup_data to setup.c")
Nowadays setup_data is reserved in memblock anyway, so there is no point in carrying E820_TYPE_RESERVED_KERN, which behaves exactly like E820_TYPE_RAM but only complicates the code.
A bonus of removing E820_TYPE_RESERVED_KERN is a small but measurable speedup of 20 microseconds in init_mem_mapping() on a VM with 32 GB of RAM.
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
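Illustrative sketch of why the type was pure overhead: every "is this RAM?" check had to spell it out twice (condition shape assumed, not the literal diff):

  /* Before: two e820 types meaning "usable RAM". */
  if (entry->type == E820_TYPE_RAM ||
      entry->type == E820_TYPE_RESERVED_KERN)
      memblock_add(entry->addr, entry->size);

  /* After: a single type, a single check. */
  if (entry->type == E820_TYPE_RAM)
      memblock_add(entry->addr, entry->size);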
|
| #
d45dd0a9 |
| 14-Feb-2025 |
Mike Rapoport (Microsoft) <[email protected]> |
x86/boot: Split parsing of boot_params into the parse_boot_params() helper function
Makes setup_arch() a bit easier to comprehend.
No functional changes.
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
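A hedged sketch of the refactoring pattern; the fields shown are real boot_params consumers in setup_arch(), but the exact set moved into the helper is an assumption:

  static void __init parse_boot_params(void)
  {
      ROOT_DEV = old_decode_dev(boot_params.hdr.root_dev);
      screen_info = boot_params.screen_info;
      edid_info = boot_params.edid_info;
      saved_video_mode = boot_params.hdr.vid_mode;
      bootloader_type = boot_params.hdr.type_of_loader;
      /* ... remaining boot_params consumers ... */
  }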
|
| #
297fb82e |
| 14-Feb-2025 |
Mike Rapoport (Microsoft) <[email protected]> |
x86/boot: Split kernel resources setup into the setup_kernel_resources() helper function
Makes setup_arch() a bit easier to comprehend.
No functional changes.
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
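Similarly, a sketch of what such a helper would collect; the kernel text/data/bss resources below really are inserted in setup_arch(), though the helper body is assumed:

  static void __init setup_kernel_resources(void)
  {
      code_resource.start = __pa_symbol(_text);
      code_resource.end   = __pa_symbol(_etext) - 1;
      data_resource.start = __pa_symbol(_sdata);
      data_resource.end   = __pa_symbol(_edata) - 1;
      bss_resource.start  = __pa_symbol(__bss_start);
      bss_resource.end    = __pa_symbol(__bss_stop) - 1;

      insert_resource(&iomem_resource, &code_resource);
      insert_resource(&iomem_resource, &data_resource);
      insert_resource(&iomem_resource, &bss_resource);
  }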
|
| #
2d6bff31 |
| 14-Feb-2025 |
Mike Rapoport (Microsoft) <[email protected]> |
x86/boot: Move setting of memblock parameters to e820__memblock_setup()
Changing memblock parameters, namely bottom_up and the allocation upper limit, has no effect before memblock initialization in e820__memblock_setup().
Move the calls to memblock_set_bottom_up() and memblock_set_current_limit() to e820__memblock_setup() to group all the memblock initial setup and make setup_arch() more readable.
No functional changes.
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
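Sketch of the grouping (the movable_node trigger for bottom-up allocation and ISA_END_ADDRESS as the early limit match the long-standing x86 setup code, but the exact body is assumed):

  void __init e820__memblock_setup(void)
  {
      int i;

      /* Grouped here: these knobs only matter once memblock is populated. */
      if (movable_node_is_enabled())
          memblock_set_bottom_up(true);
      memblock_set_current_limit(ISA_END_ADDRESS);

      for (i = 0; i < e820_table->nr_entries; i++) {
          struct e820_entry *entry = &e820_table->entries[i];

          if (entry->type == E820_TYPE_RAM)
              memblock_add(entry->addr, entry->size);
      }

      memblock_trim_memory(PAGE_SIZE);
      memblock_dump_all();
  }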
|
| #
c305a4e9 |
| 18-Feb-2025 |
Joel Granados <[email protected]> |
x86: Move sysctls into arch/x86
Move the following sysctl tables into arch/x86/kernel/setup.c:
panic_on_{unrecovered_nmi,io_nmi}
bootloader_{type,version}
io_delay_type
unknown_nmi_panic
acpi_realmode_flags
Variables are moved from include/linux/ to arch/x86/include/asm/ because there is no longer a need for them outside arch/x86/kernel:
acpi_realmode_flags
panic_on_{unrecovered_nmi,io_nmi}
Include <asm/nmi.h> in arch/x86/kernel/setup.h in order to bring in panic_on_{io_nmi,unrecovered_nmi}.
This is part of a greater effort to move ctl tables into their respective subsystems, which will reduce the merge conflicts in kernel/sysctl.c.
Signed-off-by: Joel Granados <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
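The mechanics, sketched: a local ctl_table array in arch/x86/kernel/setup.c registered via register_sysctl_init() (entries abridged; the initcall name is an assumption):

  static const struct ctl_table x86_sysctl_table[] = {
      {
          .procname     = "unknown_nmi_panic",
          .data         = &unknown_nmi_panic,
          .maxlen       = sizeof(int),
          .mode         = 0644,
          .proc_handler = proc_dointvec,
      },
      {
          .procname     = "io_delay_type",
          .data         = &io_delay_type,
          .maxlen       = sizeof(int),
          .mode         = 0644,
          .proc_handler = proc_dointvec,
      },
      /* panic_on_*, bootloader_*, acpi_realmode_flags entries elided */
  };

  static int __init init_x86_sysctl(void)
  {
      register_sysctl_init("kernel", x86_sysctl_table);
      return 0;
  }
  arch_initcall(init_x86_sysctl);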
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3 |
|
| #
ccd58205 |
| 12-Dec-2024 |
Guo Weikang <[email protected]> |
mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereference
The early_ioremap interface can fail and return NULL in certain cases. To prevent NULL-pointer dereference crashes, fix the issues in the acpi_extlog and copy_early_mem interfaces, improving robustness when handling early memory.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Guo Weikang <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Julian Stecklina <[email protected]> Cc: Kevin Loughlin <[email protected]> Cc: Len Brown <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Xin Li (Intel) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
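The pattern being added, sketched against an early_memremap()-style user (variable names illustrative):

  void *p = early_memremap(phys, size);

  /* early_ioremap()/early_memremap() may return NULL; never dereference it. */
  if (!p) {
      pr_warn("early_memremap(%pa, %lu) failed\n", &phys, size);
      return;
  }
  memcpy(dest, p, size);
  early_memunmap(p, size);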
|
|
Revision tags: v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10 |
|
| #
a97756cb |
| 09-Jul-2024 |
Xin Li (Intel) <[email protected]> |
x86/fred: Enable FRED right after init_mem_mapping()
On 64-bit, init_mem_mapping() relies on the minimal page fault handler provided by the early IDT mechanism. The real page fault handler is installed into the IDT right afterwards.
This is problematic on CPUs which have X86_FEATURE_FRED set, because the real page fault handler retrieves the faulting address from the FRED exception stack frame and not from CR2, but that obviously does not work when FRED is not yet enabled in the CPU.
To prevent this, enable FRED right after init_mem_mapping(), but without interrupt stacks. Those are enabled later in trap_init(), after the CPU entry area is set up.
[ tglx: Encapsulate the FRED details ]
Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code") Reported-by: Hou Wenlong <[email protected]> Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Xin Li (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
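In setup_arch() terms, the ordering reads roughly like this (treating cpu_init_fred_exceptions() as the encapsulated helper tglx mentions is an assumption):

  init_mem_mapping();

  /*
   * Leave the early IDT #PF handler now: with FRED, the real handler
   * reads the faulting address from the FRED stack frame, not CR2, so
   * FRED must be enabled first. Interrupt stacks follow in trap_init(),
   * once the CPU entry area exists.
   */
  if (cpu_feature_enabled(X86_FEATURE_FRED))
      cpu_init_fred_exceptions();
  else
      idt_setup_early_pf();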
|
| #
bf514327 |
| 30-Jul-2024 |
Borislav Petkov (AMD) <[email protected]> |
x86/setup: Parse the builtin command line before merging
Commit in Fixes was added as a catch-all for cases where the cmdline is parsed before being merged with the builtin one.
And promptly one issue appeared, see Link below. The microcode loader really needs to parse it that early, but the merging happens later.
Reshuffling the early boot nightmare^W code to handle that properly would be a painful exercise for another day so do the chicken thing and parse the builtin cmdline too before it has been merged.
Fixes: 0c40b1c7a897 ("x86/setup: Warn when option parsing is done too early") Reported-by: Mike Lothian <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/20240730152108.GAZqkE5Dfi9AuKllRw@fat_crate.local Link: https://lore.kernel.org/r/20240722152330.GCZp55ck8E_FT4kPnC@fat_crate.local
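A sketch of "the chicken thing" (whether the fix reuses parse_early_options() is an assumption; builtin_cmdline is the real static buffer in setup.c):

  #ifdef CONFIG_CMDLINE_BOOL
      /*
       * Parse the builtin command line on its own before the merge, so
       * early consumers such as the microcode loader see its options too.
       */
      parse_early_options(builtin_cmdline);
  #endif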
|
|
Revision tags: v6.10-rc7, v6.10-rc6, v6.10-rc5 |
|
| #
37aee82c |
| 20-Jun-2024 |
Ard Biesheuvel <[email protected]> |
x86/efi: Drop support for fake EFI memory maps
Between kexec and confidential VM support, handling the EFI memory maps correctly on x86 is already proving to be rather difficult (as opposed to other EFI architectures, which manage to never modify the EFI memory map to begin with).
EFI fake memory map support is essentially a development hack (for testing new support for the 'special purpose' and 'more reliable' EFI memory attributes) that leaked into production code. The regions marked in this manner are not actually recognized as such by the firmware itself or the EFI stub (and never have), and marking memory as 'more reliable' seems rather futile if the underlying memory is just ordinary RAM.
Marking memory as 'special purpose' in this way is also dubious, but may be in use in production code nonetheless. However, the same should be achievable by using the memmap= command line option with the ! operator.
EFI fake memmap support is not enabled by any of the major distros (Debian, Fedora, SUSE, Ubuntu) and does not exist on other architectures, so let's drop support for it.
Acked-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Dan Williams <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
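For reference, the documented alternative alluded to is the memmap= kernel parameter with the ! operator; e.g. memmap=4G!8G marks the 4 GB of address space starting at 8 GB as protected (persistent-memory-like) rather than ordinary RAM.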
|
|
Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4 |
|
| #
0c40b1c7 |
| 08-Apr-2024 |
Borislav Petkov (AMD) <[email protected]> |
x86/setup: Warn when option parsing is done too early
Commit
4faa0e5d6d79 ("x86/boot: Move kernel cmdline setup earlier in the boot process (again)")
fixed an issue where cmdline parsing would happen before the final boot_command_line string had been built from the builtin and boot cmdlines, and thus cmdline arguments would get lost.
Add a check to catch any future wrong use ordering so that such issues can be caught in time.
Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/20240409152541.GCZhVd9XIPXyTNd9vc@fat_crate.local
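A hedged sketch of the catch-all (all names illustrative; the real warning site and flag may differ):

  static bool cmdline_is_final __initdata;    /* set once the merge is done */

  /* In setup_arch(), right after builtin and boot cmdlines are merged: */
  cmdline_is_final = true;

  /* In the cmdline_find_option*() helpers: */
  WARN_ONCE(!cmdline_is_final,
            "Option parsing done too early, options may be lost\n");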
|
|
Revision tags: v6.9-rc3, v6.9-rc2 |
|
| #
fdb022f6 |
| 25-Mar-2024 |
Baoquan He <[email protected]> |
x86: remove unneeded memblock_find_dma_reserve()
Patch series "mm/mm_init.c: refactor free_area_init_core()".
In free_area_init_core(), the code calculating zone->managed_pages and subtracting dma_reserve from the DMA zone looks very confusing.
From the git history, the code calculating zone->managed_pages was originally for zone->present_pages. The early rough assignment was there to optimize the zone's pcp and watermark setup. Later, managed_pages was introduced into struct zone to represent the number of pages managed by the buddy allocator. Now, zone->managed_pages is zeroed out and reset in mem_init() when memblock_free_all() is called, and the zone's pcp and watermark setup, which relies on the actual zone->managed_pages, happens later than the mem_init() invocation. So there is no need to rush to calculate and set zone->managed_pages early; just set it to zone->present_pages and adjust it in mem_init().
Also add a new function, calc_nr_kernel_pages(), to count free but not reserved pages in memblock, and assign the result to nr_all_pages and nr_kernel_pages after memmap pages are allocated.
This patch (of 6):
The variable dma_reserve and its usage were introduced in commit 0e0b864e069c ("[PATCH] Account for memmap and optionally the kernel image as holes"). Its original purpose was to account for the reserved pages in the DMA zone, to make the DMA zone's watermark calculation more accurate on x86.
However, currently there is zone->managed_pages to account for all pages available to the buddy allocator, and zone->present_pages to account for all present physical pages in a zone. More importantly, on x86, calculating and setting zone->managed_pages early is a temporary move: every zone's managed_pages will be zeroed out and reset to the actual value according to how many pages are added to the buddy allocator in mem_init(). Before mem_init(), no buddy allocation is requested. And the zones' pcp and watermark setup is all done after mem_init(). So there is no need to worry about the accuracy of the DMA zone's setting during free_area_init().
Hence, remove memblock_find_dma_reserve() to stop calculating and setting dma_reserve.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Mike Rapoport (IBM) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
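The replacement accounting mentioned in the series description, sketched with the memblock iterator (a simplified calc_nr_kernel_pages(); the real version additionally clamps nr_kernel_pages below ZONE_HIGHMEM):

  static void __init calc_nr_kernel_pages(void)
  {
      unsigned long start_pfn, end_pfn;
      phys_addr_t start, end;
      u64 u;

      /* Count pages that are free, i.e. not reserved, in memblock. */
      for_each_free_mem_range(u, NUMA_NO_NODE, MEMBLOCK_NONE,
                              &start, &end, NULL) {
          start_pfn = PFN_UP(start);
          end_pfn = PFN_DOWN(end);
          if (start_pfn < end_pfn)
              nr_all_pages += end_pfn - start_pfn;
      }
      nr_kernel_pages = nr_all_pages;    /* highmem clamping omitted here */
  }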
|
| #
99485c4c |
| 26-Mar-2024 |
Jason A. Donenfeld <[email protected]> |
x86/coco: Require seeding RNG with RDRAND on CoCo systems
There are few uses of CoCo that don't rely on working cryptography and hence a working RNG. Unfortunately, the CoCo threat model means that the VM host cannot be trusted and may actively work against guests to extract secrets or manipulate computation. Since a malicious host can modify or observe nearly all inputs to guests, the only remaining source of entropy for CoCo guests is RDRAND.
If RDRAND is broken -- due to CPU hardware fault -- the RNG as a whole is meant to gracefully continue on gathering entropy from other sources, but since there aren't other sources on CoCo, this is catastrophic. This is mostly a concern at boot time when initially seeding the RNG, as after that the consequences of a broken RDRAND are much more theoretical.
So, try at boot to seed the RNG using 256 bits of RDRAND output. If this fails, panic(). This will also trigger if the system is booted without RDRAND, as RDRAND is essential for a safe CoCo boot.
Add this deliberately to be "just a CoCo x86 driver feature" and not part of the RNG itself. Many device drivers and platforms have some desire to contribute something to the RNG, and add_device_randomness() is specifically meant for this purpose.
Any driver can call it with seed data of any quality, or even garbage quality, and it can only possibly make the quality of the RNG better or have no effect, but can never make it worse.
Rather than trying to build something into the core of the RNG, consider the particular CoCo issue just a CoCo issue, and therefore separate it all out into driver (well, arch/platform) code.
[ bp: Massage commit message. ]
Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Elena Reshetova <[email protected]> Reviewed-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Theodore Ts'o <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
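A sketch of the boot-time seeding described above (the retry loop and its bound are assumptions; rdrand_long() is the arch helper that reports RDRAND success):

  unsigned long seed[32 / sizeof(long)];    /* 256 bits */
  int i, tries;

  for (i = 0; i < ARRAY_SIZE(seed); i++) {
      for (tries = 0; tries < 10; tries++)    /* RDRAND may fail transiently */
          if (rdrand_long(&seed[i]))
              break;
      if (tries == 10)
          panic("RDRAND unusable: cannot seed the RNG safely on CoCo");
  }
  add_device_randomness(seed, sizeof(seed));
  memzero_explicit(seed, sizeof(seed));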
|
| #
4faa0e5d |
| 28-Mar-2024 |
Julian Stecklina <[email protected]> |
x86/boot: Move kernel cmdline setup earlier in the boot process (again)
When split_lock_detect=off (or similar) is specified in CONFIG_CMDLINE, its effect is lost. The flow is currently this:
setup_arch():
  -> early_cpu_init()
    -> early_identify_cpu()
      -> sld_setup()
        -> sld_state_setup()
          -> Looks for split_lock_detect in boot_command_line
  -> e820__memory_setup()
  -> Assemble final command line: boot_command_line = builtin_cmdline + boot_cmdline
  -> parse_early_param()
There were earlier attempts at fixing this in:
8d48bf8206f7 ("x86/boot: Pull up cmdline preparation and early param parsing")
later reverted in:
fbe618399854 ("Revert "x86/boot: Pull up cmdline preparation and early param parsing"")
... because parse_early_param() can't be called before e820__memory_setup().
In this patch, we just move the command line concatenation to the beginning of early_cpu_init(). This should fix sld_state_setup(), while not running into the same issues as the earlier attempt.
The order is now:
setup_arch():
  -> Assemble final command line: boot_command_line = builtin_cmdline + boot_cmdline
  -> early_cpu_init()
    -> early_identify_cpu()
      -> sld_setup()
        -> sld_state_setup()
          -> Looks for split_lock_detect in boot_command_line
  -> e820__memory_setup()
  -> parse_early_param()
Signed-off-by: Julian Stecklina <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected]
|
|
Revision tags: v6.9-rc1 |
|
| #
0f4a1e80 |
| 13-Mar-2024 |
Kevin Loughlin <[email protected]> |
x86/sev: Skip ROM range scans and validation for SEV-SNP guests
SEV-SNP requires encrypted memory to be validated before access. Because the ROM memory range is not part of the e820 table, it is not pre-validated by the BIOS. Therefore, if a SEV-SNP guest kernel wishes to access this range, the guest must first validate the range.
The current SEV-SNP code does indeed scan the ROM range during early boot and thus attempts to validate the ROM range in probe_roms(). However, this behavior is neither sufficient nor necessary for the following reasons:
* With regards to sufficiency, if EFI_CONFIG_TABLES are not enabled and CONFIG_DMI_SCAN_MACHINE_NON_EFI_FALLBACK is set, the kernel will attempt to access the memory at SMBIOS_ENTRY_POINT_SCAN_START (which falls in the ROM range) prior to validation.
For example, Project Oak Stage 0 provides a minimal guest firmware that currently meets these configuration conditions, meaning guests booting atop Oak Stage 0 firmware encounter a problematic call chain during dmi_setup() -> dmi_scan_machine() that results in a crash during boot if SEV-SNP is enabled.
* With regards to necessity, SEV-SNP guests generally read garbage (which changes across boots) from the ROM range, meaning these scans are unnecessary. The guest reads garbage because the legacy ROM range is unencrypted data but is accessed via an encrypted PMD during early boot (where the PMD is marked as encrypted due to potentially mapping actually-encrypted data in other PMD-contained ranges).
In one exceptional case, EISA probing treats the ROM range as unencrypted data, which is inconsistent with other probing.
Continuing to allow SEV-SNP guests to use garbage and to inconsistently classify ROM range encryption status can trigger undesirable behavior. For instance, if garbage bytes appear to be a valid signature, memory may be unnecessarily reserved for the ROM range. Future code or other use cases may result in more problematic (arbitrary) behavior that should be avoided.
While one solution would be to overhaul the early PMD mapping to always treat the ROM region of the PMD as unencrypted, SEV-SNP guests do not currently rely on data from the ROM region during early boot (and even if they did, they would be mostly relying on garbage data anyways).
As a simpler solution, skip the ROM range scans (and the otherwise-necessary range validation) during SEV-SNP guest early boot. The potential SEV-SNP guest crash due to lack of ROM range validation is thus avoided by simply not accessing the ROM range.
In most cases, skip the scans by overriding problematic x86_init functions during sme_early_init() to SNP-safe variants, which can be likened to x86_init overrides done for other platforms (ex: Xen); such overrides also avoid the spread of cc_platform_has() checks throughout the tree.
In the exceptional EISA case, still use cc_platform_has() for the simplest change, given (1) checks for guest type (ex: Xen domain status) are already performed here, and (2) these checks occur in a subsys initcall instead of an x86_init function.
[ bp: Massage commit message, remove "we"s. ]
Fixes: 9704c07bf9f7 ("x86/kernel: Validate ROM memory before accessing when SEV-SNP is active") Signed-off-by: Kevin Loughlin <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected]
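The override pattern, sketched (using x86_init_noop as the SNP-safe variant is an assumption; the real patch may install dedicated helpers):

  /* In sme_early_init(), roughly: */
  if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
      /* Never scan the ROM range: unvalidated, garbage-bearing memory. */
      x86_init.resources.probe_roms = x86_init_noop;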
|
| #
c90399fb |
| 22-Mar-2024 |
Thomas Gleixner <[email protected]> |
x86/cpu: Ensure that CPU info updates are propagated on UP
The boot sequence evaluates CPUID information twice:
1) During early boot
2) When finalizing the early setup right before mitigations are selected and alternatives are patched.
In both cases the evaluation is stored in boot_cpu_data, but on UP the copying of boot_cpu_data to the per CPU info of the boot CPU happens between #1 and #2. So any update which happens in #2 is never propagated to the per CPU info instance.
Consolidate the whole logic and copy boot_cpu_data right before applying alternatives, as that is the point where boot_cpu_data is in its final state and not supposed to change anymore.
This also removes the voodoo mb() from smp_prepare_cpus_common() which had absolutely no purpose.
Fixes: 71eb4893cfaf ("x86/percpu: Cure per CPU madness on UP") Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Tested-by: Guenter Roeck <[email protected]> Link: https://lore.kernel.org/r/[email protected]
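Sketched, the consolidation point (cpu_info is the real per CPU variable; placing the copy immediately before alternative_instructions() reflects the described intent, not the literal diff):

  /* boot_cpu_data is in its final state here; propagate it. */
  *this_cpu_ptr(&cpu_info) = boot_cpu_data;

  alternative_instructions();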
|
| #
edf66a3c |
| 18-Mar-2024 |
Rafael J. Wysocki <[email protected]> |
x86/cpu: Move leftover contents of topology.c to setup.c
The only useful piece of arch/x86/kernel/topology.c is the definition of arch_cpu_is_hotpluggable() that can be moved elsewhere (other architectures tend to put it into setup.c), so do that and delete the rest of the file.
Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/12422874.O9o76ZdvQC@kreacher
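The moved definition is tiny; on x86 it amounts to something like this (sketch consistent with the message):

  bool arch_cpu_is_hotpluggable(int cpu)
  {
      /* Every CPU except the boot CPU can be hotplugged. */
      return cpu > 0;
  }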
|
|
Revision tags: v6.8 |
|
| #
71eb4893 |
| 04-Mar-2024 |
Thomas Gleixner <[email protected]> |
x86/percpu: Cure per CPU madness on UP
On UP builds Sparse complains rightfully about accesses to cpu_info with per CPU accessors:
cacheinfo.c:282:30: sparse: warning: incorrect type in initializer (different address spaces)
cacheinfo.c:282:30: sparse:    expected void const [noderef] __percpu *__vpp_verify
cacheinfo.c:282:30: sparse:    got unsigned int *
The reason is that on UP builds cpu_info, which is a per CPU variable on SMP, is mapped to boot_cpu_data, which is a regular variable. There is a hideous accessor, cpu_data(), which tries to hide this, but it is not sufficient, as some places require raw accessors and it generates worse code than the regular per CPU accessors.
Waste sizeof(struct cpuinfo_x86) of memory on UP and provide the per CPU cpu_info unconditionally. This requires updating the CPU info on the boot CPU as SMP does. (Ab)use the weakly defined smp_prepare_boot_cpu() function and implement exactly that.
This allows the use of regular per CPU accessors unconditionally and paves the way to removing the cpu_data() hackery.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2 |
|
| #
a4eeb217 |
| 24-Jan-2024 |
Baoquan He <[email protected]> |
x86, crash: wrap crash dumping code into crash related ifdefs
Now that the crash code under the kernel/ folder has been split out from the kexec code, crash dumping can be separated from kexec reboot in config items on x86 with some adjustments.
Here, also change some ifdefs or IS_ENABLED() checks to more appropriate ones, e.g.:
  - #ifdef CONFIG_KEXEC_CORE -> #ifdef CONFIG_CRASH_DUMP
  - (!IS_ENABLED(CONFIG_KEXEC_CORE)) -> (!IS_ENABLED(CONFIG_CRASH_RESERVE))
[[email protected]: don't nest CONFIG_CRASH_DUMP ifdef inside CONFIG_KEXEC_CORE ifdef scope] Link: https://lore.kernel.org/all/SN6PR02MB4157931105FA68D72E3D3DB8D47B2@SN6PR02MB4157.namprd02.prod.outlook.com/T/#u Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Baoquan He <[email protected]> Cc: Al Viro <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Pingfan Liu <[email protected]> Cc: Klara Modin <[email protected]> Cc: Michael Kelley <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Yang Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
|
| #
7c0edad3 |
| 13-Feb-2024 |
Thomas Gleixner <[email protected]> |
x86/cpu/topology: Rework possible CPU management
Managing possible CPUs is an unreadable and incomprehensible maze. Aside from that, it is backwards because it applies command line limits after registering all APICs.
Rewrite it so that it:
- Applies the command line limits upfront so that only the allowed amount of APIC IDs can be registered.
- Applies any late restrictions in an understandable way
- Uses simple min_t() calculations which are trivial to follow.
- Provides a separate function for resetting to UP mode late in the bringup process.
Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Michael Kelley <[email protected]> Tested-by: Sohil Mehta <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
de6aec24 |
| 13-Feb-2024 |
Thomas Gleixner <[email protected]> |
x86/mm/numa: Move early mptable evaluation into common code
There is no reason to have the early mptable evaluation conditionally invoked only from the AMD numa topology code.
Make it explicit and invoke it from setup_arch() right after the corresponding ACPI init call. Remove the pointless wrapper and invoke x86_init::mpparse::early_parse_smp_config() directly.
Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Michael Kelley <[email protected]> Tested-by: Sohil Mehta <[email protected]> Link: https://lore.kernel.org/r/[email protected]
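In setup_arch() terms (a sketch; that early_acpi_boot_init() is "the corresponding ACPI init call" is an assumption):

  early_acpi_boot_init();

  /* Explicit mptable fallback, no longer hidden in the AMD NUMA code: */
  x86_init.mpparse.early_parse_smp_config();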
|
| #
dcb76008 |
| 13-Feb-2024 |
Thomas Gleixner <[email protected]> |
x86/mpparse: Switch to new init callbacks
Now that all platforms have the new split SMP configuration callbacks set up, flip the switch and remove the old callback pointer and mop up the platform code.
Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Michael Kelley <[email protected]> Tested-by: Sohil Mehta <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
5faf8ec7 |
| 13-Feb-2024 |
Thomas Gleixner <[email protected]> |
x86/dtb: Rename x86_dtb_init()
x86_dtb_init() is a misnomer; it really should be used as an SMP configuration parser which is selected by the platform via x86_init::mpparse::parse_smp_config().
Rename it to x86_dtb_parse_smp_config() in preparation for that.
Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Michael Kelley <[email protected]> Tested-by: Sohil Mehta <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|