| 6f5c71cc | 14-Dec-2024 |
Kai Huang <[email protected]> |
x86/virt/tdx: Require the module to assert it has the NO_RBP_MOD mitigation
Old TDX modules can clobber RBP in the TDH.VP.ENTER SEAMCALL. However, RBP is used as the frame pointer in the x86_64 calling convention, and clobbering it could make the stack impossible to unwind if a non-maskable exception (NMI, #MC, etc.) happens in that gap.
More recent TDX modules introduced a "NO_RBP_MOD" feature that stops the module from clobbering RBP. KVM will need to use the TDH.VP.ENTER SEAMCALL to run TDX guests, and running TDX guests without this feature is not safe. To prevent that, don't initialize the TDX module if this feature is not supported [1].
Note the bit definitions of TDX_FEATURES0 are not auto-generated in tdx_global_metadata.h. Manually define a macro for it in "tdx.h".
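The check described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the bit position used for NO_RBP_MOD, the structure name and the helper name check_features() are assumptions made for this sketch and are not taken from the commit text.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: position of the NO_RBP_MOD bit within TDX_FEATURES0. */
#define TDX_FEATURES0_NO_RBP_MOD (UINT64_C(1) << 18)

/* Hypothetical holder for the "features" part of the global metadata. */
struct tdx_sys_info_features {
	uint64_t tdx_features0;
};

/*
 * Return 0 when the module reports NO_RBP_MOD, or -EINVAL when it does
 * not, so that the caller can abort module initialization early.
 */
static int check_features(const struct tdx_sys_info_features *features)
{
	if (!(features->tdx_features0 & TDX_FEATURES0_NO_RBP_MOD))
		return -EINVAL;

	return 0;
}
```

Failing at this point means no later code path can reach TDH.VP.ENTER on a module that may clobber RBP.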
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/[email protected]/ [1]
Link: https://lore.kernel.org/all/76ae5025502c84d799e3a56a6fc4f69a82da8f93.1734188033.git.kai.huang%40intel.com
|
| fae43b24 | 14-Dec-2024 |
Kai Huang <[email protected]> |
x86/virt/tdx: Switch to use auto-generated global metadata reading code
Continue the process to have a centralized solution for TDX global metadata reading. Now that the new autogenerated solution is ready for use, switch to it and remove the old one.
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/all/fc025d1e13b92900323f47cfe9aac3157bf08ee7.1734188033.git.kai.huang%40intel.com
|
| 6bfb77f4 | 14-Dec-2024 |
Kai Huang <[email protected]> |
x86/virt/tdx: Use dedicated struct members for PAMT entry sizes
Currently, 'struct tdx_sys_info_tdmr', which includes the TDMR related fields, defines the PAMT entry sizes for the TDX supported page sizes (4KB, 2MB and 1GB) as an array:
	struct tdx_sys_info_tdmr {
		...
		u16 pamt_entry_sizes[TDX_PS_NR];
	};
PAMT entry sizes are needed when allocating PAMTs for each TDMR. Using the array to contain PAMT entry sizes reduces the number of arguments that need to be passed when calling tdmr_set_up_pamt(). It also makes the code pattern like below clearer:
	for (pgsz = TDX_PS_4K; pgsz < TDX_PS_NR; pgsz++) {
		pamt_size[pgsz] = tdmr_get_pamt_sz(tdmr, pgsz,
				pamt_entry_size[pgsz]);
		tdmr_pamt_size += pamt_size[pgsz];
	}
However, the auto-generated metadata reading code generates a structure member for each field. The 'global_metadata.json' has a dedicated field for each PAMT entry size, and the new 'struct tdx_sys_info_tdmr' looks like:
	struct tdx_sys_info_tdmr {
		...
		u16 pamt_4k_entry_size;
		u16 pamt_2m_entry_size;
		u16 pamt_1g_entry_size;
	};
Prepare to use the autogenerated code by making the existing 'struct tdx_sys_info_tdmr' look like the generated one. When passing to tdmrs_set_up_pamt_all(), build a local array of PAMT entry sizes from the structure so the code to allocate PAMTs can stay the same.
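A minimal sketch of the "build a local array" step, assuming simplified stand-ins for the kernel types: the enum values mirror the TDX_PS_* names used above, while the helper name fill_pamt_entry_sizes() is hypothetical and exists only for this illustration.

```c
#include <stdint.h>

/* Page-size indices, mirroring the TDX_PS_* names used in the message. */
enum { TDX_PS_4K, TDX_PS_2M, TDX_PS_1G, TDX_PS_NR };

/* Reduced version of the generated structure: one member per page size. */
struct tdx_sys_info_tdmr {
	uint16_t pamt_4k_entry_size;
	uint16_t pamt_2m_entry_size;
	uint16_t pamt_1g_entry_size;
};

/*
 * Hypothetical helper: copy the dedicated members into an array indexed
 * by page size, so that loops over TDX_PS_4K..TDX_PS_NR stay unchanged.
 */
static void fill_pamt_entry_sizes(const struct tdx_sys_info_tdmr *sysinfo_tdmr,
				  uint16_t pamt_entry_size[TDX_PS_NR])
{
	pamt_entry_size[TDX_PS_4K] = sysinfo_tdmr->pamt_4k_entry_size;
	pamt_entry_size[TDX_PS_2M] = sysinfo_tdmr->pamt_2m_entry_size;
	pamt_entry_size[TDX_PS_1G] = sysinfo_tdmr->pamt_1g_entry_size;
}
```

The array lives only in the caller, so the structure can match the generated layout while the allocation loop keeps indexing by page size.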
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Nikolay Borisov <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/all/ccf46f3dacb01be1fb8309592616d443ac17caba.1734188033.git.kai.huang%40intel.com
|
| 04a7bc73 | 14-Dec-2024 |
Paolo Bonzini <[email protected]> |
x86/virt/tdx: Use auto-generated code to read global metadata
The TDX module provides a set of "Global Metadata Fields". They report things like the TDX module version, supported features, and fields related to creating and running TDX guests.
Currently the kernel only reads "TD Memory Region" (TDMR) related fields for module initialization. More global metadata fields will need to be read in the future:
- Supported features ("TDX_FEATURES0"), to fail module initialization when the module doesn't support the "not clobbering host RBP when exiting from TDX guest" feature [1].
- KVM TDX baseline support and other features like TDX Connect will need to read more.
The current global metadata reading code has limitations (e.g., it only has a primitive helper to read metadata field with 16-bit element size, while TDX supports 8/16/32/64 bits metadata element sizes). It needs tweaks in order to read more metadata fields.
But even with the tweaks, when new code is added to read a new field, the reviewers will still need to review against the spec to make sure the new code doesn't screw up things like using the wrong metadata field ID (each metadata field is associated with a unique field ID, which is a TDX-defined u64 constant) etc.
TDX documents all global metadata fields in a 'global_metadata.json' file as part of TDX spec [2]. JSON format is machine readable. Instead of tweaking the metadata reading code, use a script to generate the code so that:
1) Using the generated C is simple.
2) Adding a field is simple, e.g., the script just pulls the field ID out of the JSON for a given field thus no manual review is needed.
Specifically, to match the layout of 'struct tdx_sys_info' and its sub-structures, the script uses a table in which each entry contains the name of a sub-structure (which reflects the "Class") and the "Field Name" of all its fields, and auto-generates:
1) The 'struct tdx_sys_info' and all 'struct tdx_sys_info_xx' sub-structures in 'tdx_global_metadata.h'.
2) The main function 'get_tdx_sys_info()', which reads all metadata into 'struct tdx_sys_info', and the 'get_tdx_sys_info_xx()' functions, which read 'struct tdx_sys_info_xx', in 'tdx_global_metadata.c'.
Using the generated C is simple:
1) include "tdx_global_metadata.h" to the local "tdx.h";
2) explicitly include "tdx_global_metadata.c" to the local "tdx.c" after the read_sys_metadata_field() primitive (which is a wrapper of TDH.SYS.RD SEAMCALL to read global metadata).
Adding a field is also simple:
1) just add the new field to an existing structure, or add it with a new structure;
2) re-run the script to generate the new code;
3) update the existing tdx_global_metadata.{hc} with the new ones.
For now, use the auto-generated code to read the TDMR related fields and the aforesaid metadata field "TDX_FEATURES0".
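To give a feel for the shape of such generated code, here is a rough, self-contained sketch. It is not the script's actual output: the field IDs are placeholder values, the structure is reduced to two members, and read_sys_metadata_field() is replaced by a stub that returns canned values instead of issuing the TDH.SYS.RD SEAMCALL.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder field IDs; the real ones come from global_metadata.json. */
#define FIELD_ID_MAX_TDMRS             UINT64_C(0x1001)
#define FIELD_ID_MAX_RESERVED_PER_TDMR UINT64_C(0x1002)

/* Stub for the TDH.SYS.RD wrapper: returns canned values for known IDs. */
static int read_sys_metadata_field(uint64_t field_id, uint64_t *data)
{
	switch (field_id) {
	case FIELD_ID_MAX_TDMRS:
		*data = 64;
		return 0;
	case FIELD_ID_MAX_RESERVED_PER_TDMR:
		*data = 16;
		return 0;
	default:
		return -EIO;
	}
}

/* Reduced sub-structure for one metadata "Class". */
struct tdx_sys_info_tdmr {
	uint16_t max_tdmrs;
	uint16_t max_reserved_per_tdmr;
};

/* One reader per sub-structure: read each field and stop on first error. */
static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
{
	uint64_t val;
	int ret;

	ret = read_sys_metadata_field(FIELD_ID_MAX_TDMRS, &val);
	if (ret)
		return ret;
	sysinfo_tdmr->max_tdmrs = (uint16_t)val;

	ret = read_sys_metadata_field(FIELD_ID_MAX_RESERVED_PER_TDMR, &val);
	if (ret)
		return ret;
	sysinfo_tdmr->max_reserved_per_tdmr = (uint16_t)val;

	return 0;
}
```

Because each member maps to exactly one field ID taken from the JSON, adding a field only changes the table that drives the script, not hand-written reading code.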
The tdx_global_metadata.{hc} can be generated by running below:
	#python tdx_global_metadata.py global_metadata.json \
		tdx_global_metadata.h tdx_global_metadata.c
... where 'global_metadata.json' can be fetched from [2] and 'tdx_global_metadata.py' can be found at [3].
Co-developed-by: Kai Huang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/[email protected]/ [1]
Link: https://cdrdv2.intel.com/v1/dl/getContent/795381 [2]
Link: https://lore.kernel.org/[email protected]/ [3]
Link: https://lore.kernel.org/all/cbe3f12b1e5479399b53f4873f2ff783d9fc669b.1734188033.git.kai.huang%40intel.com
|
| c4e0862a | 14-Dec-2024 |
Kai Huang <[email protected]> |
x86/virt/tdx: Start to track all global metadata in one structure
The TDX module provides a set of "Global Metadata Fields". They report things like the TDX module version, supported features, and fields related to creating and running TDX guests.
Today the kernel only reads "TD Memory Region" (TDMR) related fields for module initialization. KVM will need to read additional metadata fields to run TDX guests. Move towards having the TDX host core-kernel provide a centralized, canonical, and immutable structure for the global metadata that comes out from the TDX module for all kernel components to use.
As the first step, introduce a new 'struct tdx_sys_info' to track all global metadata fields.
TDX categorizes global metadata fields into different "Classes". E.g., the TDMR related fields are under class "TDMR Info". Instead of making 'struct tdx_sys_info' a plain structure to contain all metadata fields, organize them in smaller structures based on the "Class".
This allows those metadata fields to be used at a finer granularity, which makes the code clearer. E.g., construct_tdmrs() can just take the structure which contains the "TDMR Info" metadata fields.
Add get_tdx_sys_info() as the placeholder to read all metadata fields. Have it only call get_tdx_sys_info_tdmr() to read TDMR related fields for now.
Place get_tdx_sys_info() as the first step of init_tdx_module() to enable early prerequisite checks on the metadata to support early module initialization abort. This results in moving get_tdx_sys_info_tdmr() to be before build_tdx_memlist(), but this is fine because there are no dependencies between these two functions.
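The structure and call ordering described above might look roughly like the following. The function names follow the commit message, but every body is a simplified stand-in written for this sketch: the metadata read is stubbed with a fixed value and construct_tdmrs() only checks its input.

```c
#include <errno.h>
#include <stdint.h>

/* Sub-structure for the "TDMR Info" class (reduced to one member). */
struct tdx_sys_info_tdmr {
	uint16_t max_tdmrs;
};

/* Top-level structure: one sub-structure per metadata class. */
struct tdx_sys_info {
	struct tdx_sys_info_tdmr tdmr;
};

/* Stub: a real implementation would read the field from the TDX module. */
static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
{
	sysinfo_tdmr->max_tdmrs = 64;
	return 0;
}

/* Single entry point for reading metadata; only TDMR fields for now. */
static int get_tdx_sys_info(struct tdx_sys_info *sysinfo)
{
	return get_tdx_sys_info_tdmr(&sysinfo->tdmr);
}

/* A consumer takes only the class it needs, not the whole structure. */
static int construct_tdmrs(const struct tdx_sys_info_tdmr *sysinfo_tdmr)
{
	return sysinfo_tdmr->max_tdmrs ? 0 : -EINVAL;
}

static int init_tdx_module(void)
{
	struct tdx_sys_info sysinfo;
	int ret;

	/* First step: read metadata so prerequisite checks can abort early. */
	ret = get_tdx_sys_info(&sysinfo);
	if (ret)
		return ret;

	/* build_tdx_memlist() and the remaining steps would follow here. */
	return construct_tdmrs(&sysinfo.tdmr);
}
```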
Signed-off-by: Kai Huang <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Link: https://lore.kernel.org/all/bfacb4e90527cf79d4be0d1753e6f318eea21118.1734188033.git.kai.huang%40intel.com
|
| 8ae3291f | 13-Dec-2024 |
Tom Lendacky <[email protected]> |
x86/sev: Add full support for a segmented RMP table
A segmented RMP table allows for improved locality of reference between the memory protected by the RMP and the RMP entries themselves.
Add support to detect and initialize a segmented RMP table with multiple segments as configured by the system BIOS. While the RMPREAD instruction will be used to read an RMP entry in a segmented RMP, initialization and debugging capabilities will require the mapping of the segments.
The RMP_CFG MSR indicates if segmented RMP support is enabled and, if enabled, the amount of memory that an RMP segment covers. When segmented RMP support is enabled, the RMP_BASE MSR points to the start of the RMP bookkeeping area, which is 16K in size. The RMP Segment Table (RST) is located immediately after the bookkeeping area and is 4K in size. The RST contains up to 512 8-byte entries that identify the location of the RMP segment and amount of memory mapped by the segment (which must be less than or equal to the configured segment size). The physical address that is covered by a segment is based on the segment size and the index of the segment in the RST. The RMP entry for a physical address is based on the offset within the segment.
For example, if the segment size is 64GB (0x1000000000 or 1 << 36), then physical address 0x9000800000 is RST entry 9 (0x9000800000 >> 36) and RST entry 9 covers physical memory 0x9000000000 to 0x9FFFFFFFFF.
The RMP entry index within the RMP segment is the physical address AND-ed with the segment mask, 64GB - 1 (0xFFFFFFFFF), and then right-shifted 12 bits or PHYS_PFN(0x9000800000 & 0xFFFFFFFFF), which is 0x800.
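The arithmetic in this example can be checked with a short C sketch. The helper names rst_index() and rmp_entry_index() are hypothetical and chosen for illustration; the segment size is passed as a shift (36 for 64GB), and a 4K page size is assumed, as in the text.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* 4K pages, as in the example above */

/* Index of the RST entry (segment) that covers a physical address. */
static uint64_t rst_index(uint64_t paddr, unsigned int segment_shift)
{
	return paddr >> segment_shift;
}

/* Index of the RMP entry within that segment. */
static uint64_t rmp_entry_index(uint64_t paddr, unsigned int segment_shift)
{
	uint64_t segment_mask = (UINT64_C(1) << segment_shift) - 1;

	/* Offset within the segment, converted to a page frame number. */
	return (paddr & segment_mask) >> PAGE_SHIFT;
}
```

With a segment shift of 36, rst_index(0x9000800000, 36) gives 9 and rmp_entry_index(0x9000800000, 36) gives 0x800, matching the values worked out above.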
CPUID 0x80000025_EBX[9:0] describes the number of RMP segments that can be cached by the hardware. Additionally, if CPUID 0x80000025_EBX[10] is set, then the number of actual RMP segments defined cannot exceed the number of RMP segments that can be cached, so that number can be used as a maximum RST index.
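Decoding these two fields from an already-read EBX value could look like the sketch below. The function names are made up for this illustration, and the CPUID instruction itself is deliberately left out so the snippet stays portable.

```c
#include <stdbool.h>
#include <stdint.h>

/* EBX[9:0]: number of RMP segments the hardware can cache. */
static unsigned int rmp_cacheable_segments(uint32_t ebx)
{
	return ebx & 0x3ff;
}

/*
 * EBX[10]: when set, the number of defined RMP segments cannot exceed
 * the number of segments that can be cached.
 */
static bool rmp_segment_limit_enforced(uint32_t ebx)
{
	return (ebx >> 10) & 1;
}
```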
[ bp: Unify printk hex format specifiers. ]
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Nikunj A Dadhania <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Link: https://lore.kernel.org/r/02afd0ffd097a19cb6e5fb1bb76eb110496c5b11.1734101742.git.thomas.lendacky@amd.com
|
| 0f14af0d | 02-Dec-2024 |
Tom Lendacky <[email protected]> |
x86/sev: Treat the contiguous RMP table as a single RMP segment
In preparation for support of a segmented RMP table, treat the contiguous RMP table as a segmented RMP table with a single segment covering all of memory. By treating a contiguous RMP table as a single segment, much of the code that initializes and accesses the RMP can be re-used.
Segmented RMP tables can have up to 512 segment entries. Each segment will have metadata associated with it to identify the segment location, the segment size, etc. The segment data and the physical address are used to determine the index of the segment within the table and then the RMP entry within the segment. For an actual segmented RMP table environment, much of the segment information will come from a configuration MSR. For the contiguous RMP, though, much of the information will be statically defined.
[ bp: Touchups, explain array_index_nospec() usage. ]
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Nikunj A Dadhania <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Link: https://lore.kernel.org/r/8c40fbc9c5217f0d79b37cf861eff03ab0330bef.1733172653.git.thomas.lendacky@amd.com
|
| ac517965 | 02-Dec-2024 |
Tom Lendacky <[email protected]> |
x86/sev: Map only the RMP table entries instead of the full RMP range
In preparation for support of a segmented RMP table, map only the RMP table entries. The RMP bookkeeping area is only ever accessed when first enabling SNP and does not need to remain mapped. To accomplish this, split the initialization of the RMP bookkeeping area and the initialization of the RMP entry area. The RMP bookkeeping area will be mapped only while it is being initialized.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Nikunj A Dadhania <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Reviewed-by: Ashish Kalra <[email protected]>
Link: https://lore.kernel.org/r/22f179998d319834f49c13a8c01187fbf0fd308d.1733172653.git.thomas.lendacky@amd.com
|
| e2f3d40d | 02-Dec-2024 |
Tom Lendacky <[email protected]> |
x86/sev: Move the SNP probe routine out of the way
To make patch review easier for the segmented RMP support, move the SNP probe function out from in between the initialization-related routines.
No functional change.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Nikunj A Dadhania <[email protected]>
Reviewed-by: Neeraj Upadhyay <[email protected]>
Link: https://lore.kernel.org/r/6c2975bbf132d567dd12e1435be1d18c0bf9131c.1733172653.git.thomas.lendacky@amd.com
|
| 8ef97958 | 26-Jan-2024 |
Ashish Kalra <[email protected]> |
crypto: ccp: Add panic notifier for SEV/SNP firmware shutdown on kdump
Add a kdump safe version of sev_firmware_shutdown() and register it as a crash_kexec_post_notifier so it will be invoked during panic/crash to do SEV/SNP shutdown. This is required for transitioning all IOMMU pages to reclaim/hypervisor state, otherwise re-init of IOMMU pages during crashdump kernel boot fails and panics the crashdump kernel.
This panic notifier runs in atomic context, so it must not acquire any locks or mutexes, and it polls for PSP command completion instead of depending on the PSP command completion interrupt.
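The polling approach can be sketched generically as below. Everything here is an assumption made for illustration: the completion bit, the register accessor type and the fake device are not taken from the driver, and a real implementation would read a hardware register and insert a short delay between polls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit in the command/response register that signals completion. */
#define CMD_DONE_BIT (UINT32_C(1) << 31)

/* Hypothetical accessor type for reading the command/response register. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Busy-poll for completion without sleeping or taking locks, which makes
 * it usable in atomic context. Gives up after max_polls register reads.
 */
static bool poll_cmd_done(read_reg_fn read_reg, void *ctx,
			  unsigned int max_polls)
{
	while (max_polls--) {
		if (read_reg(ctx) & CMD_DONE_BIT)
			return true;
		/* A real implementation would delay briefly here. */
	}

	return false;
}

/* Fake device for demonstration: reports done after a number of reads. */
struct fake_psp {
	unsigned int reads_until_done;
};

static uint32_t fake_read_cmdresp(void *ctx)
{
	struct fake_psp *psp = ctx;

	if (psp->reads_until_done == 0)
		return CMD_DONE_BIT;

	psp->reads_until_done--;
	return 0;
}
```

Bounding the loop matters in a panic path: if the firmware never responds, the notifier gives up instead of hanging the crash kernel handoff.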
[ mdr: Remove use of "we" in comments. ]
Signed-off-by: Ashish Kalra <[email protected]>
Signed-off-by: Michael Roth <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|