| 9b8e73cd | 07-Mar-2025 |
Dave Jiang <[email protected]> |
cxl: Move cxl feature command structs to user header
In preparation for cxl fwctl enabling, move data structures related to cxl feature commands to a user header file.
Reviewed-by; Jonathan Cameron
cxl: Move cxl feature command structs to user header
In preparation for cxl fwctl enabling, move data structures related to cxl feature commands to a user header file.
Reviewed-by; Jonathan Cameron <[email protected]>
Link: https://patch.msgid.link/r/[email protected] Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Li Ming <[email protected]> Signed-off-by: Dave Jiang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
show more ...
|
| 36f257e3 | 10-Mar-2025 |
Smita Koralahalli <[email protected]> |
acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors
When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol er
acpi/ghes, cxl/pci: Process CXL CPER Protocol Errors
When PCIe AER is in FW-First, OS should process CXL Protocol errors from CPER records. Introduce support for handling and logging CXL Protocol errors.
The defined trace events cxl_aer_uncorrectable_error and cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them to trace FW-First Protocol errors.
Since the CXL code is required to be called from process context and GHES is in interrupt context, use workqueues for processing.
Similar to CXL CPER event handling, use kfifo to handle errors as it simplifies queue processing by providing lock free fifo operations.
Add the ability for the CXL sub-system to register a workqueue to process CXL CPER protocol errors.
[DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()]
Signed-off-by: Smita Koralahalli <[email protected]> Reviewed-by: Li Ming <[email protected]> Reviewed-by: Alison Schofield <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Tony Luck <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|
| cbbca60a | 20-Feb-2025 |
Dave Jiang <[email protected]> |
cxl: Enumerate feature commands
Add feature commands enumeration code in order to detect and enumerate the 3 feature related commands "get supported features", "get feature", and "set feature". The
cxl: Enumerate feature commands
Add feature commands enumeration code in order to detect and enumerate the 3 feature related commands "get supported features", "get feature", and "set feature". The enumeration will help determine whether the driver can issue any of the 3 commands to the device.
Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Li Ming <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Tested-by: Shiju Jose <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|
| 315c2f0b | 23-Jan-2025 |
Smita Koralahalli <[email protected]> |
acpi/ghes, cper: Recognize and cache CXL Protocol errors
Add support in GHES to detect and process CXL CPER Protocol errors, as defined in UEFI v2.10, section N.2.13.
Define struct cxl_cper_prot_er
acpi/ghes, cper: Recognize and cache CXL Protocol errors
Add support in GHES to detect and process CXL CPER Protocol errors, as defined in UEFI v2.10, section N.2.13.
Define struct cxl_cper_prot_err_work_data to cache CXL protocol error information, including RAS capabilities and severity, for further handling.
These cached CXL CPER records will later be processed by workqueues within the CXL subsystem.
Signed-off-by: Smita Koralahalli <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Gregory Price <[email protected]> Reviewed-by: Dan Williams <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|
| 4c6e20eb | 11-Jan-2025 |
Shiju Jose <[email protected]> |
cxl/events: Update Memory Module Event Record to CXL spec rev 3.1
CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event Record has updated with following new fields and new info for Devic
cxl/events: Update Memory Module Event Record to CXL spec rev 3.1
CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event Record has updated with following new fields and new info for Device Event Type and Device Health Information fields. 1. Validity Flags 2. Component Identifier 3. Device Event Sub-Type
Update the Memory Module event record and Memory Module trace event for the above spec changes. The new fields are inserted in logical places.
Example trace print of cxl_memory_module trace event,
cxl_memory_module: memdev=mem3 host=0000:0f:00.0 serial=3 log=Fatal : \ time=371709344709 uuid=fe927475-dd59-4339-a586-79bab113b774 len=128 \ flags='0x1' handle=2 related_handle=0 maint_op_class=0 \ maint_op_sub_class=0 : event_type='Temperature Change' \ event_sub_type='Unsupported Config Data' \ health_status='MAINTENANCE_NEEDED|REPLACEMENT_NEEDED' \ media_status='All Data Loss in Event of Power Loss' as_life_used=0x3 \ as_dev_temp=Normal as_cor_vol_err_cnt=Normal as_cor_per_err_cnt=Normal \ life_used=8 device_temp=3 dirty_shutdown_cnt=33 cor_vol_err_cnt=25 \ cor_per_err_cnt=45 validity_flags='COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='Resource ID' \ pldm_entity_id=0x00 pldm_resource_id=fc d2 7e 2f
Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Shiju Jose <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|
| 24ec41f7 | 11-Jan-2025 |
Shiju Jose <[email protected]> |
cxl/events: Update DRAM Event Record to CXL spec rev 3.1
CXL spec 3.1 section 8.2.9.2.1.2 Table 8-46, DRAM Event Record has updated with following new fields and new types for Memory Event Type, Tra
cxl/events: Update DRAM Event Record to CXL spec rev 3.1
CXL spec 3.1 section 8.2.9.2.1.2 Table 8-46, DRAM Event Record has updated with following new fields and new types for Memory Event Type, Transaction Type and Validity Flags fields. 1. Component Identifier 2. Sub-channel 3. Advanced Programmable Corrected Memory Error Threshold Event Flags 4. Corrected Memory Error Count at Event 5. Memory Event Sub-Type
Update DRAM events record and DRAM trace event for the above spec changes. The new fields are inserted in logical places. Includes trivial consistency of white space improvements.
Example trace print of cxl_dram trace event,
cxl_dram: memdev=mem0 host=0000:0f:00.0 serial=3 log=Informational : \ time=54799339519 uuid=601dcbb3-9c06-4eab-b8af-4e9bfb5c9624 len=128 \ flags='0x1' handle=1 related_handle=0 maint_op_class=1 \ maint_op_sub_class=3 : dpa=18680 dpa_flags='' \ descriptor='UNCORRECTABLE_EVENT|THRESHOLD_EVENT' type='Data Path Error' \ sub_type='Media Link CRC Error' transaction_type='Internal Media Scrub' \ channel=3 rank=17 nibble_mask=3b00b2 bank_group=7 bank=11 row=2 \ column=77 cor_mask=21 00 00 00 00 00 00 00 2c 00 00 00 00 00 00 00 37 00 \ 00 00 00 00 00 00 42 00 00 00 00 00 00 00 validity_flags='CHANNEL|RANK|NIBBLE|\ BANK GROUP|BANK|ROW|COLUMN|CORRECTION MASK|COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=01 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='PLDM Entity ID' pldm_entity_id=74 c5 08 9a 1a 0b \ pldm_resource_id=0x00 hpa=ffffffffffffffff region= \ region_uuid=00000000-0000-0000-0000-000000000000 sub_channel=5 \ cme_threshold_ev_flags='Corrected Memory Errors in Multiple Media Components|\ Exceeded Programmable Threshold' cvme_count=148
Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Signed-off-by: Shiju Jose <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|
| ae834131 | 11-Jan-2025 |
Shiju Jose <[email protected]> |
cxl/events: Update General Media Event Record to CXL spec rev 3.1
CXL spec rev 3.1 section 8.2.9.2.1.1 Table 8-45, General Media Event Record has updated with following new fields and new types for
cxl/events: Update General Media Event Record to CXL spec rev 3.1
CXL spec rev 3.1 section 8.2.9.2.1.1 Table 8-45, General Media Event Record has updated with following new fields and new types for Memory Event Type and Transaction Type fields. 1. Advanced Programmable Corrected Memory Error Threshold Event Flags 2. Corrected Memory Error Count at Event 3. Memory Event Sub-Type
The format of component identifier has changed (CXL spec 3.1 section 8.2.9.2.1 Table 8-44).
Update the general media event record and general media trace event for the above spec changes. The new fields are inserted in logical places.
Example trace log of cxl_general_media trace event,
cxl_general_media: memdev=mem0 host=0000:0f:00.0 serial=3 log=Fatal : \ time=156831237413 uuid=fbcd0a77-c260-417f-85a9-088b1621eba6 len=128 \ flags='0x1' handle=1 related_handle=0 maint_op_class=2 \ maint_op_sub_class=4 : dpa=30d40 dpa_flags='' \ descriptor='UNCORRECTABLE_EVENT|THRESHOLD_EVENT|POISON_LIST_OVERFLOW' \ type='TE State Violation' sub_type='Media Link Command Training Error' \ transaction_type='Host Inject Poison' channel=3 rank=33 device=5 \ validity_flags='CHANNEL|RANK|DEVICE|COMPONENT|COMPONENT PLDM FORMAT' \ comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \ comp_id_pldm_valid_flags='PLDM Entity ID | Resource ID' \ pldm_entity_id=74 c5 08 9a 1a 0b pldm_resource_id=fc d2 7e 2f \ hpa=ffffffffffffffff region= \ region_uuid=00000000-0000-0000-0000-000000000000 \ cme_threshold_ev_flags='Corrected Memory Errors in Multiple Media \ Components|Exceeded Programmable Threshold' cme_count=120
Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Shiju Jose <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Dave Jiang <[email protected]>
show more ...
|