|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6 |
|
| #
05d50ea3 |
| 04-Mar-2025 |
Tao Zhou <[email protected]> |
drm/amdgpu: format old RAS eeprom data into V3 version
Clear old data and save it in V3 format.
v2: only format eeprom data for new ASICs.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by:
drm/amdgpu: format old RAS eeprom data into V3 version
Clear old data and save it in V3 format.
v2: only format eeprom data for new ASICs.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc5 |
|
| #
a8f921a1 |
| 24-Feb-2025 |
ganglxie <[email protected]> |
drm/amdgpu: Change page/record number calculation based on nps
save only one record to save eeprom space,and bad_page_num = pa_rec_num + mca_rec_num*16
Signed-off-by: ganglxie <[email protected]> Re
drm/amdgpu: Change page/record number calculation based on nps
save only one record to save eeprom space,and bad_page_num = pa_rec_num + mca_rec_num*16
Signed-off-by: ganglxie <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1 |
|
| #
ae756cd8 |
| 29-Nov-2024 |
Tao Zhou <[email protected]> |
drm/amdgpu: correct the calculation of RAS bad page
After the introduction of NPS RAS, one bad page record on eeprom may be related to 1 or 16 bad pages, so the bad page record and bad page are two
drm/amdgpu: correct the calculation of RAS bad page
After the introduction of NPS RAS, one bad page record on eeprom may be related to 1 or 16 bad pages, so the bad page record and bad page are two different concepts, define a new variable to store bad page number.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
1f06e7f3 |
| 28-Nov-2024 |
Tao Zhou <[email protected]> |
drm/amdgpu: split ras_eeprom_init into init and check functions
Init function is for ras table header read and check function is responsible for the validation of the header. Call them in different
drm/amdgpu: split ras_eeprom_init into init and check functions
Init function is for ras table header read and check function is responsible for the validation of the header. Call them in different stages.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4 |
|
| #
772df3df |
| 18-Oct-2024 |
Tao Zhou <[email protected]> |
drm/amdgpu: add flag to indicate the type of RAS eeprom record
One UMC MCA address could map to multiply physical address (PA):
AMDGPU_RAS_EEPROM_REC_PA: one record store one PA AMDGPU_RAS_EEPROM_R
drm/amdgpu: add flag to indicate the type of RAS eeprom record
One UMC MCA address could map to multiply physical address (PA):
AMDGPU_RAS_EEPROM_REC_PA: one record store one PA AMDGPU_RAS_EEPROM_REC_MCA: one record store one MCA address, PA is not cared about
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1 |
|
| #
b95fa494 |
| 23-May-2024 |
Tao Zhou <[email protected]> |
drm/amdgpu: add RAS is_rma flag
Set the flag to true if bad page number reaches threshold.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-
drm/amdgpu: add RAS is_rma flag
Set the flag to true if bad page number reaches threshold.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5 |
|
| #
0bc3137b |
| 01-Jun-2023 |
Stanley.Yang <[email protected]> |
drm/amdgpu: Set EEPROM ras info
Set EEPROM ras info: rma status, health percent and bad page threshold.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <Hawking.Zhang@
drm/amdgpu: Set EEPROM ras info
Set EEPROM ras info: rma status, health percent and bad page threshold.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
7f599fed |
| 31-May-2023 |
Stanley.Yang <[email protected]> |
drm/amdgpu: Add support EEPROM table v2.1
Add ras info to EEPROM table, app can analyse device ECC status without GPU driver through EEPROM table ras info.
Signed-off-by: Stanley.Yang <Stanley.Yang
drm/amdgpu: Add support EEPROM table v2.1
Add ras info to EEPROM table, app can analyse device ECC status without GPU driver through EEPROM table ras info.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
65183fae |
| 30-May-2023 |
Stanley.Yang <[email protected]> |
drm/amdgpu: Add RAS table v2.1 macro definition
Add RAS EEPROM table version 2.1 macro definition.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]
drm/amdgpu: Add RAS table v2.1 macro definition
Add RAS EEPROM table version 2.1 macro definition.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
71c79a19 |
| 30-May-2023 |
Stanley.Yang <[email protected]> |
drm/amdgpu: Rename ras table version
Rename RAS_TABLE_VER to RAS_TABLE_VER_V1, move RAS_TABLE_VER_V1 from amdgpu_ras_eeprom.c to amdgpu_ras_eeprom.h.
Signed-off-by: Stanley.Yang <[email protected]
drm/amdgpu: Rename ras table version
Rename RAS_TABLE_VER to RAS_TABLE_VER_V1, move RAS_TABLE_VER_V1 from amdgpu_ras_eeprom.c to amdgpu_ras_eeprom.h.
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7 |
|
| #
69691c82 |
| 03-Mar-2022 |
Stanley.Yang <[email protected]> |
drm/amdgpu: message smu to update bad channel info
It should notice SMU to update bad channel info when detected uncorrectable error in UMC block
Signed-off-by: Stanley.Yang <[email protected]>
drm/amdgpu: message smu to update bad channel info
It should notice SMU to update bad channel info when detected uncorrectable error in UMC block
Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1 |
|
| #
6cd1f9b4 |
| 09-Sep-2021 |
Michel Dänzer <[email protected]> |
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation units. The latter would be more involved in this case, so just drop the inline declaration for now.
Fixes compile failure building for ppc64le on RHEL 8:
In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32, from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available 90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here 1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)" Reviewed-by: Lyude Paul <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
114518ff |
| 09-Sep-2021 |
Michel Dänzer <[email protected]> |
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation
drm/amdgpu: Drop inline from amdgpu_ras_eeprom_max_record_count
This was unusual; normally, inline functions are declared static as well, and defined in a header file if used by multiple compilation units. The latter would be more involved in this case, so just drop the inline declaration for now.
Fixes compile failure building for ppc64le on RHEL 8:
In file included from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h:32, from ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:33: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c: In function ‘amdgpu_ras_recovery_init’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.h:90:17: error: inlining failed in call to ‘always_inline’ ‘amdgpu_ras_eeprom_max_record_count’: function body not available 90 | inline uint32_t amdgpu_ras_eeprom_max_record_count(void); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1985:34: note: called from here 1985 | max_eeprom_records_len = amdgpu_ras_eeprom_max_record_count(); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: c84d46707ebb "drm/amdgpu: validate bad page threshold in ras(v3)" Reviewed-by: Lyude Paul <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7 |
|
| #
c65b0805 |
| 08-Apr-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: RAS EEPROM table is now in debugfs
Add "ras_eeprom_size" file in debugfs, which reports the maximum size allocated to the RAS table in EEROM, as the number of bytes and the number of rec
drm/amdgpu: RAS EEPROM table is now in debugfs
Add "ras_eeprom_size" file in debugfs, which reports the maximum size allocated to the RAS table in EEROM, as the number of bytes and the number of records it could store. For instance,
$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size 262144 bytes or 10921 records $_
Add "ras_eeprom_table" file in debugfs, which dumps the RAS table stored EEPROM, in a formatted way. For instance,
$cat ras_eeprom_table Signature Version FirstOffs Size Checksum 0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage 0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000 1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001 2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002 3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003 4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004 5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005 6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006 7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007 8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008 $_
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: John Clements <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Xinhui Pan <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
63d4c081 |
| 06-Apr-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Optimize EEPROM RAS table I/O
Split functionality between read and write, which simplifies the code and exposes areas of optimization and more or less complexity, and take advantage of t
drm/amdgpu: Optimize EEPROM RAS table I/O
Split functionality between read and write, which simplifies the code and exposes areas of optimization and more or less complexity, and take advantage of that.
Read and write the table in one go; use a separate stage to decode or encode the data, as opposed to on the fly, which keeps the I2C bus busy. Use a single read/write to read/write the table or at most two if the number of records we're reading/writing wraps around.
Check the check-sum of a table in EEPROM on init.
Update the checksum at the same time as when updating the table header signature, when the threshold was increased on boot.
Take advantage of arithmetic modulo 256, that is, use a byte!, to greatly simplify checksum arithmetic.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
017dad64 |
| 16-Jun-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Get rid of test function
The code is now tested from userspace. Remove already macroed out test function.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <Andrey
drm/amdgpu: Get rid of test function
The code is now tested from userspace. Remove already macroed out test function.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
0686627b |
| 16-Jun-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Some renames
Qualify with "ras_". Use kernel's own--don't redefine your own.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Signed-o
drm/amdgpu: Some renames
Qualify with "ras_". Use kernel's own--don't redefine your own.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc6, v5.12-rc5 |
|
| #
e4e6a589 |
| 27-Mar-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Use explicit cardinality for clarity
RAS_MAX_RECORD_NUM may mean the maximum record number, as in the maximum house number on your street, or it may mean the maximum number of records, a
drm/amdgpu: Use explicit cardinality for clarity
RAS_MAX_RECORD_NUM may mean the maximum record number, as in the maximum house number on your street, or it may mean the maximum number of records, as in the count of records, which is also a number. To make this distinction whether the number is ordinal (index) or cardinal (count), rename this macro to RAS_MAX_RECORD_COUNT.
This makes it easy to understand what it refers to, especially when we compute quantities such as, how many records do we have left in the table, especially when there are so many other numbers, quantities and numerical macros around.
Also rename the long, amdgpu_ras_eeprom_get_record_max_length() to the more succinct and clear, amdgpu_ras_eeprom_max_record_count().
When computing the threshold, which also deals with counts, i.e. "how many", use cardinal "max_eeprom_records_count", than the quantitative "max_eeprom_records_len".
Simplify the logic here and there, as well.
Cc: Guchun Chen <[email protected]> Cc: John Clements <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Alexander Deucher <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
803c6ebd |
| 27-Mar-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Simplify RAS EEPROM checksum calculations
Rename update_table_header() to write_table_header() as this function is actually writing it to EEPROM.
Use kernel types; use u8 to carry aroun
drm/amdgpu: Simplify RAS EEPROM checksum calculations
Rename update_table_header() to write_table_header() as this function is actually writing it to EEPROM.
Use kernel types; use u8 to carry around the checksum, in order to take advantage of arithmetic modulo 8-bits (256).
Tidy up to 80 columns.
When updating the checksum, just recalculate the whole thing.
Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc4, v5.12-rc3 |
|
| #
1fab841f |
| 11-Mar-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: RAS xfer to read/write
Wrap amdgpu_ras_eeprom_xfer(..., bool write), into amdgpu_ras_eeprom_read() and amdgpu_ras_eeprom_write(), as that makes reading and understanding the code clearer
drm/amdgpu: RAS xfer to read/write
Wrap amdgpu_ras_eeprom_xfer(..., bool write), into amdgpu_ras_eeprom_read() and amdgpu_ras_eeprom_write(), as that makes reading and understanding the code clearer.
Cc: Jean Delvare <[email protected]> Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Lijo Lazar <[email protected]> Cc: Stanley Yang <[email protected]> Cc: Hawking Zhang <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7 |
|
| #
a4399657 |
| 05-Feb-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: Rename misspelled function
Instead of fixing the spelling in amdgpu_ras_eeprom_process_recods(), rename it to, amdgpu_ras_eeprom_xfer(), to look similar to other I2C and protocol tra
drm/amdgpu: Rename misspelled function
Instead of fixing the spelling in amdgpu_ras_eeprom_process_recods(), rename it to, amdgpu_ras_eeprom_xfer(), to look similar to other I2C and protocol transfer (read/write) functions.
Also to keep the column span to within reason by using a shorter name.
Change the "num" function parameter from "int" to "const u32" since it is the number of items (records) to xfer, i.e. their count, which cannot be a negative number.
Also swap the order of parameters, keeping the pointer to records and their number next to each other, while the direction now becomes the last parameter.
Cc: Jean Delvare <[email protected]> Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Lijo Lazar <[email protected]> Cc: Stanley Yang <[email protected]> Cc: Hawking Zhang <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
ccdfbfec |
| 02-Feb-2021 |
Luben Tuikov <[email protected]> |
drm/amdgpu: RAS and FRU now use 19-bit I2C address
Convert RAS and FRU code to use the 19-bit I2C memory address and remove all "slave_addr", as this is now absolved into the 19-bit address.
Cc: Je
drm/amdgpu: RAS and FRU now use 19-bit I2C address
Convert RAS and FRU code to use the 19-bit I2C memory address and remove all "slave_addr", as this is now absolved into the 19-bit address.
Cc: Jean Delvare <[email protected]> Cc: John Clements <[email protected]> Cc: Alexander Deucher <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Lijo Lazar <[email protected]> Cc: Stanley Yang <[email protected]> Cc: Hawking Zhang <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Alexander Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
11003c68 |
| 26-Feb-2021 |
Dennis Li <[email protected]> |
drm/amdgpu: remove unnecessary reading for epprom header
If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, th
drm/amdgpu: remove unnecessary reading for epprom header
If the number of badpage records exceed the threshold, driver has updated both epprom header and control->tbl_hdr.header before gpu reset, therefore GPU recovery thread no need to read epprom header directly.
v2: merge amdgpu_ras_check_err_threshold into amdgpu_ras_eeprom_check_err_threshold
Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7 |
|
| #
e8fbaf03 |
| 23-Jul-2020 |
Guchun Chen <[email protected]> |
drm/amdgpu: break GPU recovery once it's in bad state(v4)
When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed a
drm/amdgpu: break GPU recovery once it's in bad state(v4)
When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness.
v2: Refine warning message in threshold reaching case, and fix spelling typo.
v3: Fix explicit calling of bad gpu.
v4: Rename function names.
Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
b82e65a9 |
| 23-Jul-2020 |
Guchun Chen <[email protected]> |
drm/amdgpu: break driver init process when it's bad GPU(v5)
When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check.
v2: Fix spelling typo, co
drm/amdgpu: break driver init process when it's bad GPU(v5)
When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check.
v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message.
v3: Refine function argument name.
v4: Fix missing check of returning value of i2c initialization error case.
v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR.
Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|