History log of /linux-6.15/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h (Results 1 – 21 of 21)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5
# 8c9ee180 18-Jun-2024 Yang Wang <[email protected]>

Revert "drm/amdgpu: change bank cache lock type to spinlock"

This reverts commit 258ed689bc3163f86204f75df6c23f92b59b3fad

revert this patch to modify lock type back to 'mutex' to avoid kernel
callt

Revert "drm/amdgpu: change bank cache lock type to spinlock"

This reverts commit 258ed689bc3163f86204f75df6c23f92b59b3fad

revert this patch to modify lock type back to 'mutex' to avoid kernel
calltrace issue.

[ 602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu]
[ 602.668939] Call Trace:
[ 602.668940] <TASK>
[ 602.668941] dump_stack_lvl+0x4c/0x70
[ 602.668945] dump_stack+0x14/0x20
[ 602.668946] __schedule_bug+0x5a/0x70
[ 602.668950] __schedule+0x940/0xb30
[ 602.668952] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.668955] ? hrtimer_reprogram+0x77/0xb0
[ 602.668957] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.668959] ? hrtimer_start_range_ns+0x126/0x370
[ 602.668961] schedule+0x39/0xe0
[ 602.668962] schedule_hrtimeout_range_clock+0xb1/0x140
[ 602.668964] ? __pfx_hrtimer_wakeup+0x10/0x10
[ 602.668966] schedule_hrtimeout_range+0x17/0x20
[ 602.668967] usleep_range_state+0x69/0x90
[ 602.668970] psp_cmd_submit_buf+0x132/0x570 [amdgpu]
[ 602.669066] psp_ras_invoke+0x75/0x1a0 [amdgpu]
[ 602.669156] psp_ras_query_address+0x9c/0x120 [amdgpu]
[ 602.669245] umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu]
[ 602.669337] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.669339] ? stack_depot_save+0x12/0x20
[ 602.669342] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.669343] ? set_track_prepare+0x52/0x70
[ 602.669346] ? kmemleak_alloc+0x4f/0x90
[ 602.669348] ? __kmalloc_node+0x34b/0x450
[ 602.669352] amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu]
[ 602.669438] mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu]
[ 602.669554] mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu]
[ 602.669655] amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu]
[ 602.669743] ? kmemleak_free+0x36/0x60
[ 602.669745] ? kvfree+0x32/0x40
[ 602.669747] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.669749] ? kfree+0x15d/0x2a0
[ 602.669752] amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu]
[ 602.669839] amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu]
[ 602.669924] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.669925] ? __call_rcu_common.constprop.0+0xa6/0x2b0
[ 602.669929] amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu]
[ 602.670014] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.670017] amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu]
[ 602.670103] amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu]
[ 602.670187] ? srso_alias_return_thunk+0x5/0xfbef5
[ 602.670189] ? __schedule+0x37d/0xb30
[ 602.670191] process_one_work+0x176/0x350
[ 602.670194] worker_thread+0x2f7/0x420
[ 602.670197] ?

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: YiPeng Chai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1
# 258ed689 16-May-2024 Yang Wang <[email protected]>

drm/amdgpu: change bank cache lock type to spinlock

modify the lock type to 'spinlock' to avoid schedule issue
in interrupt context.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: T

drm/amdgpu: change bank cache lock type to spinlock

modify the lock type to 'spinlock' to avoid schedule issue
in interrupt context.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.9, v6.9-rc7, v6.9-rc6
# 5eccab32 23-Apr-2024 Yang Wang <[email protected]>

drm/amdgpu: avoid dump mca bank log muti times during ras ISR

because the ue valid mca count will only be cleared after gpu reset,
so only dump mca log on the first time to get mca bank after receiv

drm/amdgpu: avoid dump mca bank log muti times during ras ISR

because the ue valid mca count will only be cleared after gpu reset,
so only dump mca log on the first time to get mca bank after receive RAS interrupt.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.9-rc5
# 76ad30f5 18-Apr-2024 Yang Wang <[email protected]>

drm/amdgpu: add MCA smu cache support

v1:
because SMU CE valid mca bank will be cleared after reading,
this patch adds mca cache at the driver level to ensure that the mca bank is not lost.

v2:
ref

drm/amdgpu: add MCA smu cache support

v1:
because SMU CE valid mca bank will be cleared after reading,
this patch adds mca cache at the driver level to ensure that the mca bank is not lost.

v2:
refine amdgpu_mca_init/fini/reset() function name.

v3:
add mca_cache.lock support
only add CE bank to mca bank cache.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 7e0357be 18-Apr-2024 Yang Wang <[email protected]>

drm/amdgpu: remove unused MCA driver codes

- remove unused callback functions.
- make part of mca functions static and refine the function order.

Signed-off-by: Yang Wang <[email protected]>
R

drm/amdgpu: remove unused MCA driver codes

- remove unused callback functions.
- make part of mca functions static and refine the function order.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1
# 9dc57c2a 13-Mar-2024 Yang Wang <[email protected]>

drm/amdgpu: add ras event id support

add amdgpu ras event id support to better distinguish different
error information sources in dmesg logs.

the following log will be identify by event id:
{event_

drm/amdgpu: add ras event id support

add amdgpu ras event id support to better distinguish different
error information sources in dmesg logs.

the following log will be identify by event id:
{event_id} interrupt to inform RAS event
{event_id} ACA logs
{event_id} errors statistic since from current injection/error query
{event_id} errors statistic since from gpu load

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1
# afb617f3 15-Jan-2024 YiPeng Chai <[email protected]>

drm/amdgpu: add interface to check mca umc status

Add interface to check mca umc status.

Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-o

drm/amdgpu: add interface to check mca umc status

Add interface to check mca umc status.

Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6
# 058eb519 12-Dec-2023 Hawking Zhang <[email protected]>

drm/amdgpu: Switch to aca bank for xgmi pcs err cnt

Instead of software managed counters.

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Review

drm/amdgpu: Switch to aca bank for xgmi pcs err cnt

Instead of software managed counters.

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.7-rc5
# 37c57631 04-Dec-2023 Yang Wang <[email protected]>

drm/amd/pm: support new mca smu error code decoding

support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Z

drm/amd/pm: support new mca smu error code decoding

support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 0d65efcb 04-Dec-2023 Yang Wang <[email protected]>

drm/amd/pm: support new mca smu error code decoding

support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Z

drm/amd/pm: support new mca smu error code decoding

support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1
# 76d2da18 07-Nov-2023 Yang Wang <[email protected]>

drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support

add pcs xgmi ras error query support for smu v13.0.6.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <Hawk

drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support

add pcs xgmi ras error query support for smu v13.0.6.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# d406aec8 08-Nov-2023 Hawking Zhang <[email protected]>

drm/amdgpu: correct acclerator check architecutre dump

So driver doesn't touch invalid aca entries.

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]

drm/amdgpu: correct acclerator check architecutre dump

So driver doesn't touch invalid aca entries.

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 07c1db70 02-Nov-2023 Yang Wang <[email protected]>

drm/amdgpu: refine smu v13.0.6 mca dump driver

refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function i

drm/amdgpu: refine smu v13.0.6 mca dump driver

refine smu mca driver to support query ras error from pmfw path.
- correct gfx smu bank hwid (from mp5 to smu bank)
- retire unused callback function in amdgpu_mca_smu_funcs{}
- add new mca_bank_set{} structure to collect mca bank
- move enum mca_reg_idx into amdgpu_mca.h header
- add mca status register field decode macro

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1
# 4051844c 06-Sep-2023 Yang Wang <[email protected]>

drm/amdgpu: add amdgpu mca debug sysfs support

add amdgpu mca debug sysfs support.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by

drm/amdgpu: add amdgpu mca debug sysfs support

add amdgpu mca debug sysfs support.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 7ff607e2 05-Sep-2023 Yang Wang <[email protected]>

drm/amdgpu: add amdgpu smu mca dump feature support

add amdgpu smu mca dump feature support.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Sig

drm/amdgpu: add amdgpu smu mca dump feature support

add amdgpu smu mca dump feature support.

Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3
# 7f544c54 15-Mar-2023 Hawking Zhang <[email protected]>

drm/amdgpu: Rework mca ras sw_init

To align with other IP blocks

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley Yang <[email protected]>
Reviewed-by: Tao Zhou <tao.zho

drm/amdgpu: Rework mca ras sw_init

To align with other IP blocks

Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5
# 30e58102 17-Feb-2022 yipechai <[email protected]>

drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block

Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block.

Signed-off-by: yipechai <[email protected]

drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block

Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block.

Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 20c43547 14-Feb-2022 yipechai <[email protected]>

drm/amdgpu: Remove redundant calls of ras_late_init in mca ras block

Remove redundant calls of ras_late_init in mca ras block.

Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <t

drm/amdgpu: Remove redundant calls of ras_late_init in mca ras block

Remove redundant calls of ras_late_init in mca ras block.

Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16
# b0e2062d 05-Jan-2022 yipechai <[email protected]>

drm/amdgpu: Modify mca block to fit for the unified ras block data and ops

1.Modify mca block to fit for the unified ras block data and ops.
2.Define special .ras_block_match function for mca block

drm/amdgpu: Modify mca block to fit for the unified ras block data and ops

1.Modify mca block to fit for the unified ras block data and ops.
2.Define special .ras_block_match function for mca block to identify itself.
3.Change amdgpu_mca_ras_funcs to amdgpu_mca_ras_block(amdgpu_mca_ras had been used), and the corresponding variable name remove _funcs suffix.
4.Remove the const flag of cma ras variable so that cma ras block can be able to be inserted into amdgpu device ras block link list.
5.Invoke amdgpu_ras_register_ras_block function to register cma ras block into amdgpu device ras block link list.
6.Remove the redundant code about cma in amdgpu_ras.c after using the unified ras block.

Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3
# 640ae42e 22-Sep-2021 John Clements <[email protected]>

drm/amdgpu: Updated RAS infrastructure

Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <john.clement

drm/amdgpu: Updated RAS infrastructure

Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v5.15-rc2, v5.15-rc1, v5.14
# 3907c492 24-Aug-2021 John Clements <[email protected]>

drm/amdgpu: Add driver infrastructure for MCA RAS

Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <john.clements@a

drm/amdgpu: Add driver infrastructure for MCA RAS

Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...