|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
d4c60219 |
| 26-Feb-2025 |
Tony Yi <[email protected]> |
drm/amdgpu: Update headers for CPER support on SRIOV
Update amdgv_sriovmsg.h and mxgpu_nv.h to add new definitions for CPER support on VFs. PMFW ACA messages are not available on VFs, and VFs must q
drm/amdgpu: Update headers for CPER support on SRIOV
Update amdgv_sriovmsg.h and mxgpu_nv.h to add new definitions for CPER support on VFs. PMFW ACA messages are not available on VFs, and VFs must query CPERs from host.
Signed-off-by: Tony Yi <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6 |
|
| #
60c58d72 |
| 30-Oct-2024 |
Victor Skvortsov <[email protected]> |
drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
The SRIOV PF/VF Data exchange is extended by 64KB for VF RAS Telemetry data. Add Host RAS Telemetry enable capabilities bitfields.
drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support
The SRIOV PF/VF Data exchange is extended by 64KB for VF RAS Telemetry data. Add Host RAS Telemetry enable capabilities bitfields. Add a new VF msg REQ_RAS_ERROR_COUNT, the host response data will be populated in the RAS Telemetry region.
Signed-off-by: Victor Skvortsov <[email protected]> Reviewed-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3 |
|
| #
ef6c2cb3 |
| 07-Aug-2024 |
Victor Zhao <[email protected]> |
drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT
on MI300/MI308 UBB products, when doing mode1 reset, since 1 gpu need to wait all 8 gpus finish mode1 reset and then do re-init. As observed, somet
drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT
on MI300/MI308 UBB products, when doing mode1 reset, since 1 gpu need to wait all 8 gpus finish mode1 reset and then do re-init. As observed, sometimes the gpu which triggered the reset need to wait 15s for all gpus to finish.
If poll msg timeout, guest driver will send the reset message again, and may mess up the following reinit sequence on other gpus.
So extend the time to cover the maximum time needed to recover.
Signed-off-by: Victor Zhao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
cbda2758 |
| 24-Jun-2024 |
Vignesh Chander <[email protected]> |
drm/amdgpu: process RAS fatal error MB notification
For RAS error scenario, VF guest driver will check mailbox and set fed flag to avoid unnecessary HW accesses. additionally, poll for reset complet
drm/amdgpu: process RAS fatal error MB notification
For RAS error scenario, VF guest driver will check mailbox and set fed flag to avoid unnecessary HW accesses. additionally, poll for reset completion message first to avoid accidentally spamming multiple reset requests to host.
v2: add another mailbox check for handling case where kfd detects timeout first
v3: set host_flr bit and use wait_for_reset
Signed-off-by: Vignesh Chander <[email protected]> Reviewed-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1 |
|
| #
2474414c |
| 21-Jan-2024 |
Victor Skvortsov <[email protected]> |
drm/amdgpu: Add RAS_POISON_READY host response message
In a non-FLR page avoidance scenario, the host driver will provide the bad pages in the pf2vf exchange region.
Adding a new host response mess
drm/amdgpu: Add RAS_POISON_READY host response message
In a non-FLR page avoidance scenario, the host driver will provide the bad pages in the pf2vf exchange region.
Adding a new host response message to indicate when the pf2vf exchange region has been updated.
Signed-off-by: Victor Skvortsov <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1 |
|
| #
ae844dd7 |
| 06-Dec-2022 |
Tao Zhou <[email protected]> |
drm/amdgpu: add RAS poison consumption handler for NV SRIOV
Send handling request to host.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-
drm/amdgpu: add RAS poison consumption handler for NV SRIOV
Send handling request to host.
Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4 |
|
| #
3e183e2f |
| 19-Mar-2021 |
Jiange Zhao <[email protected]> |
drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.
When guest received FLR notification from host, it would lock adapter into reset state. There will be no more job sub
drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.
When guest received FLR notification from host, it would lock adapter into reset state. There will be no more job submission and hardware access after that.
Then it should send a response to host that it has prepared for host reset.
Signed-off-by: Jiange Zhao <[email protected]> Signed-off-by: Peng Ju Zhou <[email protected]> Reviewed-by: Emily.Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6 |
|
| #
3aa883ac |
| 25-Nov-2020 |
Jiange Zhao <[email protected]> |
drm/amdgpu/SRIOV: Extend VF reset request wait period
In Virtualization case, when one VF is sending too many FLR requests, hypervisor would stop responding to this VF's request for a long period of
drm/amdgpu/SRIOV: Extend VF reset request wait period
In Virtualization case, when one VF is sending too many FLR requests, hypervisor would stop responding to this VF's request for a long period of time. This is called event guard. During this period of cooling time, guest driver should wait instead of doing other things. After this period of time, guest driver would resume reset process and return to normal.
Currently, guest driver would wait 12 seconds and return fail if it doesn't get response from host.
Solution: extend this waiting time in guest driver and poll response periodically. Poll happens every 6 seconds and it will last for 60 seconds.
v2: change the max repetition times from number to macro.
Signed-off-by: Jiange Zhao <[email protected]> Acked-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3 |
|
| #
312a79b6 |
| 21-Apr-2020 |
Monk Liu <[email protected]> |
drm/amdgpu: extent threshold of waiting FLR_COMPLETE
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to finish
Signed-off-by: Monk Liu <[email protected]> Acked-by: Yintian Tao <[email protected]
drm/amdgpu: extent threshold of waiting FLR_COMPLETE
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to finish
Signed-off-by: Monk Liu <[email protected]> Acked-by: Yintian Tao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5 |
|
| #
ff1f03a7 |
| 04-Mar-2020 |
Monk Liu <[email protected]> |
drm/amdgpu: use static mmio offset for NV mailbox
what: with the new "req_init_data" handshake we need to use mailbox before do IP discovery, so in mxgpu_nv.c file the original SOC15_REG method won'
drm/amdgpu: use static mmio offset for NV mailbox
what: with the new "req_init_data" handshake we need to use mailbox before do IP discovery, so in mxgpu_nv.c file the original SOC15_REG method won'twork because that depends on IP discovery complete first.
how: so the solution is to always use static MMIO offset for NV+ mailbox registers. HW team confirm us all MAILBOX registers will be at the same offset for all ASICs, no IP discovery needed for those registers
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
aa53bc2e |
| 04-Mar-2020 |
Monk Liu <[email protected]> |
drm/amdgpu: introduce new request and its function
1) modify xgpu_nv_send_access_requests to support new idh request
2) introduce new function: req_gpu_init_data() which is used to notify host to p
drm/amdgpu: introduce new request and its function
1) modify xgpu_nv_send_access_requests to support new idh request
2) introduce new function: req_gpu_init_data() which is used to notify host to prepare vbios/ip-discovery/pfvf exchange
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
c27cbdd2 |
| 03-Mar-2020 |
Monk Liu <[email protected]> |
drm/amdgpu: introduce new idh_request/event enum
new idh_request and ihd_event to prepare for the new handshake protocol implementation later
Signed-off-by: Monk Liu <[email protected]> Reviewed-by:
drm/amdgpu: introduce new idh_request/event enum
new idh_request and ihd_event to prepare for the new handshake protocol implementation later
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
4d130238 |
| 03-Mar-2020 |
Monk Liu <[email protected]> |
drm/amdgpu: cleanup idh event/req for NV headers
1) drop the headers from AI in mxgpu_nv.c, should refer to mxgpu_nv.h
2) the IDH_EVENT_MAX is not used and not aligned with host side so drop it
drm/amdgpu: cleanup idh event/req for NV headers
1) drop the headers from AI in mxgpu_nv.c, should refer to mxgpu_nv.h
2) the IDH_EVENT_MAX is not used and not aligned with host side so drop it 3) the IDH_TEXT_MESSAG was provided in host but not defined in guest
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8 |
|
| #
3636169c |
| 04-Sep-2019 |
Jiange Zhao <[email protected]> |
drm/amdgpu: Add SRIOV mailbox backend for Navi1x
Mimic the ones for Vega10, add mailbox backend for Navi1x
Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Jiange Zhao <[email protected]
drm/amdgpu: Add SRIOV mailbox backend for Navi1x
Mimic the ones for Vega10, add mailbox backend for Navi1x
Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Jiange Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|