|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
8fa7292f |
| 05-Apr-2025 |
Thomas Gleixner <[email protected]> |
treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines.
Conversion was done with c
treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
show more ...
|
|
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3 |
|
| #
8bd82363 |
| 05-Jun-2024 |
Christian König <[email protected]> |
drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd (
drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu runpm usage trace for separate funcs").
Taking a runtime pm reference for DMA-buf is actually completely unnecessary and even dangerous.
The problem is that calling pm_runtime_get_sync() from the DMA-buf callbacks is illegal because we have the reservation locked here which is also taken during resume. So this would deadlock.
When the buffer is in GTT it is still accessible even when the GPU is powered down and when it is in VRAM the buffer gets migrated to GTT before powering down.
The only use case which would make it mandatory to keep the runtime pm reference would be if we pin the buffer into VRAM, and that's not something we currently do.
v2: improve the commit message
Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
show more ...
|
| #
030631e9 |
| 05-Jun-2024 |
Christian König <[email protected]> |
drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd (
drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu runpm usage trace for separate funcs").
Taking a runtime pm reference for DMA-buf is actually completely unnecessary and even dangerous.
The problem is that calling pm_runtime_get_sync() from the DMA-buf callbacks is illegal because we have the reservation locked here which is also taken during resume. So this would deadlock.
When the buffer is in GTT it is still accessible even when the GPU is powered down and when it is in VRAM the buffer gets migrated to GTT before powering down.
The only use case which would make it mandatory to keep the runtime pm reference would be if we pin the buffer into VRAM, and that's not something we currently do.
v2: improve the commit message
Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
show more ...
|
|
Revision tags: v6.10-rc2 |
|
| #
3a86fdc4 |
| 31-May-2024 |
Lijo Lazar <[email protected]> |
drm/amdgpu: Skip coredump during resets for debug
Skip scheduling coredump when gpu reset is intentionally triggered through debugfs.
Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asa
drm/amdgpu: Skip coredump during resets for debug
Skip scheduling coredump when gpu reset is intentionally triggered through debugfs.
Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
bac640dd |
| 04-Jun-2024 |
Eric Huang <[email protected]> |
drm/amdgpu: add reset source in various cases
To fullfill the reset event description.
Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Eric Huang <[email protected]> Reviewed-by
drm/amdgpu: add reset source in various cases
To fullfill the reset event description.
Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6 |
|
| #
3d14cb02 |
| 21-Feb-2024 |
Kunwu Chan <[email protected]> |
drm/amdgpu: Simplify the allocation of fence slab caches
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.
Reviewed-by: Christian König <ch
drm/amdgpu: Simplify the allocation of fence slab caches
Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches.
Reviewed-by: Christian König <[email protected]> Signed-off-by: Kunwu Chan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1 |
|
| #
425285d3 |
| 09-Nov-2023 |
Prike Liang <[email protected]> |
drm/amdgpu: add amdgpu runpm usage trace for separate funcs
Add trace for amdgpu runpm separate funcs usage and this will help debugging on the case of runpm usage missed to dereference. In the norm
drm/amdgpu: add amdgpu runpm usage trace for separate funcs
Add trace for amdgpu runpm separate funcs usage and this will help debugging on the case of runpm usage missed to dereference. In the normal case the runpm usage count referred by one kind of functionality pairwise and usage should be changed from 1 to 0, otherwise there will be an issue in the amdgpu runpm usage dereference.
Signed-off-by: Prike Liang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
|
| #
4e8303cf |
| 11-Sep-2023 |
Lijo Lazar <[email protected]> |
drm/amdgpu: Use function for IP version check
Use an inline function for version check. Gives more flexibility to handle any format changes.
Signed-off-by: Lijo Lazar <[email protected]> Reviewed-
drm/amdgpu: Use function for IP version check
Use an inline function for version check. Gives more flexibility to handle any format changes.
Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc1, v6.5, v6.5-rc7 |
|
| #
f1740b1a |
| 14-Aug-2023 |
Tim Huang <[email protected]> |
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are disabled when S0ix suspend but fails to re-enable when resume because of the GFX is in GFXOFF.
[ 203.349571] [drm] Fence fallback timer expired on ring sdma0 [ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0 [ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0
For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers to configure the fence driver interrupts for rings that belong to GFX. The interrupts configuration will be restored by GFXOFF exit.
Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
show more ...
|
| #
603b9a57 |
| 14-Aug-2023 |
Tim Huang <[email protected]> |
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are
drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
GFX v11.0.1 reported fence fallback timer expired issue on SDMA and GFX rings after S0ix resume. This is generated by EOP interrupts are disabled when S0ix suspend but fails to re-enable when resume because of the GFX is in GFXOFF.
[ 203.349571] [drm] Fence fallback timer expired on ring sdma0 [ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0 [ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0
For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers to configure the fence driver interrupts for rings that belong to GFX. The interrupts configuration will be restored by GFXOFF exit.
Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3 |
|
| #
0a33b11d |
| 17-Apr-2023 |
Christian König <[email protected]> |
drm/amdgpu: mark force completed fences with -ECANCELED
When we force complete fences we should mark them as canceled.
Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben T
drm/amdgpu: mark force completed fences with -ECANCELED
When we force complete fences we should mark them as canceled.
Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
b13eb02b |
| 19-Apr-2023 |
Christian König <[email protected]> |
drm/amdgpu: add amdgpu_error_* debugfs file
This allows us to insert some error codes into the bottom of the pipeline on an engine.
Signed-off-by: Christian König <[email protected]> Reviewe
drm/amdgpu: add amdgpu_error_* debugfs file
This allows us to insert some error codes into the bottom of the pipeline on an engine.
Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
109b4d8c |
| 15-May-2023 |
Su Hui <[email protected]> |
drm/amdgpu: remove unnecessary (void*) conversions
No need cast (void*) to (struct amdgpu_device *).
Signed-off-by: Su Hui <[email protected]> Signed-off-by: Alex Deucher <[email protected]
drm/amdgpu: remove unnecessary (void*) conversions
No need cast (void*) to (struct amdgpu_device *).
Signed-off-by: Su Hui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
3083b100 |
| 09-May-2023 |
Guchun Chen <[email protected]> |
drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_devic
drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive.
[ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] <TASK> [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140
Signed-off-by: Guchun Chen <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
6e87c422 |
| 24-Apr-2023 |
Alex Sierra <[email protected]> |
drm/amdgpu: improve wait logic at fence polling
Accomplish this by reading the seq number right away instead of sleep for 5us. There are certain cases where the fence is ready almost immediately. Sl
drm/amdgpu: improve wait logic at fence polling
Accomplish this by reading the seq number right away instead of sleep for 5us. There are certain cases where the fence is ready almost immediately. Sleep number granularity was also reduced as the majority of the kiq tlb flush takes between 2us to 6us.
Signed-off-by: Alex Sierra <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
9e690184 |
| 03-May-2023 |
Srinivasan Shanmugam <[email protected]> |
drm/amd/amdgpu: Fix errors & warnings in amdgpu _bios, _cs, _dma_buf, _fence.c
The following checkpatch errors & warning is removed.
ERROR: else should follow close brace '}' ERROR: trailing statem
drm/amd/amdgpu: Fix errors & warnings in amdgpu _bios, _cs, _dma_buf, _fence.c
The following checkpatch errors & warning is removed.
ERROR: else should follow close brace '}' ERROR: trailing statements should be on next line WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: Possible repeated word: 'Fences' WARNING: Missing a blank line after declarations WARNING: braces {} are not necessary for single statement blocks WARNING: Comparisons should place the constant on the right side of the test WARNING: printk() should include KERN_<LEVEL> facility level
Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
c1a322a7 |
| 09-May-2023 |
Guchun Chen <[email protected]> |
drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_devic
drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive.
[ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] <TASK> [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140
Signed-off-by: Guchun Chen <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3 |
|
| #
033c5647 |
| 16-Mar-2023 |
YuBiao Wang <[email protected]> |
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. Th
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. The fences in this case are never signaled and next time when we try to free the sa_bo, kernel will hang.
[How] During pre_asic_reset, driver will clear job fences and afterwards the fences' refcount will be reduced to 1. For drm_sched_jobs it will be released in job_free_cb, and for non-sched jobs like ib_test, it's meant to be released in sa_bo_free but only when the fences are signaled. So we have to force signal the non_sched bad job's fence during pre_asic_reset or the clear is not complete.
Signed-off-by: YuBiao Wang <[email protected]> Acked-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
f466b111 |
| 16-Mar-2023 |
YuBiao Wang <[email protected]> |
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. Th
drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
[Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. The fences in this case are never signaled and next time when we try to free the sa_bo, kernel will hang.
[How] During pre_asic_reset, driver will clear job fences and afterwards the fences' refcount will be reduced to 1. For drm_sched_jobs it will be released in job_free_cb, and for non-sched jobs like ib_test, it's meant to be released in sa_bo_free but only when the fences are signaled. So we have to force signal the non_sched bad job's fence during pre_asic_reset or the clear is not complete.
Signed-off-by: YuBiao Wang <[email protected]> Acked-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7 |
|
| #
5ad7bbf3 |
| 02-Feb-2023 |
Guilherme G. Piccoli <[email protected]> |
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after t
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after the respective init function - drm_sched_init() - was executed successfully.
Happens that we faced a driver probe failure in the Steam Deck recently, and the function drm_sched_fini() was called even without its counter-part had been previously called, causing the following oops:
amdgpu: probe of 0000:04:00.0 failed with error -110 BUG: kernel NULL pointer dereference, address: 0000000000000090 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022 RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace: <TASK> amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu] amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] devm_drm_dev_init_release+0x49/0x70 [...]
To prevent that, check if the drm_sched was properly initialized for a given ring before calling its fini counter-part.
Notice ideally we'd use sched.ready for that; such field is set as the latest thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such field - in the above oops for example, it was a GFX ring causing the crash, and the sched.ready field was set to true in the ring init routine, regardless of the state of the DRM scheduler. Hence, we ended-up using sched.ops as per Christian's suggestion [0], and also removed the no_scheduler check [1].
[0] https://lore.kernel.org/amd-gfx/[email protected]/ [1] https://lore.kernel.org/amd-gfx/[email protected]/
Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)") Suggested-by: Christian König <[email protected]> Cc: Guchun Chen <[email protected]> Cc: Luben Tuikov <[email protected]> Cc: Mario Limonciello <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Guilherme G. Piccoli <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
show more ...
|
| #
70f1872e |
| 02-Feb-2023 |
Guilherme G. Piccoli <[email protected]> |
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after t
drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after the respective init function - drm_sched_init() - was executed successfully.
Happens that we faced a driver probe failure in the Steam Deck recently, and the function drm_sched_fini() was called even without its counter-part had been previously called, causing the following oops:
amdgpu: probe of 0000:04:00.0 failed with error -110 BUG: kernel NULL pointer dereference, address: 0000000000000090 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022 RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace: <TASK> amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu] amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] devm_drm_dev_init_release+0x49/0x70 [...]
To prevent that, check if the drm_sched was properly initialized for a given ring before calling its fini counter-part.
Notice ideally we'd use sched.ready for that; such field is set as the latest thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such field - in the above oops for example, it was a GFX ring causing the crash, and the sched.ready field was set to true in the ring init routine, regardless of the state of the DRM scheduler. Hence, we ended-up using sched.ops as per Christian's suggestion [0], and also removed the no_scheduler check [1].
[0] https://lore.kernel.org/amd-gfx/[email protected]/ [1] https://lore.kernel.org/amd-gfx/[email protected]/
Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)") Suggested-by: Christian König <[email protected]> Cc: Guchun Chen <[email protected]> Cc: Luben Tuikov <[email protected]> Cc: Mario Limonciello <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Guilherme G. Piccoli <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5 |
|
| #
3f4c175d |
| 07-Sep-2022 |
Jiadong.Zhu <[email protected]> |
drm/amdgpu: MCBP based on DRM scheduler (v9)
Trigger Mid-Command Buffer Preemption according to the priority of the software rings and the hw fence signalling condition.
The muxer saves the locatio
drm/amdgpu: MCBP based on DRM scheduler (v9)
Trigger Mid-Command Buffer Preemption according to the priority of the software rings and the hw fence signalling condition.
The muxer saves the locations of the indirect buffer frames from the software ring together with the fence sequence number in its fifo queue, and pops out those records when the fences are signalled. The locations are used to resubmit packages in preemption scenarios by coping the chunks from the software ring.
v2: Update comment style. v3: Fix conflict caused by previous modifications. v4: Remove unnecessary prints. v5: Fix corner cases for resubmission cases. v6: Refactor functions for resubmission, calling fence_process in irq handler. v7: Solve conflict for removing amdgpu_sw_ring.c. v8: Add time threshold to judge if preemption request is needed. v9: Correct comment spelling. Set fence emit timestamp before rsu assignment.
Cc: Christian Koenig <[email protected]> Cc: Luben Tuikov <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Michel Dänzer <[email protected]> Signed-off-by: Jiadong.Zhu <[email protected]> Acked-by: Luben Tuikov <[email protected]> Acked-by: Huang Rui <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
e7b8e90a |
| 23-Sep-2022 |
Jiadong.Zhu <[email protected]> |
drm/amdgpu: Remove fence_process in count_emitted
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Ch
drm/amdgpu: Remove fence_process in count_emitted
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Christian König <[email protected]> Signed-off-by: Jiadong.Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
11e38360 |
| 23-Sep-2022 |
Jiadong.Zhu <[email protected]> |
drm/amdgpu: Remove fence_process in count_emitted
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Ch
drm/amdgpu: Remove fence_process in count_emitted
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler.
Reviewed-by: Christian König <[email protected]> Signed-off-by: Jiadong.Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8 |
|
| #
9dd4545f |
| 21-Jul-2022 |
Slark Xiao <[email protected]> |
drm/amd: Fix typo 'the the' in comment
Replace 'the the' with 'the' in the comment.
Signed-off-by: Slark Xiao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
|