History log of /linux-6.15/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c (Results 1 – 25 of 176)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1
# 8fa7292f 05-Apr-2025 Thomas Gleixner <[email protected]>

treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with c

treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

show more ...


Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3
# 8bd82363 05-Jun-2024 Christian König <[email protected]>

drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2

This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference
when we attach a buffer") and commit 425285d39afd (

drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2

This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference
when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu
runpm usage trace for separate funcs").

Taking a runtime pm reference for DMA-buf is actually completely
unnecessary and even dangerous.

The problem is that calling pm_runtime_get_sync() from the DMA-buf
callbacks is illegal because we have the reservation locked here
which is also taken during resume. So this would deadlock.

When the buffer is in GTT it is still accessible even when the GPU
is powered down and when it is in VRAM the buffer gets migrated to
GTT before powering down.

The only use case which would make it mandatory to keep the runtime
pm reference would be if we pin the buffer into VRAM, and that's not
something we currently do.

v2: improve the commit message

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
CC: [email protected]

show more ...


# 030631e9 05-Jun-2024 Christian König <[email protected]>

drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2

This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference
when we attach a buffer") and commit 425285d39afd (

drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2

This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference
when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu
runpm usage trace for separate funcs").

Taking a runtime pm reference for DMA-buf is actually completely
unnecessary and even dangerous.

The problem is that calling pm_runtime_get_sync() from the DMA-buf
callbacks is illegal because we have the reservation locked here
which is also taken during resume. So this would deadlock.

When the buffer is in GTT it is still accessible even when the GPU
is powered down and when it is in VRAM the buffer gets migrated to
GTT before powering down.

The only use case which would make it mandatory to keep the runtime
pm reference would be if we pin the buffer into VRAM, and that's not
something we currently do.

v2: improve the commit message

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
CC: [email protected]

show more ...


Revision tags: v6.10-rc2
# 3a86fdc4 31-May-2024 Lijo Lazar <[email protected]>

drm/amdgpu: Skip coredump during resets for debug

Skip scheduling coredump when gpu reset is intentionally triggered
through debugfs.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asa

drm/amdgpu: Skip coredump during resets for debug

Skip scheduling coredump when gpu reset is intentionally triggered
through debugfs.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# bac640dd 04-Jun-2024 Eric Huang <[email protected]>

drm/amdgpu: add reset source in various cases

To fullfill the reset event description.

Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by

drm/amdgpu: add reset source in various cases

To fullfill the reset event description.

Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6
# 3d14cb02 21-Feb-2024 Kunwu Chan <[email protected]>

drm/amdgpu: Simplify the allocation of fence slab caches

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.

Reviewed-by: Christian König <ch

drm/amdgpu: Simplify the allocation of fence slab caches

Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
to simplify the creation of SLAB caches.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Kunwu Chan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1
# 425285d3 09-Nov-2023 Prike Liang <[email protected]>

drm/amdgpu: add amdgpu runpm usage trace for separate funcs

Add trace for amdgpu runpm separate funcs usage and this will
help debugging on the case of runpm usage missed to dereference.
In the norm

drm/amdgpu: add amdgpu runpm usage trace for separate funcs

Add trace for amdgpu runpm separate funcs usage and this will
help debugging on the case of runpm usage missed to dereference.
In the normal case the runpm usage count referred by one kind
of functionality pairwise and usage should be changed from 1 to 0,
otherwise there will be an issue in the amdgpu runpm usage
dereference.

Signed-off-by: Prike Liang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2
# 4e8303cf 11-Sep-2023 Lijo Lazar <[email protected]>

drm/amdgpu: Use function for IP version check

Use an inline function for version check. Gives more flexibility to
handle any format changes.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-

drm/amdgpu: Use function for IP version check

Use an inline function for version check. Gives more flexibility to
handle any format changes.

Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.6-rc1, v6.5, v6.5-rc7
# f1740b1a 14-Aug-2023 Tim Huang <[email protected]>

drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are

drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are disabled when S0ix suspend but fails to
re-enable when resume because of the GFX is in GFXOFF.

[ 203.349571] [drm] Fence fallback timer expired on ring sdma0
[ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0
[ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0

For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers
to configure the fence driver interrupts for rings that belong to GFX.
The interrupts configuration will be restored by GFXOFF exit.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

show more ...


# 603b9a57 14-Aug-2023 Tim Huang <[email protected]>

drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are

drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix

GFX v11.0.1 reported fence fallback timer expired issue on
SDMA and GFX rings after S0ix resume. This is generated by
EOP interrupts are disabled when S0ix suspend but fails to
re-enable when resume because of the GFX is in GFXOFF.

[ 203.349571] [drm] Fence fallback timer expired on ring sdma0
[ 203.349572] [drm] Fence fallback timer expired on ring gfx_0.0.0
[ 203.861635] [drm] Fence fallback timer expired on ring gfx_0.0.0

For S0ix, GFX is in GFXOFF state, avoid to touch the GFX registers
to configure the fence driver interrupts for rings that belong to GFX.
The interrupts configuration will be restored by GFXOFF exit.

Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3
# 0a33b11d 17-Apr-2023 Christian König <[email protected]>

drm/amdgpu: mark force completed fences with -ECANCELED

When we force complete fences we should mark them as canceled.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Luben T

drm/amdgpu: mark force completed fences with -ECANCELED

When we force complete fences we should mark them as canceled.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# b13eb02b 19-Apr-2023 Christian König <[email protected]>

drm/amdgpu: add amdgpu_error_* debugfs file

This allows us to insert some error codes into the bottom of the pipeline
on an engine.

Signed-off-by: Christian König <[email protected]>
Reviewe

drm/amdgpu: add amdgpu_error_* debugfs file

This allows us to insert some error codes into the bottom of the pipeline
on an engine.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 109b4d8c 15-May-2023 Su Hui <[email protected]>

drm/amdgpu: remove unnecessary (void*) conversions

No need cast (void*) to (struct amdgpu_device *).

Signed-off-by: Su Hui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]

drm/amdgpu: remove unnecessary (void*) conversions

No need cast (void*) to (struct amdgpu_device *).

Signed-off-by: Su Hui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 3083b100 09-May-2023 Guchun Chen <[email protected]>

drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

When performing device unbind or halt, we have disabled all irqs at the
very begining like amdgpu_pci_remove or amdgpu_devic

drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

When performing device unbind or halt, we have disabled all irqs at the
very begining like amdgpu_pci_remove or amdgpu_device_halt. So
amdgpu_irq_put for irqs stored in fence driver should not be called
any more, otherwise, below calltrace will arrive.

[ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu]
[ 139.114655] Call Trace:
[ 139.114655] <TASK>
[ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu]
[ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu]
[ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu]
[ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu]
[ 139.115193] ? __pm_runtime_resume+0x64/0x90
[ 139.115195] pci_device_remove+0x3a/0xb0
[ 139.115197] device_remove+0x43/0x70
[ 139.115198] device_release_driver_internal+0xbd/0x140

Signed-off-by: Guchun Chen <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 6e87c422 24-Apr-2023 Alex Sierra <[email protected]>

drm/amdgpu: improve wait logic at fence polling

Accomplish this by reading the seq number right away instead of sleep
for 5us. There are certain cases where the fence is ready almost
immediately. Sl

drm/amdgpu: improve wait logic at fence polling

Accomplish this by reading the seq number right away instead of sleep
for 5us. There are certain cases where the fence is ready almost
immediately. Sleep number granularity was also reduced as the majority
of the kiq tlb flush takes between 2us to 6us.

Signed-off-by: Alex Sierra <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 9e690184 03-May-2023 Srinivasan Shanmugam <[email protected]>

drm/amd/amdgpu: Fix errors & warnings in amdgpu _bios, _cs, _dma_buf, _fence.c

The following checkpatch errors & warning is removed.

ERROR: else should follow close brace '}'
ERROR: trailing statem

drm/amd/amdgpu: Fix errors & warnings in amdgpu _bios, _cs, _dma_buf, _fence.c

The following checkpatch errors & warning is removed.

ERROR: else should follow close brace '}'
ERROR: trailing statements should be on next line
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: Possible repeated word: 'Fences'
WARNING: Missing a blank line after declarations
WARNING: braces {} are not necessary for single statement blocks
WARNING: Comparisons should place the constant on the right side of the test
WARNING: printk() should include KERN_<LEVEL> facility level

Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# c1a322a7 09-May-2023 Guchun Chen <[email protected]>

drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

When performing device unbind or halt, we have disabled all irqs at the
very begining like amdgpu_pci_remove or amdgpu_devic

drm/amdgpu: skip disabling fence driver src_irqs when device is unplugged

When performing device unbind or halt, we have disabled all irqs at the
very begining like amdgpu_pci_remove or amdgpu_device_halt. So
amdgpu_irq_put for irqs stored in fence driver should not be called
any more, otherwise, below calltrace will arrive.

[ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu]
[ 139.114655] Call Trace:
[ 139.114655] <TASK>
[ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu]
[ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu]
[ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu]
[ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu]
[ 139.115193] ? __pm_runtime_resume+0x64/0x90
[ 139.115195] pci_device_remove+0x3a/0xb0
[ 139.115197] device_remove+0x43/0x70
[ 139.115198] device_release_driver_internal+0xbd/0x140

Signed-off-by: Guchun Chen <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3
# 033c5647 16-Mar-2023 YuBiao Wang <[email protected]>

drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. Th

drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.

[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.

Signed-off-by: YuBiao Wang <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# f466b111 16-Mar-2023 YuBiao Wang <[email protected]>

drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. Th

drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs

[Why]
For engines not supporting soft reset, i.e. VCN, there will be a failed
ib test before mode 1 reset during asic reset. The fences in this case
are never signaled and next time when we try to free the sa_bo, kernel
will hang.

[How]
During pre_asic_reset, driver will clear job fences and afterwards the
fences' refcount will be reduced to 1. For drm_sched_jobs it will be
released in job_free_cb, and for non-sched jobs like ib_test, it's meant
to be released in sa_bo_free but only when the fences are signaled. So
we have to force signal the non_sched bad job's fence during
pre_asic_reset or the clear is not complete.

Signed-off-by: YuBiao Wang <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7
# 5ad7bbf3 02-Feb-2023 Guilherme G. Piccoli <[email protected]>

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after t

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

Happens that we faced a driver probe failure in the Steam Deck
recently, and the function drm_sched_fini() was called even without
its counter-part had been previously called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
<TASK>
amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
devm_drm_dev_init_release+0x49/0x70
[...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Notice ideally we'd use sched.ready for that; such field is set as the latest
thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such
field - in the above oops for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended-up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/[email protected]/
[1] https://lore.kernel.org/amd-gfx/[email protected]/

Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <[email protected]>
Cc: Guchun Chen <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Mario Limonciello <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]

show more ...


# 70f1872e 02-Feb-2023 Guilherme G. Piccoli <[email protected]>

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after t

drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

Happens that we faced a driver probe failure in the Steam Deck
recently, and the function drm_sched_fini() was called even without
its counter-part had been previously called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
<TASK>
amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
devm_drm_dev_init_release+0x49/0x70
[...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Notice ideally we'd use sched.ready for that; such field is set as the latest
thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such
field - in the above oops for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended-up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/[email protected]/
[1] https://lore.kernel.org/amd-gfx/[email protected]/

Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <[email protected]>
Cc: Guchun Chen <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Mario Limonciello <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5
# 3f4c175d 07-Sep-2022 Jiadong.Zhu <[email protected]>

drm/amdgpu: MCBP based on DRM scheduler (v9)

Trigger Mid-Command Buffer Preemption according to the priority of the software
rings and the hw fence signalling condition.

The muxer saves the locatio

drm/amdgpu: MCBP based on DRM scheduler (v9)

Trigger Mid-Command Buffer Preemption according to the priority of the software
rings and the hw fence signalling condition.

The muxer saves the locations of the indirect buffer frames from the software
ring together with the fence sequence number in its fifo queue, and pops out
those records when the fences are signalled. The locations are used to resubmit
packages in preemption scenarios by coping the chunks from the software ring.

v2: Update comment style.
v3: Fix conflict caused by previous modifications.
v4: Remove unnecessary prints.
v5: Fix corner cases for resubmission cases.
v6: Refactor functions for resubmission, calling fence_process in irq handler.
v7: Solve conflict for removing amdgpu_sw_ring.c.
v8: Add time threshold to judge if preemption request is needed.
v9: Correct comment spelling. Set fence emit timestamp before rsu assignment.

Cc: Christian Koenig <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Jiadong.Zhu <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Acked-by: Huang Rui <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# e7b8e90a 23-Sep-2022 Jiadong.Zhu <[email protected]>

drm/amdgpu: Remove fence_process in count_emitted

The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.

Reviewed-by: Ch

drm/amdgpu: Remove fence_process in count_emitted

The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Jiadong.Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


# 11e38360 23-Sep-2022 Jiadong.Zhu <[email protected]>

drm/amdgpu: Remove fence_process in count_emitted

The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.

Reviewed-by: Ch

drm/amdgpu: Remove fence_process in count_emitted

The function amdgpu_fence_count_emitted used in work_hander should not call
amdgpu_fence_process which must be used in irq handler.

Reviewed-by: Christian König <[email protected]>
Signed-off-by: Jiadong.Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>

show more ...


Revision tags: v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8
# 9dd4545f 21-Jul-2022 Slark Xiao <[email protected]>

drm/amd: Fix typo 'the the' in comment

Replace 'the the' with 'the' in the comment.

Signed-off-by: Slark Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>


12345678