| a5b230e7 | 18-Mar-2025 |
Brendan King <[email protected]> |
drm/imagination: fix firmware memory leaks
Free the memory used to hold the results of firmware image processing when the module is unloaded.
Fix the related issue of the same memory being leaked i
drm/imagination: fix firmware memory leaks
Free the memory used to hold the results of firmware image processing when the module is unloaded.
Fix the related issue of the same memory being leaked if processing of the firmware image fails during module load.
Ensure all firmware GEM objects are destroyed if firmware image processing fails.
Fixes memory leaks on powervr module unload detected by Kmemleak:
unreferenced object 0xffff000042e20000 (size 94208): comm "modprobe", pid 470, jiffies 4295277154 hex dump (first 32 bytes): 02 ae 7f ed bf 45 84 00 3c 5b 1f ed 9f 45 45 05 .....E..<[...EE. d5 4f 5d 14 6c 00 3d 23 30 d0 3a 4a 66 0e 48 c8 .O].l.=#0.:Jf.H. backtrace (crc dd329dec): kmemleak_alloc+0x30/0x40 ___kmalloc_large_node+0x140/0x188 __kmalloc_large_node_noprof+0x2c/0x13c __kmalloc_noprof+0x48/0x4c0 pvr_fw_init+0xaa4/0x1f50 [powervr]
unreferenced object 0xffff000042d20000 (size 20480): comm "modprobe", pid 470, jiffies 4295277154 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 09 00 00 00 0b 00 00 00 ................ 00 00 00 00 00 00 00 00 07 00 00 00 08 00 00 00 ................ backtrace (crc 395b02e3): kmemleak_alloc+0x30/0x40 ___kmalloc_large_node+0x140/0x188 __kmalloc_large_node_noprof+0x2c/0x13c __kmalloc_noprof+0x48/0x4c0 pvr_fw_init+0xb0c/0x1f50 [powervr]
Cc: [email protected] Fixes: cc1aeedb98ad ("drm/imagination: Implement firmware infrastructure and META FW support") Signed-off-by: Brendan King <[email protected]> Reviewed-by: Matt Coster <[email protected]> Link: https://lore.kernel.org/r/20250318-ddkopsrc-1339-firmware-related-memory-leak-on-module-unload-v1-1-155337c57bb4@imgtec.com Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| 4ba2abe1 | 18-Mar-2025 |
Brendan King <[email protected]> |
drm/imagination: take paired job reference
For paired jobs, have the fragment job take a reference on the geometry job, so that the geometry job cannot be freed until the fragment job has finished w
drm/imagination: take paired job reference
For paired jobs, have the fragment job take a reference on the geometry job, so that the geometry job cannot be freed until the fragment job has finished with it.
The geometry job structure is accessed when the fragment job is being prepared by the GPU scheduler. Taking the reference prevents the geometry job being freed until the fragment job no longer requires it.
Fixes a use after free bug detected by KASAN:
[ 124.256386] BUG: KASAN: slab-use-after-free in pvr_queue_prepare_job+0x108/0x868 [powervr] [ 124.264893] Read of size 1 at addr ffff0000084cb960 by task kworker/u16:4/63
Cc: [email protected] Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <[email protected]> Reviewed-by: Matt Coster <[email protected]> Link: https://lore.kernel.org/r/20250318-ddkopsrc-1337-use-after-free-in-pvr_queue_prepare_job-v1-1-80fb30d044a6@imgtec.com Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| 1d2eabb6 | 21-Feb-2025 |
Alessio Belle <[email protected]> |
drm/imagination: Fix timestamps in firmware traces
When firmware traces are enabled, the firmware dumps 48-bit timestamps for each trace as two 32-bit values, highest 32 bits (of which only 16 usefu
drm/imagination: Fix timestamps in firmware traces
When firmware traces are enabled, the firmware dumps 48-bit timestamps for each trace as two 32-bit values, highest 32 bits (of which only 16 useful) first.
The driver was reassembling them the other way round i.e. interpreting the first value in memory as the lowest 32 bits, and the second value as the highest 32 bits (then truncated to 16 bits).
Due to this, firmware trace dumps showed very large timestamps even for traces recorded shortly after GPU boot. The timestamps in these dumps would also sometimes jump backwards because of the truncation.
Example trace dumped after loading the powervr module and enabling firmware traces, where each line is commented with the timestamp value in hexadecimal to better show both issues:
[93540092739584] : Host Sync Partition marker: 1 // 0x551300000000 [28419798597632] : GPU units deinit // 0x19d900000000 [28548647616512] : GPU deinit // 0x19f700000000
Update logic to reassemble the timestamps halves in the correct order.
Fixes: cb56cd610866 ("drm/imagination: Add firmware trace to debugfs") Signed-off-by: Alessio Belle <[email protected]> Reviewed-by: Matt Coster <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| a5c4c3ba | 26-Feb-2025 |
Brendan King <[email protected]> |
drm/imagination: Hold drm_gem_gpuva lock for unmap
Avoid a warning from drm_gem_gpuva_assert_lock_held in drm_gpuva_unlink.
The Imagination driver uses the GEM object reservation lock to protect th
drm/imagination: Hold drm_gem_gpuva lock for unmap
Avoid a warning from drm_gem_gpuva_assert_lock_held in drm_gpuva_unlink.
The Imagination driver uses the GEM object reservation lock to protect the gpuva list, but the GEM object was not always known in the code paths that ended up calling drm_gpuva_unlink. When the GEM object isn't known, it is found by calling drm_gpuva_find to lookup the object associated with a given virtual address range, or by calling drm_gpuva_find_first when removing all mappings.
Cc: [email protected] Fixes: 4bc736f890ce ("drm/imagination: vm: make use of GPUVM's drm_exec helper") Signed-off-by: Brendan King <[email protected]> Reviewed-by: Matt Coster <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-hold-drm_gem_gpuva-lock-for-unmap-v2-1-3fdacded227f@imgtec.com Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| df1a1ed5 | 26-Feb-2025 |
Brendan King <[email protected]> |
drm/imagination: avoid deadlock on fence release
Do scheduler queue fence release processing on a workqueue, rather than in the release function itself.
Fixes deadlock issues such as the following:
drm/imagination: avoid deadlock on fence release
Do scheduler queue fence release processing on a workqueue, rather than in the release function itself.
Fixes deadlock issues such as the following:
[ 607.400437] ============================================ [ 607.405755] WARNING: possible recursive locking detected [ 607.415500] -------------------------------------------- [ 607.420817] weston:zfq0/24149 is trying to acquire lock: [ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr] [ 607.436728] but task is already holding lock: [ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554 [ 607.451727] other info that might help us debug this: [ 607.458245] Possible unsafe locking scenario:
[ 607.464155] CPU0 [ 607.466601] ---- [ 607.469044] lock(reservation_ww_class_mutex); [ 607.473584] lock(reservation_ww_class_mutex); [ 607.478114] *** DEADLOCK ***
Cc: [email protected] Fixes: eaf01ee5ba28 ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <[email protected]> Reviewed-by: Matt Coster <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| b04ce1e7 | 18-Oct-2024 |
Brendan King <[email protected]> |
drm/imagination: Break an object reference loop
When remaining resources are being cleaned up on driver close, outstanding VM mappings may result in resources being leaked, due to an object referenc
drm/imagination: Break an object reference loop
When remaining resources are being cleaned up on driver close, outstanding VM mappings may result in resources being leaked, due to an object reference loop, as shown below, with each object (or set of objects) referencing the object below it:
PVR GEM Object GPU scheduler "finished" fence GPU scheduler “scheduled” fence PVR driver “done” fence PVR Context PVR VM Context PVR VM Mappings PVR GEM Object
The reference that the PVR VM Context has on the VM mappings is a soft one, in the sense that the freeing of outstanding VM mappings is done as part of VM context destruction; no reference counts are involved, as is the case for all the other references in the loop.
To break the reference loop during cleanup, free the outstanding VM mappings before destroying the PVR Context associated with the VM context.
Signed-off-by: Brendan King <[email protected]> Signed-off-by: Matt Coster <[email protected]> Reviewed-by: Frank Binns <[email protected]> Cc: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
show more ...
|
| b2ef8087 | 26-Aug-2024 |
Christian König <[email protected]> |
drm/sched: add optional errno to drm_sched_start()
The current implementation of drm_sched_start uses a hardcoded -ECANCELED to dispose of a job when the parent/hw fence is NULL. This results in drm
drm/sched: add optional errno to drm_sched_start()
The current implementation of drm_sched_start uses a hardcoded -ECANCELED to dispose of a job when the parent/hw fence is NULL. This results in drm_sched_job_done being called with -ECANCELED for each job with a NULL parent in the pending list, making it difficult to distinguish between recovery methods, whether a queue reset or a full GPU reset was used.
To improve this, we first try a soft recovery for timeout jobs and use the error code -ENODATA. If soft recovery fails, we proceed with a queue reset, where the error code remains -ENODATA for the job. Finally, for a full GPU reset, we use error codes -ECANCELED or -ETIME. This patch adds an error code parameter to drm_sched_start, allowing us to differentiate between queue reset and GPU reset failures. This enables user mode and test applications to validate the expected correctness of the requested operation. After a successful queue reset, the only way to continue normal operation is to call drm_sched_job_done with the specific error code -ENODATA.
v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain and amdgpu_device_unlock_reset_domain to allow user mode to track the queue reset status and distinguish between queue reset and GPU reset. v2: Christian suggested using the error codes -ENODATA for queue reset and -ECANCELED or -ETIME for GPU reset, returned to amdgpu_cs_wait_ioctl. v3: To meet the requirements, we introduce a new function drm_sched_start_ex with an additional parameter to set dma_fence_set_error, allowing us to handle the specific error codes appropriately and dispose of bad jobs with the selected error code depending on whether it was a queue reset or GPU reset. v4: Alex suggested using a new name, drm_sched_start_with_recovery_error, which more accurately describes the function's purpose. Additionally, it was recommended to add documentation details about the new method. v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex) v6 (chk): rebase on upstream changes, cleanup the commit message, drop the new function again and update all callers, apply the errno also to scheduler fences with hw fences v7 (chk): rebased
Signed-off-by: Jesse Zhang <[email protected]> Signed-off-by: Vitaly Prosyak <[email protected]> Signed-off-by: Christian König <[email protected]> Acked-by: Daniel Vetter <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
show more ...
|
| 2872a57c | 02-Sep-2024 |
Jinjie Ruan <[email protected]> |
drm/imagination: Use memdup_user() helper
Switching to memdup_user(), which combines kmalloc() and copy_from_user(), and it can simplfy code.
Signed-off-by: Jinjie Ruan <[email protected]> Sugg
drm/imagination: Use memdup_user() helper
Switching to memdup_user(), which combines kmalloc() and copy_from_user(), and it can simplfy code.
Signed-off-by: Jinjie Ruan <[email protected]> Suggested-by: Christophe JAILLET <[email protected]> Reviewed-by: Frank Binns <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| 3742c209 | 31-Aug-2024 |
Jinjie Ruan <[email protected]> |
drm/imagination: Use memdup_user() helper to simplify code
Switching to memdup_user(), which combines kmalloc() and copy_from_user(), and it can simplfy code.
Signed-off-by: Jinjie Ruan <ruanjinjie
drm/imagination: Use memdup_user() helper to simplify code
Switching to memdup_user(), which combines kmalloc() and copy_from_user(), and it can simplfy code.
Signed-off-by: Jinjie Ruan <[email protected]> Reviewed-by: Frank Binns <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Matt Coster <[email protected]>
show more ...
|
| eb4accc5 | 30-Aug-2024 |
Matt Coster <[email protected]> |
drm/imagination: Use pvr_vm_context_get()
I missed this open-coded kref_get() while trying to debug a refcount bug, so let's use the helper function here to avoid that waste of time again in the fut
drm/imagination: Use pvr_vm_context_get()
I missed this open-coded kref_get() while trying to debug a refcount bug, so let's use the helper function here to avoid that waste of time again in the future.
Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Reviewed-by: Frank Binns <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Matt Coster <[email protected]>
show more ...
|