|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3 |
|
| #
10e08943 |
| 12-Feb-2025 |
Xiaogang Chen <[email protected]> |
drm/amdkfd: Fix pasid value leak
Curret kfd does not allocate pasid values, instead uses pasid value for each vm from graphic driver. So should not prevent graphic driver from releasing pasid values
drm/amdkfd: Fix pasid value leak
Curret kfd does not allocate pasid values, instead uses pasid value for each vm from graphic driver. So should not prevent graphic driver from releasing pasid values since the values are allocated by graphic driver, not kfd driver anymore. This patch does not stop graphic driver release pasid values.
Fixes: 8544374c0f82 ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver") Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
23b64523 |
| 14-Jan-2025 |
Philip Yang <[email protected]> |
drm/amdgpu: Unlocked unmap only clear page table leaves
SVM migration unmap pages from GPU and then update mapping to GPU to recover page fault. Currently unmap clears the PDE entry for range length
drm/amdgpu: Unlocked unmap only clear page table leaves
SVM migration unmap pages from GPU and then update mapping to GPU to recover page fault. Currently unmap clears the PDE entry for range length >= huge page and free PTB bo, update mapping to alloc new PT bo. There is race bug that the freed entry bo maybe still on the pt_free list, reused when updating mapping and then freed, leave invalid PDE entry and cause GPU page fault.
By setting the update to clear only one PDE entry or clear PTB, to avoid unmap to free PTE bo. This fixes the race bug and improve the unmap and map to GPU performance. Update mapping to huge page will still free the PTB bo.
With this change, the vm->pt_freed list and work is not needed. Add WARN_ON(unlocked) in amdgpu_vm_pt_free_dfs to catch if unmap to free the PTB.
Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
74ef9527 |
| 19-Dec-2024 |
Yunxiang Li <[email protected]> |
drm/amdgpu: track bo memory stats at runtime
Before, every time fdinfo is queried we try to lock all the BOs in the VM and calculate memory usage from scratch. This works okay if the fdinfo is rarel
drm/amdgpu: track bo memory stats at runtime
Before, every time fdinfo is queried we try to lock all the BOs in the VM and calculate memory usage from scratch. This works okay if the fdinfo is rarely read and the VMs don't have a ton of BOs. If either of these conditions is not true, we get a massive performance hit.
In this new revision, we track the BOs as they change states. This way when the fdinfo is queried we only need to take the status lock and copy out the usage stats with minimal impact to the runtime performance. With this new approach however, we would no longer be able to track active buffers.
Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Christian König <[email protected]>
show more ...
|
| #
a541a6e8 |
| 19-Dec-2024 |
Yunxiang Li <[email protected]> |
drm/amdgpu: remove unused function parameter
amdgpu_vm_bo_invalidate doesn't use the adev parameter and not all callers have a reference to adev handy, so remove it for cleanliness.
Signed-off-by:
drm/amdgpu: remove unused function parameter
amdgpu_vm_bo_invalidate doesn't use the adev parameter and not all callers have a reference to adev handy, so remove it for cleanliness.
Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Christian König <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5 |
|
| #
fdee0872 |
| 24-Oct-2024 |
Yunxiang Li <[email protected]> |
drm/amdgpu: stop tracking visible memory stats
Since on modern systems all of vram can be made visible anyways, to simplify the new implementation, drops tracking how much memory is visible for now.
drm/amdgpu: stop tracking visible memory stats
Since on modern systems all of vram can be made visible anyways, to simplify the new implementation, drops tracking how much memory is visible for now. If this is really needed we can add it back on top of the new implementation, or just report all the BOs as visible.
Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1 |
|
| #
04bdba46 |
| 20-May-2024 |
Tvrtko Ursulin <[email protected]> |
drm/amdgpu: Use drm_print_memory_stats helper from fdinfo
Convert fdinfo memory stats to use the common drm_print_memory_stats helper.
This achieves alignment with the common keys as documented in
drm/amdgpu: Use drm_print_memory_stats helper from fdinfo
Convert fdinfo memory stats to use the common drm_print_memory_stats helper.
This achieves alignment with the common keys as documented in drm-usage-stats.rst, adding specifically drm-total- key the driver was missing until now.
Additionally I made the code stop skipping total size for objects which currently do not have a backing store, and I added resident, active and purgeable reporting.
Legacy keys have been preserved, with the outlook of only potentially removing only the drm-memory- when the time gets right.
The example output now looks like this:
pos: 0 flags: 02100002 mnt_id: 24 ino: 1239 drm-driver: amdgpu drm-client-id: 4 drm-pdev: 0000:04:00.0 pasid: 32771 drm-total-cpu: 0 drm-shared-cpu: 0 drm-active-cpu: 0 drm-resident-cpu: 0 drm-purgeable-cpu: 0 drm-total-gtt: 2392 KiB drm-shared-gtt: 0 drm-active-gtt: 0 drm-resident-gtt: 2392 KiB drm-purgeable-gtt: 0 drm-total-vram: 44564 KiB drm-shared-vram: 31952 KiB drm-active-vram: 0 drm-resident-vram: 44564 KiB drm-purgeable-vram: 0 drm-memory-vram: 44564 KiB drm-memory-gtt: 2392 KiB drm-memory-cpu: 0 KiB amd-memory-visible-vram: 44564 KiB amd-evicted-vram: 0 KiB amd-evicted-visible-vram: 0 KiB amd-requested-vram: 44564 KiB amd-requested-visible-vram: 11952 KiB amd-requested-gtt: 2392 KiB drm-engine-compute: 46464671 ns
v2: * Track purgeable via AMDGPU_GEM_CREATE_DISCARDABLE.
Acked-by: Daniel Vetter <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Rob Clark <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
4da5a95b |
| 20-Aug-2024 |
Christian König <[email protected]> |
drm/amdgpu: re-work VM syncing
Rework how VM operations synchronize to submissions. Provide an amdgpu_sync container to the backends instead of an reservation object and fill in the amdgpu_sync obje
drm/amdgpu: re-work VM syncing
Rework how VM operations synchronize to submissions. Provide an amdgpu_sync container to the backends instead of an reservation object and fill in the amdgpu_sync object in the higher layers of the code.
No intended functional change, just prepares for upcomming changes.
Signed-off-by: Christian König <[email protected]> Reviewed-by: Friedrich Vock <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
6ef29715 |
| 23-Aug-2024 |
Xiaogang Chen <[email protected]> |
drm/amdkfd: Change kfd/svm page fault drain handling
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and not handle any incoming pages fault of this process until a deferred
drm/amdkfd: Change kfd/svm page fault drain handling
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and not handle any incoming pages fault of this process until a deferred work item got executed by default system wq. The time period of "not handle page fault" can be long and is unpredicable. That is advese to kfd performance on page faults recovery.
This patch uses time stamp of incoming page fault to decide to drop or recover page fault. When app unmap vm ranges kfd records each gpu device's ih ring current time stamp. These time stamps are used at kfd page fault recovery routine.
Any page fault happened on unmapped ranges after unmap events is application bug that accesses vm range after unmap. It is not driver work to cover that.
By using time stamp of page fault do not need drain page faults at deferred work. So, the time period that kfd does not handle page faults is reduced and can be controlled.
Signed-off-by: Xiaogang.Chen <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4 |
|
| #
6b83b94a |
| 06-Feb-2024 |
Alex Deucher <[email protected]> |
drm/amdgpu: add additional VM bits
Add additional VM PTE bits.
Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
|
| #
45bd39fb |
| 29-May-2024 |
Shane Xiao <[email protected]> |
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10, clear the bits before setting the new one.
Suggested-by: Alex Deucher
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10, clear the bits before setting the new one.
Suggested-by: Alex Deucher <[email protected]> Signed-off-by: longlyao <[email protected]> Signed-off-by: Shane Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
39282901 |
| 29-May-2024 |
Shane Xiao <[email protected]> |
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_NV10
This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10, clear the bits before setting the new one.
Suggested-by: Alex Deucher
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_NV10
This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10, clear the bits before setting the new one.
Suggested-by: Alex Deucher <[email protected]> Signed-off-by: longlyao <[email protected]> Signed-off-by: Shane Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
eba791dc |
| 29-May-2024 |
Shane Xiao <[email protected]> |
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_GFX12
This patch changes the implementation of AMDGPU_PTE_MTYPE_GFX12, clear the bits before setting the new one. This fixed the potential i
drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_GFX12
This patch changes the implementation of AMDGPU_PTE_MTYPE_GFX12, clear the bits before setting the new one. This fixed the potential issue that GFX12 setting memory to NC.
v2: Clear mtype field before setting the new one (Alex) v3: Fix typo (Felix)
Suggested-by: Alex Deucher <[email protected]> Signed-off-by: longlyao <[email protected]> Signed-off-by: Shane Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
26e20235 |
| 06-May-2024 |
Tvrtko Ursulin <[email protected]> |
drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
Help code readability by replacing a bunch of:
bo->tbo.base.resv == vm->root.bo->tbo.base.resv
With:
amdgpu_vm_is_bo_always_valid(vm, bo)
No functional
drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
Help code readability by replacing a bunch of:
bo->tbo.base.resv == vm->root.bo->tbo.base.resv
With:
amdgpu_vm_is_bo_always_valid(vm, bo)
No functional changes.
v2: * Rename helper and move to amdgpu_vm. (Christian)
v3: * Use Christian's kerneldoc.
v4: * Fixed logic inversion in amdgpu_vm_bo_get_memory.
Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2 |
|
| #
980a0a94 |
| 08-Mar-2023 |
Hawking Zhang <[email protected]> |
drm/amdgpu: support gfx v12 specific pte/pde fields
Add gfx v12 pte/pde support to gmc common helper.
v2: squash in fixes (Alex)
Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: L
drm/amdgpu: support gfx v12 specific pte/pde fields
Add gfx v12 pte/pde support to gmc common helper.
v2: squash in fixes (Alex)
Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
2d1d8756 |
| 08-Mar-2023 |
Hawking Zhang <[email protected]> |
drm/amdgpu: Add gfx v12 pte/pde format change
Add gfx v12 pte/pde format change.
Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex
drm/amdgpu: Add gfx v12 pte/pde format change
Add gfx v12 pte/pde format change.
Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
b6c4f90b |
| 18-Mar-2024 |
Shashank Sharma <[email protected]> |
drm/amdgpu: sync page table freeing with tlb flush
The idea behind this patch is to delay the freeing of PT entry objects until the TLB flush is done.
This patch: - Adds a tlb_flush_waitlist in amd
drm/amdgpu: sync page table freeing with tlb flush
The idea behind this patch is to delay the freeing of PT entry objects until the TLB flush is done.
This patch: - Adds a tlb_flush_waitlist in amdgpu_vm_update_params which will keep the objects that need to be freed after tlb_flush. - Adds PT entries in this list in amdgpu_vm_ptes_update after finding the PT entry. - Changes functionality of amdgpu_vm_pt_free_dfs from (df_search + free) to simply freeing of the BOs, also renames it to amdgpu_vm_pt_free_list to reflect this same. - Exports function amdgpu_vm_pt_free_list to be called directly. - Calls amdgpu_vm_pt_free_list directly from amdgpu_vm_update_range.
V2: rebase V4: Addressed review comments from Christian - add only locked PTEs entries in TLB flush waitlist. - do not create a separate function for list flush. - do not create a new lock for TLB flush. - there is no need to wait on tlb_flush_fence exclusively.
V5: Addressed review comments from Christian - change the amdgpu_vm_pt_free_dfs's functionality to simple freeing of the objects and rename it. - add all the PTE objects in params->tlb_flush_waitlist - let amdgpu_vm_pt_free_root handle the freeing of BOs independently - call amdgpu_vm_pt_free_list directly
V6: Rebase V7: Rebase V8: Added a NULL check to fix this backtrace issue: [ 415.351447] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 415.359245] #PF: supervisor write access in kernel mode [ 415.365081] #PF: error_code(0x0002) - not-present page [ 415.370817] PGD 101259067 P4D 101259067 PUD 10125a067 PMD 0 [ 415.377140] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 415.382004] CPU: 0 PID: 25481 Comm: test_with_MPI.e Tainted: G OE 5.18.2-mi300-build-140423-ubuntu-22.04+ #24 [ 415.394437] Hardware name: AMD Corporation Sh51p/Sh51p, BIOS RMO1001AS 02/21/2024 [ 415.402797] RIP: 0010:amdgpu_vm_ptes_update+0x6fd/0xa10 [amdgpu] [ 415.409648] Code: 4c 89 ff 4d 8d 66 30 e8 f1 ed ff ff 48 85 db 74 42 48 39 5d a0 74 40 48 8b 53 20 48 8b 4b 18 48 8d 43 18 48 8d 75 b0 4c 89 ff <48 > 89 51 08 48 89 0a 49 8b 56 30 48 89 42 08 48 89 53 18 4c 89 63 [ 415.430621] RSP: 0018:ffffc9000401f990 EFLAGS: 00010287 [ 415.436456] RAX: ffff888147bb82f0 RBX: ffff888147bb82d8 RCX: 0000000000000000 [ 415.444426] RDX: 0000000000000000 RSI: ffffc9000401fa30 RDI: ffff888161f80000 [ 415.452397] RBP: ffffc9000401fa80 R08: 0000000000000000 R09: ffffc9000401fa00 [ 415.460368] R10: 00000007f0cc0000 R11: 00000007f0c85000 R12: ffffc9000401fb20 [ 415.468340] R13: 00000007f0d00000 R14: ffffc9000401faf0 R15: ffff888161f80000 [ 415.476312] FS: 00007f132ff89840(0000) GS:ffff889f87c00000(0000) knlGS:0000000000000000 [ 415.485350] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 415.491767] CR2: 0000000000000008 CR3: 0000000161d46003 CR4: 0000000000770ef0 [ 415.499738] PKRU: 55555554 [ 415.502750] Call Trace: [ 415.505482] <TASK> [ 415.507825] amdgpu_vm_update_range+0x32a/0x880 [amdgpu] [ 415.513869] amdgpu_vm_clear_freed+0x117/0x250 [amdgpu] [ 415.519814] amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu+0x18c/0x250 [amdgpu] [ 415.527729] kfd_ioctl_unmap_memory_from_gpu+0xed/0x340 [amdgpu] [ 415.534551] kfd_ioctl+0x3b6/0x510 [amdgpu]
V9: Addressed review comments from Christian - No NULL check reqd for root PT freeing - Free PT list regardless of needs_flush - Move adding BOs in list in a separate function
V10: Added Christian's RB V11: squash in list fix
Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Rajneesh Bhardwaj <[email protected]> Acked-by: Felix Kuehling <[email protected]> Acked-by: Rajneesh Bhardwaj <[email protected]> Reviewed-by: Christian König <[email protected]> Tested-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
d8a3f0a0 |
| 18-Mar-2024 |
Christian Koenig <[email protected]> |
drm/amdgpu: implement TLB flush fence
The problem is that when (for example) 4k pages are replaced with a single 2M page we need to wait for change to be flushed out by invalidating the TLB before t
drm/amdgpu: implement TLB flush fence
The problem is that when (for example) 4k pages are replaced with a single 2M page we need to wait for change to be flushed out by invalidating the TLB before the PT can be freed.
Solve this by moving the TLB flush into a DMA-fence object which can be used to delay the freeing of the PT BOs until it is signaled.
V2: (Shashank) - rebase - set dma_fence_error only in case of error - add tlb_flush fence only when PT/PD BO is locked (Felix) - use vm->pasid when f is NULL (Mukul)
V4: - add a wait for (f->dependency) in tlb_fence_work (Christian) - move the misplaced fence_create call to the end (Philip)
V5: - free the f->dependency properly
V6: (Shashank) - light code movement, moved all the clean-up in previous patch - introduce params.needs_flush and its usage in this patch - rebase without TLB HW sequence patch
V7: - Keep the vm->last_update_fence and tlb_cb code until we can fix the HW sequencing (Christian) - Move all the tlb_fence related code in a separate function so that its easier to read and review
V9: Addressed review comments from Christian - start PT update only when we have callback memory allocated
V10: - handle device unlock in OOM case (Christian, Mukul) - added Christian's R-B
Cc: Christian Koenig <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Rajneesh Bhardwaj <[email protected]> Cc: Alex Deucher <[email protected]> Acked-by: Felix Kuehling <[email protected]> Acked-by: Rajneesh Bhardwaj <[email protected]> Tested-by: Rajneesh Bhardwaj <[email protected]> Reviewed-by: Shashank Sharma <[email protected]> Reviewed-by: Christian Koenig <[email protected]> Signed-off-by: Christian Koenig <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
dc406d92 |
| 07-Mar-2024 |
Sunil Khatri <[email protected]> |
drm/amdgpu: add recent pagefault info in vm_manager
Currently page fault information is stored per vm and which could be freed or stale during reset. Add it pagefault information in the vm_manager w
drm/amdgpu: add recent pagefault info in vm_manager
Currently page fault information is stored per vm and which could be freed or stale during reset. Add it pagefault information in the vm_manager which is a global space for vm's and remains valid across.
Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
bb8863cc |
| 05-Mar-2024 |
Jesse Zhang <[email protected]> |
drm/amdgpu: remove unused code
Remove the unused function - amdgpu_vm_pt_is_root_clean and remove the impossible condition
v1: entries == 0 is not possible any more, so this condition could pro
drm/amdgpu: remove unused code
Remove the unused function - amdgpu_vm_pt_is_root_clean and remove the impossible condition
v1: entries == 0 is not possible any more, so this condition could probably be removed (Felix)
Signed-off-by: Jesse Zhang <[email protected]> Suggested-by:Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
b8f67b9d |
| 18-Jan-2024 |
Shashank Sharma <[email protected]> |
drm/amdgpu: change vm->task_info handling
This patch changes the handling and lifecycle of vm->task_info object. The major changes are: - vm->task_info is a dynamically allocated ptr now, and its ua
drm/amdgpu: change vm->task_info handling
This patch changes the handling and lifecycle of vm->task_info object. The major changes are: - vm->task_info is a dynamically allocated ptr now, and its uasge is reference counted. - introducing two new helper funcs for task_info lifecycle management - amdgpu_vm_get_task_info: reference counts up task_info before returning this info - amdgpu_vm_put_task_info: reference counts down task_info - last put to task_info() frees task_info from the vm.
This patch also does logistical changes required for existing usage of vm->task_info.
V2: Do not block all the prints when task_info not found (Felix)
V3: Fixed review comments from Felix - Fix wrong indentation - No debug message for -ENOMEM - Add NULL check for task_info - Do not duplicate the debug messages (ti vs no ti) - Get first reference of task_info in vm_init(), put last in vm_fini()
V4: Fixed review comments from Felix - fix double reference increment in create_task_info - change amdgpu_vm_get_task_info_pasid - additional changes in amdgpu_gem.c while porting
Cc: Christian Koenig <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Felix Kuehling <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
feb13f52 |
| 29-Feb-2024 |
Jesse Zhang <[email protected]> |
Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute" for Raven
fix the issue: "amdgpu: Failed to create process VM object".
[Why]when amdgpu initialized, seq64 do mampping and up
Revert "drm/amdgpu: remove vm sanity check from amdgpu_vm_make_compute" for Raven
fix the issue: "amdgpu: Failed to create process VM object".
[Why]when amdgpu initialized, seq64 do mampping and update bo mapping in vm page table. But when clifo run. It also initializes a vm for a process device through the function kfd_process_device_init_vm and ensure the root PD is clean through the function amdgpu_vm_pt_is_root_clean. So they have a conflict, and clinfo always failed.
v1: - remove all the pte_supports_ats stuff from the amdgpu_vm code (Felix)
Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
34a1de0f |
| 25-Jan-2024 |
Felix Kuehling <[email protected]> |
drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole
The TBA and TMA, along with an unused IB allocation, reside at low addresses in the VM address space. A stray VM fault which hits these pages
drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole
The TBA and TMA, along with an unused IB allocation, reside at low addresses in the VM address space. A stray VM fault which hits these pages must be serviced by making their page table entries invalid. The scheduler depends upon these pages being resident and fails, preventing a debugger from inspecting the failure state.
By relocating these pages above 47 bits in the VM address space they can only be reached when bits [63:48] are set to 1. This makes it much less likely for a misbehaving program to generate accesses to them. The current placement at VA (PAGE_SIZE*2) is readily hit by a NULL access with a small offset.
v2: - Move it to the reserved space to avoid concflicts with Mesa - Add macros to make reserved space management easier
v3: - Move VM max PFN calculation into AMDGPU_VA_RESERVED macros
Cc: Arunpravin Paneer Selvam <[email protected]> Cc: Christian Koenig <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Jay Cornwall <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
efe0f34c |
| 30-Jan-2024 |
Felix Kuehling <[email protected]> |
drm/amdgpu: Reduce VA_RESERVED_BOTTOM to 64KB
The reservation is there to catch NULL pointer dereferences from the GPU. Reduce the size to 64KB to make sure that shared virtual address programming m
drm/amdgpu: Reduce VA_RESERVED_BOTTOM to 64KB
The reservation is there to catch NULL pointer dereferences from the GPU. Reduce the size to 64KB to make sure that shared virtual address programming models can map all CPU-accessible virtual addresses for GPU access. This is also the default for CPU virtual address mappings as seen in /proc/sys/vm/mmap_min_addr.
Reviewed-by: Christian König <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|
| #
00a11f97 |
| 12-Jan-2024 |
Arunpravin Paneer Selvam <[email protected]> |
drm/amdgpu: Enable seq64 manager and fix bugs
- Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs.
v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SI
drm/amdgpu: Enable seq64 manager and fix bugs
- Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs.
v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE otherwise the areas will conflict with user space allocations (Alex)
- It needs to be mapped read only in the user VM (Alex)
v2: - Instead of just one define for TOP/BOTTOM reserved space separate them into two (Christian)
- Fix the CPU and VA calculations and while at it also cleanup error handling and kerneldoc (Christian)
Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]>
show more ...
|
| #
50661eb1 |
| 03-Jan-2024 |
Felix Kuehling <[email protected]> |
drm/amdgpu: Auto-validate DMABuf imports in compute VMs
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to valid
drm/amdgpu: Auto-validate DMABuf imports in compute VMs
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to validate them or add eviction fences to them.
This patch automatically validates and fences dymanic DMABuf imports when they are added to a compute VM. Revalidation after evictions is handled in the VM code.
v2: * Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate * Eliminated evicted_user state, use evicted state for VM BOs and user BOs * Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs * Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into amdgpu_vm_validate, outside the vm->status_lock * Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds without KFD
v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the reservation with its fences is shared with the export, as long as all imports are from KFD, with the exports already reserved, validated and fenced by the KFD restore worker.
v5: Reintroduced separate evicted_user state to simplify the state machine and CS error handling when amdgpu_vm_validate is called without a ticket.
Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
show more ...
|