|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4 |
|
| #
046eda65 |
| 20-Feb-2025 |
Shuicheng Lin <[email protected]> |
drm/xe/devcoredump: Remove IS_ERR_OR_NULL check for kzalloc
kzalloc returns a valid pointer or NULL if the allocation fails. It never returns an error pointer. It is better to check for NULL directly.
Signed-off-by: Shuicheng Lin <[email protected]> Cc: John Harrison <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Tejas Upadhyay <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
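The reasoning in this commit can be illustrated with a small userspace sketch. It is not kernel code: `zalloc_model()` is a hypothetical stand-in for kzalloc() built on calloc(), and `is_err_model()` / `is_err_or_null_model()` only imitate the kernel's error-pointer convention (the top 4095 addresses carry negative errno values). Under those assumptions, an allocator that can only return a valid pointer or NULL needs nothing more than a NULL check.

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace model of the error-pointer convention: the highest
 * MAX_ERRNO_MODEL addresses are reserved for negative errno values.
 * All names here are hypothetical, not the kernel's macros. */
#define MAX_ERRNO_MODEL 4095

static int is_err_model(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_MODEL;
}

static int is_err_or_null_model(const void *ptr)
{
	return ptr == NULL || is_err_model(ptr);
}

/* Hypothetical stand-in for kzalloc(): zeroed memory or NULL,
 * never an error pointer. */
static void *zalloc_model(size_t size)
{
	return calloc(1, size);
}

/* Because the allocator never returns an error pointer, the plain
 * NULL check below covers every failure case. */
static int alloc_snapshot_model(size_t size, void **out)
{
	void *buf = zalloc_model(size);

	if (!buf)
		return -12; /* -ENOMEM */

	*out = buf;
	return 0;
}
```

The combined check is still correct for such an allocator, just broader than needed; the direct NULL test states the actual contract.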
|
| #
c504ad91 |
| 20-Feb-2025 |
Shuicheng Lin <[email protected]> |
drm/xe/devcoredump: Fix print typo of offset
The log should print with "offset" instead of "size". Correct the typo in the comment.
v2: split kzalloc change and add typo fix in commit message (Lucas)
Signed-off-by: Shuicheng Lin <[email protected]> Cc: John Harrison <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Tejas Upadhyay <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
|
|
Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1 |
|
| #
a9ab6591 |
| 23-Jan-2025 |
Lucas De Marchi <[email protected]> |
drm/xe: Fix and re-enable xe_print_blob_ascii85()
Commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") partially reverted some changes to work around breakage caused to mesa tools. However, in doing so it also broke fetching the GuC log via debugfs, since xe_print_blob_ascii85() simply bails out.
The fix is to avoid the extra newlines: the devcoredump interface is line-oriented and adding random newlines in the middle breaks it. If a tool is able to parse it by looking at the data and checking for chars that are out of the ascii85 space, it can still do so. A format change that breaks the line-oriented output on devcoredump however needs better coordination with existing tools.
v2: Add suffix description comment
v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts() in a loop
Reviewed-by: José Roberto de Souza <[email protected]> Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: [email protected] Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function") Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit 2c95bbf5002776117a69caed3b31c10bf7341bec) Signed-off-by: Rodrigo Vivi <[email protected]>
|
| #
042c48b7 |
| 23-Jan-2025 |
Lucas De Marchi <[email protected]> |
drm/xe/devcoredump: Move exec queue snapshot to Contexts section
Having the exec queue snapshot inside a "GuC CT" section was always wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section headings and add tile info") tried to fix that bug, but with that also broke the mesa tool that parses the devcoredump, hence it was reverted in commit a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa debug tool").
With the mesa tool also fixed, this can propagate as a fix on both kernel and userspace side to avoid unnecessary headache for a debug feature.
Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: [email protected] Fixes: a53da2fb25a3 ("drm/xe: Revert some changes that break a mesa debug tool") Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> (cherry picked from commit a37934ea75d331fafa7fe80b6180642ba5193422) Signed-off-by: Rodrigo Vivi <[email protected]>
|
| #
2c95bbf5 |
| 23-Jan-2025 |
Lucas De Marchi <[email protected]> |
drm/xe: Fix and re-enable xe_print_blob_ascii85()
Commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") partially reverted some changes to work around breakage caused to mesa tools. However, in doing so it also broke fetching the GuC log via debugfs, since xe_print_blob_ascii85() simply bails out.
The fix is to avoid the extra newlines: the devcoredump interface is line-oriented and adding random newlines in the middle breaks it. If a tool is able to parse it by looking at the data and checking for chars that are out of the ascii85 space, it can still do so. A format change that breaks the line-oriented output on devcoredump however needs better coordination with existing tools.
v2: Add suffix description comment
v3: Reword explanation of xe_print_blob_ascii85() calling drm_puts() in a loop
Reviewed-by: José Roberto de Souza <[email protected]> Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: [email protected] Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function") Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
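The commit message notes that a tool can still find the encoded data by checking for characters outside the ASCII85 space. A minimal sketch of that check follows, assuming the usual ASCII85 alphabet ('!' through 'u', plus 'z' as shorthand for an all-zero word) and assuming the caller has already skipped any field prefix on the line. The function names are hypothetical and are not taken from the mesa tool or the kernel.

```c
#include <stddef.h>

/* ASCII85 output uses '!'..'u', plus 'z' for an all-zero 32-bit word. */
static int in_ascii85_alphabet(char c)
{
	return (c >= '!' && c <= 'u') || c == 'z';
}

/* Length of the leading run of ASCII85 characters in a payload string.
 * Anything outside the alphabet (newline, space, NUL, ...) ends the run. */
static size_t ascii85_run_length(const char *payload)
{
	size_t n = 0;

	while (payload[n] != '\0' && in_ascii85_alphabet(payload[n]))
		n++;

	return n;
}
```

Because a newline is outside the alphabet, a parser built this way treats the end of the line as the end of the blob, which is why stray newlines in the middle of the data would break a line-oriented reader.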
|
| #
a37934ea |
| 23-Jan-2025 |
Lucas De Marchi <[email protected]> |
drm/xe/devcoredump: Move exec queue snapshot to Contexts section
Having the exec queue snapshot inside a "GuC CT" section was always wrong. Commit c28fd6c358db ("drm/xe/devcoredump: Improve section headings and add tile info") tried to fix that bug, but with that also broke the mesa tool that parses the devcoredump, hence it was reverted in commit 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool").
With the mesa tool also fixed, this can propagate as a fix on both kernel and userspace side to avoid unnecessary headache for a debug feature.
Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: [email protected] Fixes: 70fb86a85dc9 ("drm/xe: Revert some changes that break a mesa debug tool") Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
|
|
Revision tags: v6.13, v6.13-rc7 |
|
| #
75fd04f2 |
| 06-Jan-2025 |
Nitin Gote <[email protected]> |
drm/xe: Fix all typos in xe
Fix all typos in files of xe, reported by codespell tool.
Signed-off-by: Nitin Gote <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Reviewed-by: Stuart Summers <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3 |
|
| #
a53da2fb |
| 13-Dec-2024 |
John Harrison <[email protected]> |
drm/xe: Revert some changes that break a mesa debug tool
There is a mesa debug tool for decoding devcoredump files. Recent changes to improve the devcoredump output broke that tool. So revert the changes until the tool can be extended to support the new fields.
Signed-off-by: John Harrison <[email protected]> Fixes: c28fd6c358db ("drm/xe/devcoredump: Improve section headings and add tile info") Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function") Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: [email protected] Reviewed-by: Jonathan Cavitt <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]> (cherry picked from commit 70fb86a85dc9fd66014d7eb2fe356f50702ceeb6) Signed-off-by: Thomas Hellström <[email protected]>
|
| #
70fb86a8 |
| 13-Dec-2024 |
John Harrison <[email protected]> |
drm/xe: Revert some changes that break a mesa debug tool
There is a mesa debug tool for decoding devcoredump files. Recent changes to improve the devcoredump output broke that tool. So revert the changes until the tool can be extended to support the new fields.
Signed-off-by: John Harrison <[email protected]> Fixes: c28fd6c358db ("drm/xe/devcoredump: Improve section headings and add tile info") Fixes: ec1455ce7e35 ("drm/xe/devcoredump: Add ASCII85 dump helper function") Cc: John Harrison <[email protected]> Cc: Julia Filipchuk <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: [email protected] Reviewed-by: Jonathan Cavitt <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Revision tags: v6.13-rc2, v6.13-rc1 |
|
| #
5dce85fe |
| 28-Nov-2024 |
John Harrison <[email protected]> |
drm/xe: Move the coredump registration to the worker thread
Adding lockdep checking to the coredump code showed that there was an existing violation. The dev_coredumpm_timeout() call is used to register the dump with the base coredump subsystem. However, that makes multiple memory allocations, only some of which use the GFP_ flags passed in. So that also needs to be deferred to the worker function where it is safe to allocate with arbitrary flags.
To avoid adding prototypes for the callback functions, moving the _timeout call also means moving the worker thread function to later in the file.
v2: Rebased after other changes to the worker function.
Fixes: e799485044cb ("drm/xe: Introduce the dev_coredump infrastructure.") Cc: Thomas Hellström <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Francois Dugast <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: "Thomas Hellström" <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: <[email protected]> # v6.8+ Signed-off-by: John Harrison <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 90f51a7f4ec1004fc4ddfbc6d1f1068d85ef4771) Signed-off-by: Thomas Hellström <[email protected]>
|
| #
906c4b30 |
| 28-Nov-2024 |
John Harrison <[email protected]> |
drm/xe: Add mutex locking to devcoredump
There are now multiple places that can trigger a coredump. Some of which can happen in parallel. There is already a check against capturing multiple dumps sequentially, but without locking it doesn't guarantee to work against concurrent dumps. And if two dumps do happen in parallel, they can end up doing Bad Things such as one call stack freeing the data the other call stack is still processing. Which leads to a crashed kernel.
Further, it is possible for the DRM timeout to expire and trigger a free of the capture while a user is still reading that capture out through sysfs. Again leading to dodgy pointer problems.
So, add a mutex lock around the capture, read and free functions to prevent interference.
v2: Swap tiny scope spin_lock for larger scope mutex and fix kernel-doc comment (review feedback from Matthew Brost)
v3: Move mutex locks to exclude worker thread and add reclaim annotation (review feedback from Matthew Brost)
v4: Fix typo.
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
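The locking pattern described above can be sketched in userspace with a pthread mutex. This is only an illustration of the idea (one lock shared by capture, read and free, plus an "already captured" flag checked under that lock); the structure and function names are hypothetical and do not reflect the driver's actual data structures.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a single-slot coredump guarded by one mutex. */
struct coredump_model {
	pthread_mutex_t lock;
	int captured;
	char *data;
	size_t len;
};

static void coredump_model_init(struct coredump_model *cd)
{
	pthread_mutex_init(&cd->lock, NULL);
	cd->captured = 0;
	cd->data = NULL;
	cd->len = 0;
}

/* Returns 1 if this call took the capture, 0 if one already exists
 * (or the copy could not be allocated). */
static int coredump_model_capture(struct coredump_model *cd, const char *src)
{
	int taken = 0;

	pthread_mutex_lock(&cd->lock);
	if (!cd->captured) {
		size_t len = strlen(src);
		char *copy = malloc(len + 1);

		if (copy) {
			memcpy(copy, src, len + 1);
			cd->data = copy;
			cd->len = len;
			cd->captured = 1;
			taken = 1;
		}
	}
	pthread_mutex_unlock(&cd->lock);

	return taken;
}

/* Copies up to count bytes out; returns bytes copied or -1 if no capture.
 * Holding the lock keeps a concurrent free from pulling the data away. */
static long coredump_model_read(struct coredump_model *cd, char *buf, size_t count)
{
	long ret = -1;

	pthread_mutex_lock(&cd->lock);
	if (cd->captured) {
		size_t n = count < cd->len ? count : cd->len;

		memcpy(buf, cd->data, n);
		ret = (long)n;
	}
	pthread_mutex_unlock(&cd->lock);

	return ret;
}

static void coredump_model_free(struct coredump_model *cd)
{
	pthread_mutex_lock(&cd->lock);
	free(cd->data);
	cd->data = NULL;
	cd->len = 0;
	cd->captured = 0;
	pthread_mutex_unlock(&cd->lock);
}
```

A second capture attempt while one is pending is simply rejected, and a read that races with a free either sees the whole capture or none of it.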
|
| #
90f51a7f |
| 28-Nov-2024 |
John Harrison <[email protected]> |
drm/xe: Move the coredump registration to the worker thread
Adding lockdep checking to the coredump code showed that there was an existing violation. The dev_coredumpm_timeout() call is used to register the dump with the base coredump subsystem. However, that makes multiple memory allocations, only some of which use the GFP_ flags passed in. So that also needs to be deferred to the worker function where it is safe to allocate with arbitrary flags.
To avoid adding prototypes for the callback functions, moving the _timeout call also means moving the worker thread function to later in the file.
v2: Rebased after other changes to the worker function.
Fixes: e799485044cb ("drm/xe: Introduce the dev_coredump infrastructure.") Cc: Thomas Hellström <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Francois Dugast <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: "Thomas Hellström" <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: <[email protected]> # v6.8+ Signed-off-by: John Harrison <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
429915ac |
| 28-Nov-2024 |
John Harrison <[email protected]> |
drm/xe: Add a reason string to the devcoredump
There are debug level prints giving more information about the cause of the hang immediately before core dumps are created. However, not everyone has debug level prints enabled or saves the dmesg log at all. So include that information in the dump file itself. Also, at least one of those prints included the pid as well as the process name. So include that in the capture too.
v2: Fix kvfree vs kfree and missing kernel-doc (review feedback from Matthew Brost)
v3: Use GFP_ATOMIC instead of GFP_KERNEL.
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
aef0b4a0 |
| 26-Nov-2024 |
Matthew Brost <[email protected]> |
drm/xe: Take PM ref in delayed snapshot capture worker
The delayed snapshot capture worker can access the GPU or VRAM both of which require a PM reference. Take a reference in this worker.
Cc: Rodrigo Vivi <[email protected]> Cc: Maarten Lankhorst <[email protected]> Fixes: 4f04d07c0a94 ("drm/xe: Faster devcoredump") Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 1c6878af115a4586a40d6c14d530fa9f93e0bd83) Signed-off-by: Thomas Hellström <[email protected]>
|
| #
1c6878af |
| 26-Nov-2024 |
Matthew Brost <[email protected]> |
drm/xe: Take PM ref in delayed snapshot capture worker
The delayed snapshot capture worker can access the GPU or VRAM both of which require a PM reference. Take a reference in this worker.
Cc: Rodrigo Vivi <[email protected]> Cc: Maarten Lankhorst <[email protected]> Fixes: 4f04d07c0a94 ("drm/xe: Faster devcoredump") Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Revision tags: v6.12 |
|
| #
a54b0de7 |
| 14-Nov-2024 |
Matthew Brost <[email protected]> |
drm/xe: Change xe_engine_snapshot_capture_for_job to be for queue
During capture time, the target job may be unavailable (e.g., if it's in LR mode). However, the associated exec queue will be available regardless, so change xe_engine_snapshot_capture_for_job to take a queue argument and rename it to xe_engine_snapshot_capture_for_queue.
v2:
- Reword commit message (Jonathan)
- Remove redundant queue check (Zhanjun)
- Remove devcoredump job member (Zhanjun)
Cc: Zhanjun Dong <[email protected]> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
f62e6edf |
| 14-Nov-2024 |
Matthew Brost <[email protected]> |
drm/xe: Add exec queue param to devcoredump
During capture time, the target job may be unavailable (e.g., if it's in LR mode). However, the associated exec queue will be available regardless, so add an exec queue param for such cases.
v2: - Reword commit message (Jonathan)
Cc: Zhanjun Dong <[email protected]> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Revision tags: v6.12-rc7, v6.12-rc6 |
|
| #
aa06cb83 |
| 02-Nov-2024 |
Lucas De Marchi <[email protected]> |
drm/xe: Improve devcoredump documentation
Let the introduction be useful for both userspace and kernel. Also improve the formatting to wire up later to the documentation build.
Reviewed-by: Raag Jadav <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
|
|
Revision tags: v6.12-rc5 |
|
| #
db38fdb7 |
| 24-Oct-2024 |
John Harrison <[email protected]> |
drm/xe/guc: Separate full CTB content from guc_info debugfs
The guc_info debugfs file is meant to be a quick view of the current software state of the GuC interface. Including the full CTB contents makes the file as a whole much less human readable and is not particularly useful in the general case. So don't pollute the info dump with the full buffers. Instead, move those into a separate debugfs entry that can be read when that information is actually required.
Also, improve the human readability by adding a few extra blank lines to delimit the sections.
v2: Hide the internal capture/print params from external callers that don't need to know (review feedback from Matthew Brost).
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Revision tags: v6.12-rc4 |
|
| #
9ffd6ec2 |
| 14-Oct-2024 |
Himal Prasad Ghimiray <[email protected]> |
drm/xe/devcoredump: Update handling of xe_force_wake_get return
xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Use the helper xe_force_wake_ref_has_domain() to verify whether all domains were initialized. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put().
v3 - return xe_wakeref_t instead of int in xe_force_wake_get()
v5 - return unsigned int for xe_force_wake_get()
v6 - use helper xe_force_wake_ref_has_domain()
v7 - Fix commit message
Cc: Rodrigo Vivi <[email protected]> Cc: Lucas De Marchi <[email protected]> Signed-off-by: Himal Prasad Ghimiray <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Reviewed-by: Badal Nilawar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
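The calling convention described in this commit can be modelled in a few lines of standalone C. This is a simplified sketch, not the driver's implementation: the domain bits, the `fw_model` structure and all `*_model` functions are hypothetical, and the helper here checks that every requested bit is present in the returned mask, which is an assumption about how such a check could work.

```c
#include <stddef.h>

/* Hypothetical domain bits for the model. */
#define FW_GT_MODEL      (1u << 0)
#define FW_RENDER_MODEL  (1u << 1)
#define FW_MEDIA_MODEL   (1u << 2)
#define FW_ALL_MODEL     (FW_GT_MODEL | FW_RENDER_MODEL | FW_MEDIA_MODEL)
#define FW_DOMAIN_COUNT_MODEL 3

struct fw_model {
	unsigned int wakeable;                  /* domains whose wake succeeds */
	int refcount[FW_DOMAIN_COUNT_MODEL];
};

/* Returns the mask of domains whose reference count was incremented.
 * A single failing domain yields 0; a request for all domains can yield
 * a non-zero but incomplete mask. */
static unsigned int fw_get_model(struct fw_model *fw, unsigned int domains)
{
	unsigned int ref = 0;

	for (unsigned int i = 0; i < FW_DOMAIN_COUNT_MODEL; i++) {
		unsigned int bit = 1u << i;

		if ((domains & bit) && (fw->wakeable & bit)) {
			fw->refcount[i]++;
			ref |= bit;
		}
	}

	return ref;
}

/* True only if every bit of 'domain' is present in the returned mask. */
static int fw_ref_has_domain_model(unsigned int ref, unsigned int domain)
{
	return (ref & domain) == domain;
}

/* Drops exactly the references recorded in 'ref'. */
static void fw_put_model(struct fw_model *fw, unsigned int ref)
{
	for (unsigned int i = 0; i < FW_DOMAIN_COUNT_MODEL; i++) {
		if (ref & (1u << i))
			fw->refcount[i]--;
	}
}
```

The point of passing the returned mask back to the put call is that only the domains that were actually woken get released, even when the "all domains" request partially failed.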
|
|
Revision tags: v6.12-rc3, v6.12-rc2 |
|
| #
0f1fdf55 |
| 04-Oct-2024 |
Zhanjun Dong <[email protected]> |
drm/xe/guc: Save manual engine capture into capture list
Save manual engine capture into capture list. This removes duplicate register definitions across manual-capture vs guc-err-capture.
Signed-off-by: Zhanjun Dong <[email protected]> Reviewed-by: Alan Previn <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
ecb63364 |
| 04-Oct-2024 |
Zhanjun Dong <[email protected]> |
drm/xe/guc: Plumb GuC-capture into dev coredump
When we decide to kill a job (from guc_exec_queue_timedout_job), we could end up with 4 possible scenarios at the starting point of this decision:
1. the guc-captured register-dump is already there.
2. the driver is wedged.mode > 1, so GuC-engine-reset / GuC-err-capture will not happen.
3. the user has started the driver in execlist-submission mode.
4. the guc-captured register-dump is not ready yet so we force GuC to kill that context now, but:
   A. we don't know yet if GuC will be successful on the engine-reset and get the guc-err-capture, else kmd will do a manual reset later OR
   B. guc will be successful and we will get a guc-err-capture shortly.
So to accommodate scenarios 2 and 4A, we will need to do a manual KMD capture first (which is not reliable in guc-submission mode) and decide later if we need to use that for the cases of 2 or 4A. So this flow is part of the implementation for this patch.
Provide xe_guc_capture_get_reg_desc_list to get the register descriptor list. Add manual capture by reading from the hw engine if GuC capture is not ready. If it becomes ready at a later time, GuC sourced data will be used.
Although there may only be a small delay between (1) the check for whether guc-err-capture is available at the start of guc_exec_queue_timedout_job and (2) the decision on using a valid guc-err-capture or manual-capture, let's not take any chances and lock the matching node down so it doesn't get reclaimed if the GuC-Err-Capture subsystem is running out of pre-cached nodes.
Signed-off-by: Zhanjun Dong <[email protected]> Reviewed-by: Alan Previn <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
8b7dfb98 |
| 03-Oct-2024 |
John Harrison <[email protected]> |
drm/xe/guc: Add GuC log to devcoredump captures
Include the GuC log in devcoredump captures because it can be useful when debugging certain types of bugs.
v2: Fix kerneldoc
v3: Drop module parameter as now using more compact ascii85 encoding rather than hexdump (although still not compressed) (review feedback from Matthew B). Rebase onto recent refactoring of devcoredump code.
v4: Don't move the submission snapshot inside the GuC internals structure 'cos it really doesn't belong there.
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Julia Filipchuk <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
| #
ec1455ce |
| 03-Oct-2024 |
John Harrison <[email protected]> |
drm/xe/devcoredump: Add ASCII85 dump helper function
There is a need to include the GuC log and other large binary objects in core dumps and via dmesg. So add a helper for dumping to a printer function via conversion to ASCII85 encoding.
Another issue with dumping such a large buffer is that it can be slow, especially if dumping to dmesg over a serial port. So add a yield to prevent the 'task has been stuck for 120s' kernel hang check feature from firing.
v2: Add a prefix to the output string. Fix memory allocation bug.
v3: Correct a string size calculation and clean up a define (review feedback from Julia F).
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Julia Filipchuk <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
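To make the encoding concrete, here is a minimal userspace sketch of converting 32-bit words to ASCII85 on a single output line. It follows the common convention of five characters in '!'..'u' per word, with 'z' as shorthand for an all-zero word. The function names and the buffer handling are assumptions for illustration; this is not the helper added by the commit, and it does not model the prefix or the yield mentioned above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one 32-bit word into out (NUL-terminated).
 * Zero becomes "z"; any other value becomes five base-85 digits,
 * most significant first. Returns the number of characters written. */
static size_t a85_encode_word(uint32_t in, char out[6])
{
	if (in == 0) {
		out[0] = 'z';
		out[1] = '\0';
		return 1;
	}

	out[5] = '\0';
	for (int i = 4; i >= 0; i--) {
		out[i] = (char)('!' + in % 85);
		in /= 85;
	}

	return 5;
}

/* Encode count words into dst as one line with no embedded newlines.
 * Returns the encoded length, or 0 if dst is too small. */
static size_t a85_encode_blob(const uint32_t *words, size_t count,
			      char *dst, size_t dst_size)
{
	size_t used = 0;
	char tmp[6];

	if (dst_size == 0)
		return 0;

	for (size_t i = 0; i < count; i++) {
		size_t n = a85_encode_word(words[i], tmp);

		/* Keep one byte free for the terminating NUL. */
		if (used + n + 1 > dst_size)
			return 0;

		memcpy(dst + used, tmp, n);
		used += n;
	}

	dst[used] = '\0';
	return used;
}
```

Five characters per four bytes is a 25% size overhead, compared with the 100% or more of a hexdump, which is the motivation for choosing this encoding for large binary objects.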
|
| #
c28fd6c3 |
| 03-Oct-2024 |
John Harrison <[email protected]> |
drm/xe/devcoredump: Improve section headings and add tile info
The xe_guc_exec_queue_snapshot is not really a GuC internal thing and is definitely not a GuC CT thing. So give it its own section heading. The snapshot itself is really a capture of the submission backend's internal state, although all it currently prints out is the submission contexts. So label it as 'Contexts'. If more general state is added later then it could be changed to 'Submission backend' or some such.
Further, everything from the GuC CT section onwards is GT specific but there was no indication of which GT it was related to (and that is impossible to work out from the other fields that are given). So add a GT section heading. Also include the tile id of the GT, because that is significant information too.
Lastly, drop a couple of unnecessary line feeds within sections.
v2: Add GT section heading, add tile id to device section.
Signed-off-by: John Harrison <[email protected]> Reviewed-by: Julia Filipchuk <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|