History log of /linux-6.15/include/drm/gpu_scheduler.h (Results 1 – 25 of 107)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4
# 27d48151 21-Feb-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Group exported prototypes by object type

Do a bit of housekeeping in gpu_scheduler.h by grouping the API by the type
of object it operates on.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



# 71a18f72 21-Feb-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Move internal prototypes to internal header

Now that we have a header file for internal scheduler interfaces, we can
move some more prototypes into it. Doing so eliminates the chance of
drivers trying to use something which was not intended to be used.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



# 4b7320bf 21-Feb-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Move drm_sched_entity_is_ready to internal header

The helper is for scheduler internal use, so let's hide it from DRM drivers
completely.

At the same time, change the method of checking whether there is
anything in the queue from peeking to looking at the node count.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



# b76f1467 21-Feb-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Remove a hole from struct drm_sched_job

We can re-order some struct members to take the u32 credits out of the
pointer sandwich, and for the last_dependency member we can get away
with an unsigned int, since for dependencies we use xa_limit_32b.

Pahole report before:
/* size: 160, cachelines: 3, members: 14 */
/* sum members: 156, holes: 1, sum holes: 4 */
/* last cacheline: 32 bytes */

And after:
/* size: 152, cachelines: 3, members: 14 */
/* last cacheline: 24 bytes */

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Acked-by: Danilo Krummrich <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
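
To illustrate the kind of hole being removed, here is a minimal sketch,
assuming a hypothetical layout (not the real drm_sched_job); on a 64-bit
kernel a 4-byte member wedged between 8-byte pointers forces padding:

	/* Hypothetical "before": the u32 sits in a pointer sandwich. */
	struct job_before {
		void *fence_a;          /* offset 0,  size 8 */
		u32 credits;            /* offset 8,  size 4, then a 4-byte hole */
		void *fence_b;          /* offset 16, size 8 */
		unsigned int last_dep;  /* offset 24, size 4, plus 4 bytes tail pad */
	};                              /* size 32 */

	/* Hypothetical "after": the two 4-byte members pack together. */
	struct job_after {
		void *fence_a;          /* offset 0,  size 8 */
		void *fence_b;          /* offset 8,  size 8 */
		u32 credits;            /* offset 16, size 4 */
		unsigned int last_dep;  /* offset 20, size 4 */
	};                              /* size 24, no holes */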



Revision tags: v6.14-rc3
# 796a9f55 11-Feb-2025 Philipp Stanner <[email protected]>

drm/sched: Use struct for drm_sched_init() params

drm_sched_init() has a great many parameters and upcoming new
functionality for the scheduler might add even more. Generally, the
great number of parameters reduces readability and has already caused
one misnaming, addressed in:

commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in
nouveau_sched_init()").

Introduce a new struct for the scheduler init parameters and port all
users.

Reviewed-by: Liviu Dudau <[email protected]>
Acked-by: Matthew Brost <[email protected]> # for Xe
Reviewed-by: Boris Brezillon <[email protected]> # for Panfrost and Panthor
Reviewed-by: Christian Gmeiner <[email protected]> # for Etnaviv
Reviewed-by: Frank Binns <[email protected]> # for Imagination
Reviewed-by: Tvrtko Ursulin <[email protected]> # for Sched
Reviewed-by: Maíra Canal <[email protected]> # for v3d
Reviewed-by: Danilo Krummrich <[email protected]>
Reviewed-by: Lizhi Hou <[email protected]> # for amdxdna
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
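
A minimal sketch of the resulting pattern, assuming the struct
drm_sched_init_args introduced by this commit; the driver type, ops,
and values below are hypothetical:

	static int my_sched_setup(struct my_device *mdev)
	{
		const struct drm_sched_init_args args = {
			.ops          = &my_sched_ops,
			.num_rqs      = DRM_SCHED_PRIORITY_COUNT,
			.credit_limit = 64,
			.timeout      = msecs_to_jiffies(500),
			.name         = "my-sched",
			.dev          = mdev->dev,
		};

		/*
		 * Designated initializers name every value at the call site
		 * and zero anything unspecified, avoiding the positional
		 * mix-ups a long parameter list invites.
		 */
		return drm_sched_init(&mdev->sched, &args);
	}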



Revision tags: v6.14-rc2
# 2eca617f 05-Feb-2025 Tvrtko Ursulin <[email protected]>

drm/scheduler: Remove some unused prototypes

As far as I can tell, some of the removed prototypes were introduced by a
probably bad conflict resolution in
fc58764bbf60 ("Merge tag 'amd-drm-next-6.2-2022-11-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next").

Remove them.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.14-rc1, v6.13
# 51678bb9 13-Jan-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Add helper to check job dependencies

Let's isolate scheduler internals from drivers such as pvr, which currently
walks the dependency array to look for fences.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Matt Coster <[email protected]>
Acked-by: Danilo Krummrich <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
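
For context, a sketch of the driver-side walk such a helper replaces; the
loop is illustrative, and the helper name drm_sched_job_has_dependency()
is the one this commit appears to add (treat the exact signature as an
assumption):

	/* Illustrative only: what a driver like pvr had to do itself. */
	static bool job_depends_on(struct drm_sched_job *job,
				   struct dma_fence *fence)
	{
		struct dma_fence *f;
		unsigned long index;

		/*
		 * job->dependencies is the scheduler's internal xarray of
		 * dependency fences; walking it from a driver leaks that
		 * internal layout.
		 */
		xa_for_each(&job->dependencies, index, f)
			if (f == fence)
				return true;

		return false;
	}

	/* With the helper, the driver instead calls:
	 *	drm_sched_job_has_dependency(job, fence);
	 */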



Revision tags: v6.13-rc7
# 573b73e5 10-Jan-2025 Tvrtko Ursulin <[email protected]>

drm/sched: Delete unused update_job_credits

No driver is using the update_job_credits() scheduler vfunc,
so let's remove it.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Acked-by: Danilo Krummrich <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Acked-by: Matt Coster <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5
# 3ae80b37 23-Oct-2024 Philipp Stanner <[email protected]>

drm/sched: warn about drm_sched_job_init()'s partial init

drm_sched_job_init()'s name suggests that after the function succeeded,
parameter "job" will be fully initialized. This is not the case; some
members are only later set, notably drm_sched_job.sched by
drm_sched_job_arm().

Document that drm_sched_job_init() does not set all struct members.

Document the lifetime of drm_sched_job.sched.

Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
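
A minimal sketch of the two-stage setup being documented, assuming a
hypothetical my_job wrapper that embeds struct drm_sched_job as "base":

	static int my_submit(struct my_job *job, struct drm_sched_entity *entity)
	{
		int ret;

		/*
		 * Partial init only: binds the job to the entity and prepares
		 * dependency tracking, but job->base.sched is not yet valid.
		 */
		ret = drm_sched_job_init(&job->base, entity, 1, NULL);
		if (ret)
			return ret;

		/*
		 * Point of no return: arming assigns the scheduler and
		 * initializes the fences, so only now may job->base.sched
		 * be used.
		 */
		drm_sched_job_arm(&job->base);
		drm_sched_entity_push_job(&job->base);
		return 0;
	}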



Revision tags: v6.12-rc4
# 134e71bd 16-Oct-2024 Tvrtko Ursulin <[email protected]>

drm/sched: Further optimise drm_sched_entity_push_job

Having removed one re-lock cycle on the entity->lock in a patch titled
"drm/sched: Optimise drm_sched_entity_push_job", with only a tiny bit
larger refactoring we can do the same optimisation on the rq->lock.
(Currently both drm_sched_rq_add_entity() and
drm_sched_rq_update_fifo_locked() take and release the same lock.)

To achieve this we make drm_sched_rq_update_fifo_locked() and
drm_sched_rq_add_entity() expect the rq->lock to be held.

We also align drm_sched_rq_update_fifo_locked(),
drm_sched_rq_add_entity() and
drm_sched_rq_remove_fifo_locked() function signatures, by adding rq as a
parameter to the latter.

v2:
* Fix after rebase of the series.
* Avoid naming inconsistency between drm_sched_rq_add/remove. (Christian)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Philipp Stanner <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
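
The resulting locking convention, sketched (helper argument order assumed;
treat this as illustrative rather than the exact upstream code):

	spin_lock(&rq->lock);
	drm_sched_rq_add_entity(rq, entity);             /* expects rq->lock held */
	drm_sched_rq_update_fifo_locked(entity, rq, ts); /* likewise */
	spin_unlock(&rq->lock);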



# f93126f5 16-Oct-2024 Tvrtko Ursulin <[email protected]>

drm/sched: Re-group and rename the entity run-queue lock

When writing to a drm_sched_entity's run-queue, writers are protected
through the lock drm_sched_entity.rq_lock. This naming, however,
frequently collides with the separate internal lock of struct
drm_sched_rq, resulting in uses like this:

spin_lock(&entity->rq_lock);
spin_lock(&entity->rq->lock);

Rename drm_sched_entity.rq_lock to improve readability. While at it,
re-order that struct's members to make it more obvious what the lock
protects.

v2:
* Rename some rq_lock straddlers in kerneldoc, improve commit text. (Philipp)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Suggested-by: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Christian König <[email protected]>
[pstanner: Fix typo in docstring]
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
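
After the rename the nested acquisition reads unambiguously (assuming the
new member name is entity->lock):

	spin_lock(&entity->lock);       /* the entity's own lock */
	spin_lock(&entity->rq->lock);   /* struct drm_sched_rq's internal lock */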



# a6f46283 16-Oct-2024 Tvrtko Ursulin <[email protected]>

drm/sched: Re-order struct drm_sched_rq members for clarity

Current kerneldoc for struct drm_sched_rq incompletely documents what
fields are protected by the lock.

This is not good because it is misleading.

Let's fix it by listing all the elements which are protected by the lock.

While at it, let's also re-order the members so that all protected by the lock
are in a single group.

v2:
* Refer variables by kerneldoc syntax, more verbose commit text. (Philipp)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Philipp Stanner <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



# d42a2546 16-Oct-2024 Tvrtko Ursulin <[email protected]>

drm/sched: Optimise drm_sched_entity_push_job

In FIFO mode (which is the default), both drm_sched_entity_push_job() and
drm_sched_rq_update_fifo(), where the latter calls the former, are
currently taking and releasing the same entity->rq_lock.

We can avoid that design inelegance, and also gain a minuscule
efficiency improvement on the submit-from-idle path, by introducing a new
drm_sched_rq_update_fifo_locked() helper and pulling the lock acquisition
up into its callers.

v2:
* Remove drm_sched_rq_update_fifo() altogether. (Christian)

v3:
* Improved commit message. (Philipp)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Philipp Stanner <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Philipp Stanner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1
# 1e436f4f 17-Sep-2024 Shuicheng Lin <[email protected]>

drm/scheduler: Improve documentation

Function drm_sched_entity_push_job() doesn't have a return value, so
remove the return-value description for it.
Correct several other typos.

v2 (Philipp):
- more correction with related comments.

Signed-off-by: Shuicheng Lin <[email protected]>
Reviewed-by: Philipp Stanner <[email protected]>
Signed-off-by: Simona Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.11
# 440d52b3 13-Sep-2024 Rob Clark <[email protected]>

drm/sched: Fix dynamic job-flow control race

Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609

The whole premise of lockless access to a single-producer-single-
consumer queue is that there is just a single producer and single
consumer. That means we can't call drm_sched_can_queue() (which is
about queueing more work to the hw, not to the spsc queue) from
anywhere other than the consumer (wq).

This call in the producer is just an optimization to avoid scheduling
the consuming worker if it cannot yet queue more work to the hw. It
is safe to drop this optimization to avoid the race condition.

Suggested-by: Asahi Lina <[email protected]>
Fixes: a78422e9dff3 ("drm/sched: implement dynamic job-flow control")
Closes: https://github.com/AsahiLinux/linux/issues/309
Cc: [email protected]
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Danilo Krummrich <[email protected]>
Tested-by: Janne Grunau <[email protected]>
Signed-off-by: Danilo Krummrich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
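
A sketch of the shape of the fix as described above; the function names
come from the scheduler, but both snippets are simplified and the exact
call sites are assumptions:

	/*
	 * Before (producer side, simplified): consulting hw-credit state
	 * from the producer races with the consuming worker.
	 */
	spsc_queue_push(&entity->job_queue, &job->queue_node);
	if (drm_sched_can_queue(sched, entity))    /* racy optimization */
		drm_sched_run_job_queue(sched);

	/*
	 * After: always kick the worker; only the consumer (the run-job
	 * work item) decides whether the hw can take more work.
	 */
	spsc_queue_push(&entity->job_queue, &job->queue_node);
	drm_sched_run_job_queue(sched);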



Revision tags: v6.11-rc7, v6.11-rc6
# b2ef8087 26-Aug-2024 Christian König <[email protected]>

drm/sched: add optional errno to drm_sched_start()

The current implementation of drm_sched_start uses a hardcoded
-ECANCELED to dispose of a job when the parent/hw fence is NULL.
This results in drm_sched_job_done being called with -ECANCELED for
each job with a NULL parent in the pending list, making it difficult
to distinguish between recovery methods, whether a queue reset or a
full GPU reset was used.

To improve this, we first try a soft recovery for timeout jobs and
use the error code -ENODATA. If soft recovery fails, we proceed with
a queue reset, where the error code remains -ENODATA for the job.
Finally, for a full GPU reset, we use error codes -ECANCELED or
-ETIME. This patch adds an error code parameter to drm_sched_start,
allowing us to differentiate between queue reset and GPU reset
failures. This enables user mode and test applications to validate
the expected correctness of the requested operation. After a
successful queue reset, the only way to continue normal operation is
to call drm_sched_job_done with the specific error code -ENODATA.

v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain
and amdgpu_device_unlock_reset_domain to allow user mode to track
the queue reset status and distinguish between queue reset and
GPU reset.
v2: Christian suggested using the error codes -ENODATA for queue reset
and -ECANCELED or -ETIME for GPU reset, returned to
amdgpu_cs_wait_ioctl.
v3: To meet the requirements, we introduce a new function
drm_sched_start_ex with an additional parameter to set
dma_fence_set_error, allowing us to handle the specific error
codes appropriately and dispose of bad jobs with the selected
error code depending on whether it was a queue reset or GPU reset.
v4: Alex suggested using a new name, drm_sched_start_with_recovery_error,
which more accurately describes the function's purpose.
Additionally, it was recommended to add documentation details
about the new method.
v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex)
v6 (chk): rebase on upstream changes, cleanup the commit message,
drop the new function again and update all callers,
apply the errno also to scheduler fences with hw fences
v7 (chk): rebased

Signed-off-by: Jesse Zhang <[email protected]>
Signed-off-by: Vitaly Prosyak <[email protected]>
Signed-off-by: Christian König <[email protected]>
Acked-by: Daniel Vetter <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
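
The net effect on call sites, sketched with the error-code convention
described above:

	/* After a successful queue reset: jobs are disposed with -ENODATA. */
	drm_sched_start(sched, -ENODATA);

	/* After a full GPU reset: -ECANCELED (or -ETIME for timed-out jobs). */
	drm_sched_start(sched, -ECANCELED);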



Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1
# 83b501c1 19-Jul-2024 Christian König <[email protected]>

drm/scheduler: remove full_recover from drm_sched_start

This was basically just another one of amdgpu's hacks. The parameter
allowed restarting the scheduler without turning fence signaling on
again.

That this is absolutely not a good idea should be obvious by now, since
the fences will then just sit there and never signal.

While at it, clean up the code a bit.

Signed-off-by: Christian König <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3
# 38f922a5 23-Nov-2023 Luben Tuikov <[email protected]>

drm/sched: Reverse run-queue priority enumeration

Reverse the run-queue priority enumeration such that the highest priority
is now 0, and the priority diminishes with each consecutive integer.

Run-queues correspond to priorities. To an external observer a scheduler
created with a single run-queue, and another created with
DRM_SCHED_PRIORITY_COUNT number of run-queues, should always schedule
sched->sched_rq[0] with the same "priority", as that index run-queue exists in
both schedulers, i.e. a scheduler with one run-queue or many. This patch makes
it so.

In other words, the "priority" of sched->sched_rq[n], n >= 0, is the same for
any scheduler created with any allowable number of run-queues (priorities), 0
to DRM_SCHED_PRIORITY_COUNT.

Cc: Rob Clark <[email protected]>
Cc: Abhinav Kumar <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
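
The resulting enumeration in this header (comments added here for
emphasis):

	enum drm_sched_priority {
		DRM_SCHED_PRIORITY_KERNEL,      /* 0: highest */
		DRM_SCHED_PRIORITY_HIGH,
		DRM_SCHED_PRIORITY_NORMAL,
		DRM_SCHED_PRIORITY_LOW,         /* lowest */

		DRM_SCHED_PRIORITY_COUNT
	};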



Revision tags: v6.7-rc2
# fe375c74 15-Nov-2023 Luben Tuikov <[email protected]>

drm/sched: Rename priority MIN to LOW

Rename DRM_SCHED_PRIORITY_MIN to DRM_SCHED_PRIORITY_LOW.

This mirrors DRM_SCHED_PRIORITY_HIGH, for a list of DRM scheduler priorities
in ascending order,
DRM_SCHED_PRIORITY_LOW,
DRM_SCHED_PRIORITY_NORMAL,
DRM_SCHED_PRIORITY_HIGH,
DRM_SCHED_PRIORITY_KERNEL.

Cc: Rob Clark <[email protected]>
Cc: Abhinav Kumar <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Cc: Danilo Krummrich <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



Revision tags: v6.7-rc1
# a78422e9 10-Nov-2023 Danilo Krummrich <[email protected]>

drm/sched: implement dynamic job-flow control

Currently, job flow control is implemented simply by limiting the number
of jobs in flight. Therefore, a scheduler is initialized with a credit
limit that corresponds to the number of jobs which can be sent to the
hardware.

This implies that for each job, drivers need to account for the maximum
job size possible in order to not overflow the ring buffer.

However, there are drivers, such as Nouveau, where the job size has a
rather large range. For such drivers it can easily happen that job
submissions not even filling the ring by 1% can block subsequent
submissions, which, in the worst case, can lead to the ring running dry.

In order to overcome this issue, allow for tracking the actual job size
instead of the number of jobs. Therefore, add a field to track a job's
credit count, which represents the number of credits a job contributes
to the scheduler's credit limit.

Signed-off-by: Danilo Krummrich <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
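
A sketch of the accounting from a driver's point of view; the job
wrappers and credit values are hypothetical:

	/*
	 * The scheduler's credit limit now models ring space rather than a
	 * job count. Each job declares what it will actually consume via
	 * the credits argument of drm_sched_job_init():
	 */
	ret = drm_sched_job_init(&small_job->base, entity, 4, NULL);

	/*
	 * A large job declares many credits, so throttling tracks real
	 * ring usage instead of "number of jobs in flight".
	 */
	ret = drm_sched_job_init(&big_job->base, entity, 512, NULL);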



# f3123c25 09-Nov-2023 Luben Tuikov <[email protected]>

drm/sched: Qualify drm_sched_wakeup() by drm_sched_entity_is_ready()

Don't "wake up" the GPU scheduler unless the entity is ready, as well as we
can queue to the scheduler, i.e. there is no point in waking up the scheduler
for the entity unless the entity is ready.

Signed-off-by: Luben Tuikov <[email protected]>
Fixes: bc8d6a9df99038 ("drm/sched: Don't disturb the entity when in RR-mode scheduling")
Reviewed-by: Danilo Krummrich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]



# f12af4c4 02-Nov-2023 Tvrtko Ursulin <[email protected]>

drm/sched: Drop suffix from drm_sched_wakeup_if_can_queue

Because a) the helper is exported to other parts of the scheduler and
b) there isn't a plain drm_sched_wakeup to begin with, I think we can
drop the suffix and, by doing so, separate the intimate knowledge
between the scheduler components a bit better.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>



# 3c6c7ca4 31-Oct-2023 Matthew Brost <[email protected]>

drm/sched: Add a helper to queue TDR immediately

Add a helper whereby a driver can invoke TDR immediately.

v2:
- Drop timeout args, rename function, use mod delayed work (Luben)
v3:
- s/XE/Xe (Luben)
- present tense in commit message (Luben)
- Adjust comment for drm_sched_tdr_queue_imm (Luben)
v4:
- Adjust commit message (Luben)

Cc: Luben Tuikov <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Luben Tuikov <[email protected]>
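
Usage is a one-liner; a driver that learns of a hang out of band (say,
from an error interrupt) need not wait for the timeout timer to expire:

	/* Fire the timeout handler (TDR) now instead of after sched->timeout. */
	drm_sched_tdr_queue_imm(sched);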



# f7fe64ad 31-Oct-2023 Matthew Brost <[email protected]>

drm/sched: Split free_job into own work item

Rather than call free_job and run_job in same work item have a dedicated
work item for each. This aligns with the design and intended use of work
queues.

v2:
- Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
timestamp in free_job() work item (Danilo)
v3:
- Drop forward dec of drm_sched_select_entity (Boris)
- Return in drm_sched_run_job_work if entity NULL (Boris)
v4:
- Replace dequeue with peek and invert logic (Luben)
- Wrap to 100 lines (Luben)
- Update comments for *_queue / *_queue_if_ready functions (Luben)
v5:
- Drop peek argument, blindly reinit idle (Luben)
- s/drm_sched_free_job_queue_if_ready/drm_sched_free_job_queue_if_done (Luben)
- Update work_run_job & work_free_job kernel doc (Luben)
v6:
- Do not move drm_sched_select_entity in file (Luben)

Signed-off-by: Matthew Brost <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
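
The resulting shape, sketched as an excerpt of the scheduler's members
(names per this series; the excerpt itself is illustrative):

	/* Two dedicated work items on the submit workqueue, one concern each: */
	struct workqueue_struct *submit_wq;
	struct work_struct       work_run_job;   /* selects an entity, runs a job */
	struct work_struct       work_free_job;  /* reaps finished jobs */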



# a6149f03 31-Oct-2023 Matthew Brost <[email protected]>

drm/sched: Convert drm scheduler to use a work queue rather than kthread

In Xe, the new Intel GPU driver, a choice has been made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd, but let us explain the reasoning below.

1. In Xe, the submission order from multiple drm_sched_entity is not
guaranteed to match the completion order, even if targeting the same
hardware engine. This is because Xe has a firmware scheduler, the GuC,
which is allowed to reorder, timeslice, and preempt submissions. If a
shared drm_gpu_scheduler is used across multiple drm_sched_entity, the
TDR falls apart, as the TDR expects submission order == completion order.
Using a dedicated drm_gpu_scheduler per drm_sched_entity solves this
problem.

2. In Xe, submissions are done via programming a ring buffer (circular
buffer), and a drm_gpu_scheduler provides a limit on the number of jobs;
if that limit is set to RING_SIZE / MAX_SIZE_PER_JOB, we get flow control
on the ring for free.

A problem with this design is that a drm_gpu_scheduler currently uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than a kthread for submission / job cleanup.

v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments

Signed-off-by: Matthew Brost <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Luben Tuikov <[email protected]>
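
A sketch of the 1:1 topology this enables; the fw_queue type and ops are
hypothetical, RING_SIZE / MAX_SIZE_PER_JOB is taken from the message
above, and the init is shown in the later v6.15 args-struct style for
brevity:

	static int fw_queue_sched_init(struct fw_queue *q,
				       struct workqueue_struct *shared_wq)
	{
		const struct drm_sched_init_args args = {
			.ops          = &fw_sched_ops,
			/*
			 * Many per-entity schedulers share one workqueue, so
			 * thousands of queues no longer mean thousands of
			 * kthreads.
			 */
			.submit_wq    = shared_wq,
			/* The job limit doubles as ring flow control. */
			.credit_limit = RING_SIZE / MAX_SIZE_PER_JOB,
			.timeout      = msecs_to_jiffies(5000),
			.name         = "fw-queue",
		};

		return drm_sched_init(&q->sched, &args);
	}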


