Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6

# 0d83b8a9 | 04-Mar-2025 | Caleb Sander Mateos <[email protected]>

io_uring: introduce io_cache_free() helper
Add a helper function io_cache_free() that returns an allocation to an io_alloc_cache, falling back on kfree() if the io_alloc_cache is full. This is the inverse of io_cache_alloc(), which takes an allocation from an io_alloc_cache and falls back on kmalloc() if the cache is empty.
Convert 4 callers to use the helper.
Signed-off-by: Caleb Sander Mateos <[email protected]>
Suggested-by: Li Zetao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
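
A minimal sketch of the described helper, reconstructed from the commit message (assuming an io_alloc_cache_put() that returns true when the entry is accepted into the cache); not the verbatim kernel code:

    /* Return an object to the cache; free it if the cache is full. */
    static inline void io_cache_free(struct io_alloc_cache *cache, void *obj)
    {
            if (!io_alloc_cache_put(cache, obj))
                    kfree(obj);
    }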

Revision tags: v6.14-rc5, v6.14-rc4

# bcf8a029 | 17-Feb-2025 | Caleb Sander Mateos <[email protected]>

io_uring: introduce type alias for io_tw_state
In preparation for changing how io_tw_state is passed, introduce a type alias io_tw_token_t for struct io_tw_state *. This allows for changing the representation in one place, without having to update the many functions that just forward their struct io_tw_state * argument.
Also add a comment to struct io_tw_state to explain its purpose.
Signed-off-by: Caleb Sander Mateos <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
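
The alias itself is tiny; a sketch of what the commit describes (a reconstruction, not a verbatim excerpt):

    /* Wrapper for task_work state: lets the representation change in one
     * place without touching every function that just forwards it. */
    typedef struct io_tw_state *io_tw_token_t;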

Revision tags: v6.14-rc3, v6.14-rc2

# 2eaa2fac | 05-Feb-2025 | Jens Axboe <[email protected]>

io_uring/futex: use generic io_cancel_remove() helper
Don't implement our own loop rolling and checking, just use the generic helper to find and cancel requests.
Signed-off-by: Jens Axboe <[email protected]>
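
A sketch of what such a conversion could look like; the io_cancel_remove() and __io_futex_cancel() signatures here are assumptions for illustration, not taken from this log:

    static int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd,
                               unsigned int issue_flags)
    {
            /* walk ctx->futex_list and cancel matching requests */
            return io_cancel_remove(ctx, cd, issue_flags, &ctx->futex_list,
                                    __io_futex_cancel);
    }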

# e855b913 | 05-Feb-2025 | Jens Axboe <[email protected]>

io_uring/futex: convert to io_cancel_remove_all()
Use the generic helper for cancelations.
Signed-off-by: Jens Axboe <[email protected]>
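
Along the same lines, a hedged sketch of the cancel-all conversion, again with an assumed helper signature:

    static bool io_futex_remove_all(struct io_ring_ctx *ctx,
                                    struct io_uring_task *tctx, bool cancel_all)
    {
            return io_cancel_remove_all(ctx, tctx, &ctx->futex_list,
                                        cancel_all, __io_futex_cancel);
    }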

Revision tags: v6.14-rc1, v6.13

# 5e0e02f0 | 15-Jan-2025 | Jens Axboe <[email protected]>

futex: Pass in task to futex_queue()
futex_queue() -> __futex_queue() uses 'current' as the task to store in the struct futex_q->task field. This is fine for synchronous usage of the futex infrastructure, but it's not always correct when used by io_uring where the task doing the initial futex_queue() might not be available later on. This doesn't lead to any issues currently, as the io_uring side doesn't support PI futexes, but it does leave a potentially dangling pointer which is never a good idea.
Have futex_queue() take a task_struct argument, and have the regular callers pass in 'current' for that. Meanwhile io_uring can just pass in NULL, as the task should never be used off that path. In theory req->tctx->task could be used here, but there's no point populating it with a task field that will never be used anyway.
Reported-by: Jann Horn <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
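
A minimal sketch of the described signature change, reconstructed from the commit message (the real function carries more context than shown here):

    /* The task to store in futex_q->task is now passed explicitly
     * instead of being hardcoded to current inside __futex_queue(). */
    static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket *hb,
                                   struct task_struct *task)
    {
            __futex_queue(q, hb, task);
    }

Synchronous futex paths pass current; io_uring passes NULL, since the submitting task may be gone by the time the wait completes.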

# fa359552 | 23-Jan-2025 | Jens Axboe <[email protected]>

io_uring: get rid of alloc cache init_once handling
init_once is called when an object doesn't come from the cache, and hence needs initial clearing of certain members. While the whole struct could get cleared by memset() in that case, a few of the cache members are large enough that this may cause unnecessary overhead if the caches used aren't large enough to satisfy the workload. For those cases, some churn of kmalloc+kfree is to be expected.
Ensure that the 3 users that need clearing put the members they need cleared at the start of the struct, and wrap the rest of the struct in a struct group so the offset is known.
While at it, improve the interaction with KASAN such that when/if KASAN writes to members inside the struct that should be retained over caching, it won't trip over itself. For rw and net, the retaining of the iovec over caching is disabled if KASAN is enabled. A helper will free and clear those members in that case.
Signed-off-by: Jens Axboe <[email protected]>
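
A hypothetical layout sketch of the trick described above (struct and member names invented for illustration; struct_group() comes from <linux/stddef.h>): members that must start out zeroed on a fresh kmalloc() go first, and the rest is wrapped in a group so the clear can stop at a known offset.

    struct io_async_example {
            /* must be NULL/0 on a fresh allocation, retained over caching */
            struct iovec            *free_iovec;
            int                     free_iov_nr;
            /* re-initialized per request; never needs the initial clear */
            struct_group(uncleared,
                    struct iov_iter         iter;
                    struct iov_iter_state   iter_state;
            );
    };

    /* On a fresh (non-cached) allocation, clear only the leading members. */
    static void init_fresh(struct io_async_example *p)
    {
            memset(p, 0, offsetof(struct io_async_example, uncleared));
    }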

Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4

# b2846567 | 16-Dec-2024 | Gabriel Krisman Bertazi <[email protected]>

io_uring/futex: Allocate ifd with generic alloc_cache helper
Instead of open-coding the allocation, use the generic alloc_cache helper.
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
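
Illustratively, the conversion replaces an open-coded cache pop with a kmalloc() fallback by a single call; the arguments below are assumptions for the sketch:

    /* before: check the cache, fall back to kmalloc() by hand */
    /* after: */
    ifd = io_cache_alloc(&ctx->futex_cache, GFP_NOWAIT);
    if (!ifd)
            return -ENOMEM;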

Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6

# f03baece | 03-Nov-2024 | Jens Axboe <[email protected]>

io_uring: move cancelations to be io_uring_task based
Right now the task_struct pointer is used as the key to match a task, but in preparation for some io_kiocb changes, move it to using struct io_uring_task instead. No functional changes intended in this patch.
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1

# 414d0f45 | 20-Mar-2024 | Jens Axboe <[email protected]>

io_uring/alloc_cache: switch to array based caching
Currently lists are being used to manage this, but best practice is usually to have these in an array instead, as that is cheaper to manage.
Outside of that detail, games are also played with KASAN as the list is inside the cached entry itself.
Finally, all users of this need a struct io_cache_entry embedded in their struct, which is union'ized with something else in there that isn't used across the free -> realloc cycle.
Get rid of all of that, and simply have it be an array. This will not change the memory used, as we're just trading an 8-byte member entry for the per-elem array size.
This reduces the overhead of the recycled allocations, and it reduces the amount of code needed to support recycling to about half of what it currently is.
Signed-off-by: Jens Axboe <[email protected]>
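
A minimal sketch of an array-backed cache as described (field and function names assumed; the real version also plays the KASAN poison/unpoison games mentioned above):

    struct io_alloc_cache {
            void            **entries;      /* array of recycled objects */
            unsigned int    nr_cached;
            unsigned int    max_cached;
            size_t          elem_size;
    };

    static inline void *io_alloc_cache_get(struct io_alloc_cache *cache)
    {
            if (cache->nr_cached)
                    return cache->entries[--cache->nr_cached];
            return NULL;
    }

    static inline bool io_alloc_cache_put(struct io_alloc_cache *cache, void *entry)
    {
            if (cache->nr_cached < cache->max_cached) {
                    cache->entries[cache->nr_cached++] = entry;
                    return true;
            }
            return false;
    }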

# 30dab608 | 15-Mar-2024 | Jens Axboe <[email protected]>

io_uring/futex: always remove futex entry for cancel all
We know the request is either being removed, or already in the process of being removed through task_work, so we can delete it from our futex list upfront. This is important for cancel-all conditions, as we would otherwise find it multiple times and prevent cancelation progress.
Cc: [email protected]
Fixes: 194bb58c6090 ("io_uring: add support for futex wake and wait")
Fixes: 8f350194d5cf ("io_uring: add support for vectored futex waits")
Signed-off-by: Jens Axboe <[email protected]>
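
The shape of the fix, sketched under assumptions about the surrounding code (names modeled on the io_uring futex series; not a verbatim diff):

    static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *req)
    {
            /* Unlink up front so a repeated cancel-all scan of
             * ctx->futex_list cannot find this request again. */
            hlist_del_init(&req->hash_node);
            /* ...then hand the request off for completion as before */
            return true;
    }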

Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7

# 8f350194 | 13-Jun-2023 | Jens Axboe <[email protected]>

io_uring: add support for vectored futex waits
This adds support for IORING_OP_FUTEX_WAITV, which allows registering a notification for a number of futexes at once. If one of the futexes is woken, the request completes with the index of the woken futex as the result. This is identical to what the normal vectored futex waitv operation does.
Use like IORING_OP_FUTEX_WAIT, except sqe->addr must now contain a pointer to a struct futex_waitv array, and sqe->off must now contain the number of elements in that array. As flags are passed in the futex_vector array, and likewise for the value and futex address(es), sqe->addr2 and sqe->addr3 are also reserved for IORING_OP_FUTEX_WAITV.
For cancelations, FUTEX_WAITV does not rely on the futex_unqueue() return value as we're dealing with multiple futexes. Instead, a separate per io_uring request atomic is used to claim ownership of the request.
Waiting on N futexes could be done with IORING_OP_FUTEX_WAIT as well, but that punts a lot of the work to the application:
1) Application would need to submit N IORING_OP_FUTEX_WAIT requests, rather than just a single IORING_OP_FUTEX_WAITV.
2) When one futex is woken, application would need to cancel the remaining N-1 requests that didn't trigger.
While this is of course doable, having a single vectored futex wait makes for much simpler application code.
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
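
A userspace sketch of the SQE layout described above, filling the raw fields directly (a hedged reconstruction; a real application would typically use a liburing prep helper instead):

    #include <string.h>
    #include <linux/io_uring.h>
    #include <linux/futex.h>

    static void prep_futex_waitv(struct io_uring_sqe *sqe,
                                 struct futex_waitv *vec, unsigned int nr)
    {
            memset(sqe, 0, sizeof(*sqe));
            sqe->opcode = IORING_OP_FUTEX_WAITV;
            sqe->addr = (unsigned long)vec; /* struct futex_waitv array */
            sqe->off = nr;                  /* number of array elements */
            /* addr2/addr3 stay zero: flags, value and mask come from
             * the futex_waitv entries themselves */
    }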

Revision tags: v6.4-rc6

# 194bb58c | 08-Jun-2023 | Jens Axboe <[email protected]>

io_uring: add support for futex wake and wait
Add support for FUTEX_WAKE/WAIT primitives.
IORING_OP_FUTEX_WAKE is a mix of FUTEX_WAKE and FUTEX_WAKE_BITSET, as it does support passing in a bitset.
Similarly, IORING_OP_FUTEX_WAIT is a mix of FUTEX_WAIT and FUTEX_WAIT_BITSET.
Both of them use the futex2 interface.
FUTEX_WAKE is straightforward, as those can always be done directly from the io_uring submission without needing async handling. For FUTEX_WAIT, things are a bit more complicated. If the futex isn't ready, then we rely on a callback via futex_queue->wake() when someone wakes up the futex. From that callback, we queue up task_work with the original task, which will post a CQE and wake it, if necessary.
Cancelations are supported, both from the application point of view and to cancel pending waits if the ring exits before all events have occurred. The return value of futex_unqueue() is used to gate who wins the potential race between cancelation and futex wakeups. Whoever gets a 'ret == 1' return from that claims ownership of the io_uring futex request.
This is just the barebones wait/wake support. PI or REQUEUE support is not added at this point; it's unclear if we might look into that later.
Likewise, explicit timeouts are not supported either. It is expected that users who need timeouts would use the usual io_uring mechanism for that: linked timeouts.
The SQE format is as follows:
  `addr`         Address of futex
  `fd`           futex2(2) FUTEX2_* flags
  `futex_flags`  io_uring specific command flags. None valid now.
  `addr2`        Value of futex
  `addr3`        Mask to wake/wait
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
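
Following the SQE format above, a hedged userspace sketch that fills the raw fields for a single-futex wait (a reconstruction for illustration; real code would normally go through a liburing helper):

    #include <stdint.h>
    #include <string.h>
    #include <linux/io_uring.h>

    static void prep_futex_wait(struct io_uring_sqe *sqe, uint32_t *futex,
                                uint64_t val, uint64_t mask,
                                uint32_t futex2_flags)
    {
            memset(sqe, 0, sizeof(*sqe));
            sqe->opcode = IORING_OP_FUTEX_WAIT;
            sqe->addr = (unsigned long)futex;   /* address of futex */
            sqe->fd = futex2_flags;             /* futex2(2) FUTEX2_* flags */
            sqe->addr2 = val;                   /* value of futex */
            sqe->addr3 = mask;                  /* mask to wait on */
            /* sqe->futex_flags: io_uring specific flags, none valid yet */
    }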