History log of /linux-6.15/io_uring/io_uring.c (Results 1 – 25 of 563)
Revision    Date    Author    Comments
Revision tags: v6.15, v6.15-rc7
# a7d755ed 17-May-2025 Pavel Begunkov <[email protected]>

io_uring: fix overflow resched cqe reordering

Leaving the CQ critical section in the middle of an overflow flush
can cause cqe reordering, since the cached cq pointers are reset and any
new cqe emitters that might get called in between are not going to be
forced into io_cqe_cache_refill().

Fixes: eac2ca2d682f9 ("io_uring: check if we need to reschedule during overflow flush")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/90ba817f1a458f091f355f407de1c911d2b93bbf.1747483784.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.15-rc6
# 687b2bae 07-May-2025 Jens Axboe <[email protected]>

io_uring: ensure deferred completions are flushed for multishot

Multishot normally uses io_req_post_cqe() to post completions, but when
stopping it, it may finish up with a deferred completion. This is fine,
except if another multishot event triggers before the deferred completions
get flushed. If this occurs, then CQEs may get reordered in the CQ ring,
as new multishot completions get posted before the deferred ones are
flushed. This can cause confusion on the application side, if strict
ordering is required for the use case.

When posting multishot completions via io_req_post_cqe(), flush any
pending deferred completions first.

Cc: [email protected] # 6.1+
Reported-by: Norman Maurer <[email protected]>
Reported-by: Christian Mazakas <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
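
A self-contained userspace model of the ordering rule this fix enforces;
all names below are illustrative stand-ins, not the kernel's internals:

    #include <stdio.h>

    #define CQ_SZ 8
    static int cq[CQ_SZ], cq_tail;            /* CQ ring, by request id */
    static int deferred[CQ_SZ], nr_deferred;  /* batched completions */

    static void flush_deferred(void)
    {
        for (int i = 0; i < nr_deferred; i++)
            cq[cq_tail++] = deferred[i];
        nr_deferred = 0;
    }

    /* The fix in miniature: flush anything deferred before posting a
     * new CQE directly, so CQ order matches completion order. */
    static void post_cqe(int id)
    {
        flush_deferred();
        cq[cq_tail++] = id;
    }

    int main(void)
    {
        deferred[nr_deferred++] = 1; /* multishot stop defers a CQE */
        post_cqe(2);                 /* a new multishot event fires */
        flush_deferred();
        for (int i = 0; i < cq_tail; i++)
            printf("cqe %d\n", cq[i]); /* prints 1 then 2 */
        return 0;
    }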

Revision tags: v6.15-rc5
# b53e5232 04-May-2025 Jens Axboe <[email protected]>

io_uring: always arm linked timeouts prior to issue

There are a few spots where linked timeouts are armed, and not all of
them adhere to the pre-arm, attempt issue, post-arm pattern. This can
be problematic if the linked request returns that it will trigger a
callback later, and does so before the linked timeout is fully armed.

Consolidate all the linked timeout handling into __io_issue_sqe(),
rather than have it spread throughout the various issue entry points.

Cc: [email protected]
Link: https://github.com/axboe/liburing/issues/1390
Reported-by: Chase Hiltz <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.15-rc4
# edd43f4d 24-Apr-2025 Jens Axboe <[email protected]>

io_uring: fix 'sync' handling of io_fallback_tw()

A previous commit added a 'sync' parameter to io_fallback_tw(), which,
if true, means the caller wants to wait on the fallback thread handling
it. But the logic is somewhat messed up; ensure that ctxs are swapped
and flushed appropriately.

Cc: [email protected]
Fixes: dfbe5561ae93 ("io_uring: flush offloaded and delayed task_work on exit")
Signed-off-by: Jens Axboe <[email protected]>

# 5e16f1a6 24-Apr-2025 Pavel Begunkov <[email protected]>

io_uring: don't duplicate flushing in io_req_post_cqe

io_req_post_cqe() sets submit_state.cq_flush so that
*flush_completions() can take care of batch committing CQEs. Don't
commit it twice by using __io_cq_unlock_post().

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/41c416660c509cee676b6cad96081274bcb459f3.1745493861.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.15-rc3, v6.15-rc2, v6.15-rc1
# 39051364 03-Apr-2025 Pavel Begunkov <[email protected]>

io_uring: always do atomic put from iowq

io_uring always switches requests to atomic refcounting for iowq
execution before there is any parallelism by setting REQ_F_REFCOUNT,
and the flag is not cleared until the request completes. That should be
fine as long as the compiler doesn't make up a non-existent value for
the flags, however KCSAN still complains when the request owner changes
other flag bits:

BUG: KCSAN: data-race in io_req_task_cancel / io_wq_free_work
...
read to 0xffff888117207448 of 8 bytes by task 3871 on cpu 0:
req_ref_put_and_test io_uring/refs.h:22 [inline]

Skip REQ_F_REFCOUNT checks for iowq; we know it's set.

Reported-by: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/d880bc27fb8c3209b54641be4ff6ac02b0e5789a.1743679736.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 697b2876 31-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: add req flag invariant build assertion

We're caching some of the file-related request flags in a tricky way;
put a build check in place to make sure flags don't get reshuffled.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/9877577b83c25dd78224a8274f799187e7ec7639.1743407551.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
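
A minimal sketch of the kind of compile-time check the patch adds; the
flag names and bit positions below are assumptions for illustration, not
the kernel's actual layout:

    #include <assert.h>

    /* Hypothetical request flag bits, cached elsewhere by position. */
    enum {
        REQ_F_FIXED_FILE_BIT,  /* assumed to be bit 0 */
        REQ_F_IO_DRAIN_BIT,
    };

    /* Reshuffling the enum now fails the build instead of silently
     * corrupting the cached copy at run time. */
    static_assert(REQ_F_FIXED_FILE_BIT == 0,
                  "cached file flags must keep their bit positions");

    int main(void) { return 0; }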

# ea910678 28-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: don't pass ctx to tw add remote helper

Unlike earlier versions, io_msg_remote_post() creates a valid request
with a proper context, so don't pass a context to
io_req_task_work_add_remote() explicitly but derive it from the request.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/721f51cf34996d98b48f0bfd24ad40aa2730167e.1743190078.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 6889ae1b 27-Mar-2025 Pavel Begunkov <[email protected]>

io_uring/net: fix io_req_post_cqe abuse by send bundle

[ 114.987980][ T5313] WARNING: CPU: 6 PID: 5313 at io_uring/io_uring.c:872 io_req_post_cqe+0x12e/0x4f0
[ 114.991597][ T5313] RIP: 0010:io_req_post_cqe+0x12e/0x4f0
[ 115.001880][ T5313] Call Trace:
[ 115.002222][ T5313] <TASK>
[ 115.007813][ T5313] io_send+0x4fe/0x10f0
[ 115.009317][ T5313] io_issue_sqe+0x1a6/0x1740
[ 115.012094][ T5313] io_wq_submit_work+0x38b/0xed0
[ 115.013223][ T5313] io_worker_handle_work+0x62a/0x1600
[ 115.013876][ T5313] io_wq_worker+0x34f/0xdf0

As the comment states, io_req_post_cqe() should only be used by
multishot requests, i.e. REQ_F_APOLL_MULTISHOT, which bundled sends are
not. Add a flag signifying whether a request wants to post multiple
CQEs. Eventually REQ_F_APOLL_MULTISHOT should imply the new flag, but
that's left out for simplicity.

Cc: [email protected]
Fixes: a05d1f625c7aa ("io_uring/net: support bundles for send")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/8b611dbb54d1cd47a88681f5d38c84d0c02bc563.1743067183.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 81661978 24-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: move min_events sanitisation

iopoll and normal waiting already duplicate min_completion truncation,
so move the sanitisation inside the corresponding routines.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/254adb289cc04638f25d746a7499260fa89a179e.1742829388.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# d73acd7a 24-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: rename "min" arg in io_iopoll_check()

Don't name arguments "min", it shadows the namesake function.
min_events is also more consistent.

Signed-off-by: Pavel Begunkov <[email protected]

io_uring: rename "min" arg in io_iopoll_check()

Don't name arguments "min"; it shadows the namesake function.
min_events is also more consistent.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/f52ce9d88d3bca5732a218b0da14924aa6968909.1742829388.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 4c76de42 24-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: open code __io_post_aux_cqe()

There is no reason to keep __io_post_aux_cqe() separately from
io_post_aux_cqe().

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/2c4c1f68d694deea25a212fc09bbb11f330cd82e.1742829388.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 3afcb3b2 24-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: defer iowq cqe overflow via task_work

Don't handle CQE overflows in io_req_complete_post() and defer it to
flush_completions. It cuts some duplication, and I also want to limit
the number of places directly overflowing completions.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/9046410ac27e18f2baa6f7cdb363ec921cbc3b79.1742829388.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 3f0cb8de 24-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: fix retry handling off iowq

io_req_complete_post() doesn't handle reissue and if called with a
REQ_F_REISSUE request it might post extra unexpected completions. Fix it
by pushing the request into flush_completions via task_work.

Fixes: d803d123948fe ("io_uring/rw: handle -EAGAIN retry at IO completion time")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/badb3d7e462881e7edbfcc2be6301090b07dbe53.1742829388.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.14
# 3a4689ac 21-Mar-2025 Pavel Begunkov <[email protected]>

io_uring/cmd: add iovec cache for commands

Add iou_vec to commands and wire caching for it, but don't expose it to
users just yet. We need the vec cleared on initial alloc, but since
we can't place it at the beginning at the moment, zero the entire
async_data. It's cached, so this only affects the performance of the
initial allocation, and it might not be a bad idea since we're exposing
those bits to outside drivers.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/c0f2145b75791bc6106eb4e72add2cf6a2c72a7a.1742579999.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.14-rc7
# 07754bfd 14-Mar-2025 Jens Axboe <[email protected]>

io_uring: enable toggle of iowait usage when waiting on CQEs

By default, io_uring marks a waiting task as being in iowait if it's
sleeping waiting on events and there are pending requests. This isn't
always useful, and may be confusing on non-storage setups where iowait
isn't expected. It can also cause extra power usage by preventing the
CPU from entering lower sleep states.

This adds a new enter flag, IORING_ENTER_NO_IOWAIT. If set, then
io_uring will not account the sleeping task as being in iowait. If the
kernel supports this feature, then it will be marked by having the
IORING_FEAT_NO_IOWAIT feature flag set.

As the kernel currently does not support separating the iowait
accounting and CPU frequency boosting, the IORING_ENTER_NO_IOWAIT
controls both of these at the same time. In the future, if those do end
up being split, then it'd be possible to control them separately.
However, it seems more likely that the kernel will decouple iowait and
CPU frequency boosting anyway.

Signed-off-by: Jens Axboe <[email protected]>
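
A hedged userspace sketch of using this flag pair: probe the feature bit
at setup time, then opt out of iowait accounting while waiting. The
#ifdef guards the case where the installed uapi headers predate this
kernel:

    #include <linux/io_uring.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Wait for one CQE without iowait accounting, when supported.
     * 'features' is io_uring_params.features from ring setup. */
    static long wait_one_cqe(int ring_fd, unsigned int features)
    {
        unsigned int flags = IORING_ENTER_GETEVENTS;

    #ifdef IORING_ENTER_NO_IOWAIT
        /* Only ask for it if the kernel advertised the feature. */
        if (features & IORING_FEAT_NO_IOWAIT)
            flags |= IORING_ENTER_NO_IOWAIT;
    #endif

        return syscall(__NR_io_uring_enter, ring_fd, 0, 1, flags,
                       NULL, 0);
    }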

# 5f14404b 19-Mar-2025 Pavel Begunkov <[email protected]>

io_uring/cmd: don't expose entire cmd async data

io_uring needs private bits in cmd's ->async_data, and they should never
be exposed to drivers as they'd certainly be abused. Leave struct
io_uring_cmd_data for the drivers but wrap it into a structure. It's a
prep patch and doesn't do anything useful yet.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

# 575e7b06 19-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: rename the data cmd cache

Pick a more descriptive name for the cmd async data cache.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.14-rc6
# 5027d024 08-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: unify STOP_MULTISHOT with IOU_OK

IOU_OK means that the request ownership is now handed back to core
io_uring and it has to complete it using the result provided in
req->cqe. Same is true for multishot and IOU_STOP_MULTISHOT.

Rename it to IOU_COMPLETE to avoid confusion and use it for both modes.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/e6a5b2edb0eb9558acb1c8f1db38ac45fee95491.1741453534.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

# 7a9dcb05 08-Mar-2025 Pavel Begunkov <[email protected]>

io_uring: return -EAGAIN to continue multishot

Multishot errors can be mapped 1:1 to normal errors, but they are not
identical. It leads to a peculiar situation where all multishot requests
have to check in what context they're run and return different codes.

Unify them starting with the EAGAIN / IOU_ISSUE_SKIP_COMPLETE(EIOCBQUEUED)
pair, which means that core io_uring still owns the request and it should
be retried. In the multishot case it naturally just continues to poll,
otherwise it might poll, use iowq or do any other kind of allowed
blocking. Introduce IOU_RETRY, aliased to -EAGAIN, for that.

Apart from obvious upsides, multishot can now also check for misuse of
IOU_ISSUE_SKIP_COMPLETE.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/da117b79ce72ecc3ab488c744e29fae9ba54e23b.1741453534.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
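
The unification above, sketched as a compilable table; the errno aliases
are the ones named in the message, the rest is illustrative:

    #include <errno.h>

    #define EIOCBQUEUED 529  /* kernel-internal, absent from userspace
                              * errno.h; value taken from the kernel */

    enum {
        IOU_COMPLETE            = 0,            /* core completes req */
        IOU_ISSUE_SKIP_COMPLETE = -EIOCBQUEUED, /* CQE posted later */
        IOU_RETRY               = -EAGAIN,      /* core still owns it:
                                                 * multishot re-polls,
                                                 * oneshot may use iowq */
    };

    int main(void) { return 0; }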

# 0d83b8a9 04-Mar-2025 Caleb Sander Mateos <[email protected]>

io_uring: introduce io_cache_free() helper

Add a helper function io_cache_free() that returns an allocation to an
io_alloc_cache, falling back on kfree() if the io_alloc_cache is full.
This is the inverse of io_cache_alloc(), which takes an allocation from
an io_alloc_cache and falls back on kmalloc() if the cache is empty.

Convert 4 callers to use the helper.

Signed-off-by: Caleb Sander Mateos <[email protected]>
Suggested-by: Li Zetao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
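
A userspace analog of that alloc/free pairing, with malloc()/free()
standing in for kmalloc()/kfree(); nothing here mirrors the kernel's
actual struct layout:

    #include <stdlib.h>

    #define CACHE_MAX 32

    struct alloc_cache {
        void *entries[CACHE_MAX];
        unsigned int nr;
    };

    /* Take an object from the cache, fall back on malloc() if empty. */
    static void *cache_alloc(struct alloc_cache *c, size_t size)
    {
        if (c->nr)
            return c->entries[--c->nr];
        return malloc(size);
    }

    /* The inverse: return an object to the cache, free() if full. */
    static void cache_free(struct alloc_cache *c, void *obj)
    {
        if (c->nr < CACHE_MAX)
            c->entries[c->nr++] = obj;
        else
            free(obj);
    }

    int main(void)
    {
        struct alloc_cache c = { .nr = 0 };
        cache_free(&c, cache_alloc(&c, 64)); /* cached for reuse */
        return 0;
    }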

Revision tags: v6.14-rc5
# ed9f3112 27-Feb-2025 Keith Busch <[email protected]>

io_uring: cache nodes and mapped buffers

Frequent alloc/free cycles on these are pretty costly. Use an io cache to
more efficiently reuse these buffers.

Signed-off-by: Keith Busch <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[axboe: fix imu leak]
Signed-off-by: Jens Axboe <[email protected]>

# 27cb27b6 27-Feb-2025 Keith Busch <[email protected]>

io_uring: add support for kernel registered bvecs

Provide an interface for the kernel to leverage the existing
pre-registered buffers that io_uring provides. User space can reference
these later to achieve zero-copy IO.

User space must register an empty fixed buffer table with io_uring in
order for the kernel to make use of it.

Signed-off-by: Keith Busch <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
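
The userspace half of that contract, sketched with liburing: register an
empty (sparse) fixed buffer table so the kernel side has slots to install
bvecs into. The table size is arbitrary here:

    #include <liburing.h>

    /* Reserve empty fixed-buffer slots; a driver can later install its
     * registered bvecs into them for zero-copy IO. */
    static int setup_sparse_buffer_table(struct io_uring *ring)
    {
        return io_uring_register_buffers_sparse(ring, 8);
    }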

Revision tags: v6.14-rc4
# c457eed5 23-Feb-2025 Pavel Begunkov <[email protected]>

io_uring: make io_poll_issue() sturdier

io_poll_issue() forwards the call to io_issue_sqe() and thus inherits
some of the handling. That's not particularly failure resistant, as for
example returning an innocent-looking IOU_OK from a multishot issue
will lead to severe bugs.

Reimplement io_poll_issue() without io_issue_sqe()'s request completion
logic. Remove extra checks as we know that req->file is already set,
linked timeouts are armed, and iopoll is not supported. Also cover it
with warnings for now.

The patch should be useful by itself, but it's also preparing the
codebase for other future clean ups.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/3096d7b1026d9a52426a598bdfc8d9d324555545.1740331076.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

Revision tags: v6.14-rc3
# 62aa9805 12-Feb-2025 Caleb Sander Mateos <[email protected]>

io_uring: use lockless_cq flag in io_req_complete_post()

io_uring_create() computes ctx->lockless_cq as:
ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL)

So use it to simplify that expression in io_req_complete_post().

Signed-off-by: Caleb Sander Mateos <[email protected]>
Reviewed-by: Li Zetao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
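
The shape of the simplification, sketched with assumed field names:
compute the predicate once at ring setup, then completion paths test the
cached bool:

    #include <linux/io_uring.h>
    #include <stdbool.h>

    struct ring_ctx {              /* illustrative io_ring_ctx subset */
        unsigned int flags;
        bool task_complete;
        bool lockless_cq;
    };

    static void ring_init(struct ring_ctx *ctx, unsigned int flags,
                          bool task_complete)
    {
        ctx->flags = flags;
        ctx->task_complete = task_complete;
        /* computed once here ... */
        ctx->lockless_cq = task_complete ||
                           (flags & IORING_SETUP_IOPOLL);
    }

    static bool needs_cq_lock(const struct ring_ctx *ctx)
    {
        /* ... so completion paths just read the cached flag. */
        return !ctx->lockless_cq;
    }

    int main(void)
    {
        struct ring_ctx ctx;
        ring_init(&ctx, IORING_SETUP_IOPOLL, false);
        return needs_cq_lock(&ctx); /* 0: iopoll completes locklessly */
    }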
