|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
12d90811 |
| 18-Dec-2024 |
Jann Horn <[email protected]> |
io_uring: Fix registered ring file refcount leak
Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the
io_uring: Fix registered ring file refcount leak
Currently, io_uring_unreg_ringfd() (which cleans up registered rings) is only called on exit, but __io_uring_free (which frees the tctx in which the registered ring pointers are stored) is also called on execve (via begin_new_exec -> io_uring_task_cancel -> __io_uring_cancel -> io_uring_cancel_generic -> __io_uring_free).
This means: A process going through execve while having registered rings will leak references to the rings' `struct file`.
Fix it by zapping registered rings on execve(). This is implemented by moving the io_uring_unreg_ringfd() from io_uring_files_cancel() into its callee __io_uring_cancel(), which is called from io_uring_task_cancel() on execve.
This could probably be exploited *on 32-bit kernels* by leaking 2^32 references to the same ring, because the file refcount is stored in a pointer-sized field and get_file() doesn't have protection against refcount overflow, just a WARN_ONCE(); but on 64-bit it should have no impact beyond a memory leak.
Cc: [email protected] Fixes: e7a6c00dc77a ("io_uring: add support for registering ring file descriptors") Signed-off-by: Jann Horn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4 |
|
| #
8c9a6f54 |
| 09-Apr-2024 |
Pavel Begunkov <[email protected]> |
io_uring: separate header for exported net bits
We're exporting some io_uring bits to networking, e.g. for implementing a net callback for io_uring cmds, but we don't want to expose more than needed
io_uring: separate header for exported net bits
We're exporting some io_uring bits to networking, e.g. for implementing a net callback for io_uring cmds, but we don't want to expose more than needed. Add a separate header for networking.
Signed-off-by: Pavel Begunkov <[email protected]> Signed-off-by: David Wei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7 |
|
| #
a4104821 |
| 19-Dec-2023 |
Jens Axboe <[email protected]> |
io_uring/unix: drop usage of io_uring socket
Since we no longer allow sending io_uring fds over SCM_RIGHTS, move to using io_is_uring_fops() to detect whether this is a io_uring fd or not. With that
io_uring/unix: drop usage of io_uring socket
Since we no longer allow sending io_uring fds over SCM_RIGHTS, move to using io_is_uring_fops() to detect whether this is a io_uring fd or not. With that done, kill off io_uring_get_socket() as nobody calls it anymore.
This is in preparation to yanking out the rest of the core related to unix gc with io_uring.
Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.7-rc6, v6.7-rc5, v6.7-rc4 |
|
| #
b66509b8 |
| 01-Dec-2023 |
Pavel Begunkov <[email protected]> |
io_uring: split out cmd api into a separate header
linux/io_uring.h is slowly becoming a rubbish bin where we put anything exposed to other subsystems. For instance, the task exit hooks and io_uring
io_uring: split out cmd api into a separate header
linux/io_uring.h is slowly becoming a rubbish bin where we put anything exposed to other subsystems. For instance, the task exit hooks and io_uring cmd infra are completely orthogonal and don't need each other's definitions. Start cleaning it up by splitting out all command bits into a new header file.
Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/7ec50bae6e21f371d3850796e716917fc141225a.1701391955.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
8fadb86d |
| 30-Nov-2023 |
Keith Busch <[email protected]> |
io_uring: remove uring_cmd cookie
No more users of this field.
Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Keith Busch <k
io_uring: remove uring_cmd cookie
No more users of this field.
Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
e5da71f1 |
| 30-Nov-2023 |
Keith Busch <[email protected]> |
iouring: remove IORING_URING_CMD_POLLED
No more users of this flag.
Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Keith Bus
iouring: remove IORING_URING_CMD_POLLED
No more users of this flag.
Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7 |
|
| #
5fea44a6 |
| 16-Oct-2023 |
Breno Leitao <[email protected]> |
io_uring/cmd: Pass compat mode in issue_flags
Create a new flag to track if the operation is running compat mode. This basically check the context->compat and pass it to the issue_flags, so, it coul
io_uring/cmd: Pass compat mode in issue_flags
Create a new flag to track if the operation is running compat mode. This basically check the context->compat and pass it to the issue_flags, so, it could be queried later in the callbacks.
Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4 |
|
| #
93b8cc60 |
| 28-Sep-2023 |
Ming Lei <[email protected]> |
io_uring: cancelable uring_cmd
uring_cmd may never complete, such as ublk, in which uring cmd isn't completed until one new block request is coming from ublk block device.
Add cancelable uring_cmd
io_uring: cancelable uring_cmd
uring_cmd may never complete, such as ublk, in which uring cmd isn't completed until one new block request is coming from ublk block device.
Add cancelable uring_cmd to provide mechanism to driver for cancelling pending commands in its own way.
Add API of io_uring_cmd_mark_cancelable() for driver to mark one command as cancelable, then io_uring will cancel this command in io_uring_cancel_generic(). ->uring_cmd() callback is reused for canceling command in driver's way, then driver gets notified with the cancelling from io_uring.
Add API of io_uring_cmd_get_task() to help driver cancel handler deal with the canceling.
Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Suggested-by: Jens Axboe <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
528ce678 |
| 28-Sep-2023 |
Ming Lei <[email protected]> |
io_uring: retain top 8bits of uring_cmd flags for kernel internal use
Retain top 8bits of uring_cmd flags for kernel internal use, so that we can move IORING_URING_CMD_POLLED out of uapi header.
Re
io_uring: retain top 8bits of uring_cmd flags for kernel internal use
Retain top 8bits of uring_cmd flags for kernel internal use, so that we can move IORING_URING_CMD_POLLED out of uapi header.
Reviewed-by: Gabriel Krisman Bertazi <[email protected]> Reviewed-by: Anuj Gupta <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
8e9fad0e |
| 27-Jun-2023 |
Breno Leitao <[email protected]> |
io_uring: Add io_uring command support for sockets
Enable io_uring commands on network sockets. Create two new SOCKET_URING_OP commands that will operate on sockets.
In order to call ioctl on socke
io_uring: Add io_uring command support for sockets
Enable io_uring commands on network sockets. Create two new SOCKET_URING_OP commands that will operate on sockets.
In order to call ioctl on sockets, use the file_operations->io_uring_cmd callbacks, and map it to a uring socket function, which handles the SOCKET_URING_OP accordingly, and calls socket ioctls.
This patches was tested by creating a new test case in liburing. Link: https://github.com/leitao/liburing/tree/io_uring_cmd
Signed-off-by: Breno Leitao <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3 |
|
| #
5f3139fc |
| 15-May-2023 |
Pavel Begunkov <[email protected]> |
io_uring/cmd: add cmd lazy tw wake helper
We want to use IOU_F_TWQ_LAZY_WAKE in commands. First, introduce a new cmd tw helper accepting TWQ flags, and then add io_uring_cmd_do_in_task_laz() that wi
io_uring/cmd: add cmd lazy tw wake helper
We want to use IOU_F_TWQ_LAZY_WAKE in commands. First, introduce a new cmd tw helper accepting TWQ flags, and then add io_uring_cmd_do_in_task_laz() that will pass IOU_F_TWQ_LAZY_WAKE and imply the "lazy" semantics, i.e. it posts no more than 1 CQE and delaying execution of this tw should not prevent forward progress.
Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/5b9f6716006df7e817f18bd555aee2f8f9c8b0c3.1684154817.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc2 |
|
| #
293007b0 |
| 08-May-2023 |
Jens Axboe <[email protected]> |
io_uring: make io_uring_sqe_cmd() unconditionally available
If CONFIG_IO_URING isn't set, then io_uring_sqe_cmd() is not defined. As the nvme driver uses this helper, it causes a compilation issue:
io_uring: make io_uring_sqe_cmd() unconditionally available
If CONFIG_IO_URING isn't set, then io_uring_sqe_cmd() is not defined. As the nvme driver uses this helper, it causes a compilation issue:
drivers/nvme/host/ioctl.c: In function 'nvme_uring_cmd_io': drivers/nvme/host/ioctl.c:555:44: error: implicit declaration of function 'io_uring_sqe_cmd'; did you mean 'io_uring_free'? [-Werror=implicit-function-declaration] 555 | const struct nvme_uring_cmd *cmd = io_uring_sqe_cmd(ioucmd->sqe); | ^~~~~~~~~~~~~~~~ | io_uring_free
Fix it by just making io_uring_sqe_cmd() generally available - the types are known, and there's no reason to hide it under CONFIG_IO_URING.
Fixes: fd9b8547bc5c ("io_uring: Pass whole sqe to commands") Reported-by: kernel test robot <[email protected]> Reported-by: Chen-Yu Tsai <[email protected]> Tested-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc1 |
|
| #
fd9b8547 |
| 04-May-2023 |
Breno Leitao <[email protected]> |
io_uring: Pass whole sqe to commands
Currently uring CMD operation relies on having large SQEs, but future operations might want to use normal SQE.
The io_uring_cmd currently only saves the payload
io_uring: Pass whole sqe to commands
Currently uring CMD operation relies on having large SQEs, but future operations might want to use normal SQE.
The io_uring_cmd currently only saves the payload (cmd) part of the SQE, but, for commands that use normal SQE size, it might be necessary to access the initial SQE fields outside of the payload/cmd block. So, saves the whole SQE other than just the pdu.
This changes slightly how the io_uring_cmd works, since the cmd structures and callbacks are not opaque to io_uring anymore. I.e, the callbacks can look at the SQE entries, not only, in the cmd structure.
The main advantage is that we don't need to create custom structures for simple commands.
Creates io_uring_sqe_cmd() that returns the cmd private data as a null pointer and avoids casting in the callee side. Also, make most of ublk_drv's sqe->cmd priv structure into const, and use io_uring_sqe_cmd() to get the private structure, removing the unwanted cast. (There is one case where the cast is still needed since the header->{len,addr} is updated in the private structure)
Suggested-by: Pavel Begunkov <[email protected]> Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4 |
|
| #
9d2789ac |
| 21-Mar-2023 |
Jens Axboe <[email protected]> |
block/io_uring: pass in issue_flags for uring_cmd task_work handling
io_uring_cmd_done() currently assumes that the uring_lock is held when invoked, and while it generally is, this is not guaranteed
block/io_uring: pass in issue_flags for uring_cmd task_work handling
io_uring_cmd_done() currently assumes that the uring_lock is held when invoked, and while it generally is, this is not guaranteed. Pass in the issue_flags associated with it, so that we have IO_URING_F_UNLOCKED available to be able to lock the CQ ring appropriately when completing events.
Cc: [email protected] Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1 |
|
| #
e6aeb272 |
| 07-Dec-2022 |
Pavel Begunkov <[email protected]> |
io_uring: complete all requests in task context
This patch adds ctx->task_complete flag. If set, we'll complete all requests in the context of the original task. Note, this extends to completion CQE
io_uring: complete all requests in task context
This patch adds ctx->task_complete flag. If set, we'll complete all requests in the context of the original task. Note, this extends to completion CQE posting only but not io_kiocb cleanup / free, e.g. io-wq may free the requests in the free calllback. This flag will be used later for optimisations purposes.
Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/21ece72953f76bb2e77659a72a14326227ab6460.1670384893.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc8 |
|
| #
7500194a |
| 30-Nov-2022 |
Pavel Begunkov <[email protected]> |
io_uring: reshuffle issue_flags
Reshuffle issue flags to keep normal flags separate from the uring_cmd ctx-setup like flags. Shift the second type to the second byte so it's easier to add new ones i
io_uring: reshuffle issue_flags
Reshuffle issue flags to keep normal flags separate from the uring_cmd ctx-setup like flags. Shift the second type to the second byte so it's easier to add new ones in the future.
Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/d6e4696c883943082d248716f4cd568f37b17a74.1669821213.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc7, v6.1-rc6 |
|
| #
91482864 |
| 17-Nov-2022 |
Pavel Begunkov <[email protected]> |
io_uring: fix multishot accept request leaks
Having REQ_F_POLLED set doesn't guarantee that the request is executed as a multishot from the polling path. Fortunately for us, if the code thinks it's
io_uring: fix multishot accept request leaks
Having REQ_F_POLLED set doesn't guarantee that the request is executed as a multishot from the polling path. Fortunately for us, if the code thinks it's multishot issue when it's not, it can only ask to skip completion so leaking the request. Use issue_flags to mark multipoll issues.
Cc: [email protected] Fixes: 390ed29b5e425 ("io_uring: add IORING_ACCEPT_MULTISHOT for accept") Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/7700ac57653f2823e30b34dc74da68678c0c5f13.1668710222.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1 |
|
| #
0e0abad2 |
| 04-Oct-2022 |
Geert Uytterhoeven <[email protected]> |
io_uring: Add missing inline to io_uring_cmd_import_fixed() dummy
If CONFIG_IO_URING is not set:
include/linux/io_uring.h:65:12: error: ‘io_uring_cmd_import_fixed’ defined but not used [-Werror
io_uring: Add missing inline to io_uring_cmd_import_fixed() dummy
If CONFIG_IO_URING is not set:
include/linux/io_uring.h:65:12: error: ‘io_uring_cmd_import_fixed’ defined but not used [-Werror=unused-function] 65 | static int io_uring_cmd_import_fixed(u64 ubuf, unsigned long len, int rw, | ^~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by adding the missing "inline" keyword.
Fixes: a9216fac3ed8819c ("io_uring: add io_uring_cmd_import_fixed") Signed-off-by: Geert Uytterhoeven <[email protected]> Link: https://lore.kernel.org/r/7404b4a696f64e33e5ef3c5bd3754d4f26d13e50.1664887093.git.geert+renesas@glider.be Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.0 |
|
| #
9cda70f6 |
| 30-Sep-2022 |
Anuj Gupta <[email protected]> |
io_uring: introduce fixed buffer support for io_uring_cmd
Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring command with previously registered buffers. User-space passes the bu
io_uring: introduce fixed buffer support for io_uring_cmd
Add IORING_URING_CMD_FIXED flag that is to be used for sending io_uring command with previously registered buffers. User-space passes the buffer index in sqe->buf_index, same as done in read/write variants that uses fixed buffers.
Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> Link: https://lore.kernel.org/r/[email protected] [axboe: shuffle valid flags check before acting on it] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
a9216fac |
| 30-Sep-2022 |
Anuj Gupta <[email protected]> |
io_uring: add io_uring_cmd_import_fixed
This is a new helper that callers can use to obtain a bvec iterator for the previously mapped buffer. This is preparatory work to enable fixed-buffer support
io_uring: add io_uring_cmd_import_fixed
This is a new helper that callers can use to obtain a bvec iterator for the previously mapped buffer. This is preparatory work to enable fixed-buffer support for io_uring_cmd.
Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3 |
|
| #
5756a3a7 |
| 23-Aug-2022 |
Kanchan Joshi <[email protected]> |
io_uring: add iopoll infrastructure for io_uring_cmd
Put this up in the same way as iopoll is done for regular read/write IO. Make place for storing a cookie into struct io_uring_cmd on submission.
io_uring: add iopoll infrastructure for io_uring_cmd
Put this up in the same way as iopoll is done for regular read/write IO. Make place for storing a cookie into struct io_uring_cmd on submission. Perform the completion using the ->uring_cmd_iopoll handler.
Signed-off-by: Kanchan Joshi <[email protected]> Signed-off-by: Pankaj Raghav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7 |
|
| #
ee692a21 |
| 11-May-2022 |
Jens Axboe <[email protected]> |
fs,io_uring: add infrastructure for uring-cmd
file_operations->uring_cmd is a file private handler. This is somewhat similar to ioctl but hopefully a lot more sane and useful as it can be used to en
fs,io_uring: add infrastructure for uring-cmd
file_operations->uring_cmd is a file private handler. This is somewhat similar to ioctl but hopefully a lot more sane and useful as it can be used to enable many io_uring capabilities for the underlying operation.
IORING_OP_URING_CMD is a file private kind of request. io_uring doesn't know what is in this command type, it's for the provider of ->uring_cmd() to deal with.
Co-developed-by: Kanchan Joshi <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc6, v5.18-rc5 |
|
| #
33337d03 |
| 26-Apr-2022 |
Dylan Yudaken <[email protected]> |
io_uring: add io_uring_get_opcode
In some debug scenarios it is useful to have the text representation of the opcode. Add this function in preparation.
Signed-off-by: Dylan Yudaken <[email protected]>
io_uring: add io_uring_get_opcode
In some debug scenarios it is useful to have the text representation of the opcode. Add this function in preparation.
Signed-off-by: Dylan Yudaken <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7 |
|
| #
e7a6c00d |
| 04-Mar-2022 |
Jens Axboe <[email protected]> |
io_uring: add support for registering ring file descriptors
Lots of workloads use multiple threads, in which case the file table is shared between them. This makes getting and putting the ring file
io_uring: add support for registering ring file descriptors
Lots of workloads use multiple threads, in which case the file table is shared between them. This makes getting and putting the ring file descriptor for each io_uring_enter(2) system call more expensive, as it involves an atomic get and put for each call.
Similarly to how we allow registering normal file descriptors to avoid this overhead, add support for an io_uring_register(2) API that allows to register the ring fds themselves:
1) IORING_REGISTER_RING_FDS - takes an array of io_uring_rsrc_update structs, and registers them with the task. 2) IORING_UNREGISTER_RING_FDS - takes an array of io_uring_src_update structs, and unregisters them.
When a ring fd is registered, it is internally represented by an offset. This offset is returned to the application, and the application then uses this offset and sets IORING_ENTER_REGISTERED_RING for the io_uring_enter(2) system call. This works just like using a registered file descriptor, rather than a real one, in an SQE, where IOSQE_FIXED_FILE gets set to tell io_uring that we're using an internal offset/descriptor rather than a real file descriptor.
In initial testing, this provides a nice bump in performance for threaded applications in real world cases where the batch count (eg number of requests submitted per io_uring_enter(2) invocation) is low. In a microbenchmark, submitting NOP requests, we see the following increases in performance:
Requests per syscall Baseline Registered Increase ---------------------------------------------------------------- 1 ~7030K ~8080K +15% 2 ~13120K ~14800K +13% 4 ~22740K ~25300K +11%
Co-developed-by: Xiaoguang Wang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6 |
|
| #
f552a27a |
| 12-Aug-2021 |
Hao Xu <[email protected]> |
io_uring: remove files pointer in cancellation functions
When doing cancellation, we use a parameter to indicate where it's from do_exit or exec. So a boolean value is good enough for this, remove t
io_uring: remove files pointer in cancellation functions
When doing cancellation, we use a parameter to indicate where it's from do_exit or exec. So a boolean value is good enough for this, remove the struct files* as it is not necessary.
Signed-off-by: Hao Xu <[email protected]> [axboe: fixup io_uring_files_cancel for !CONFIG_IO_URING] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|