History log of /linux-6.15/io_uring/memmap.c (Results 1 – 25 of 27)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7
# f446c631 12-May-2025 Jens Axboe <[email protected]>

io_uring/memmap: don't use page_address() on a highmem page

For older/32-bit systems with highmem, don't assume that the pages in
a mapped region are always going to be mapped. If io_region_init_ptr

io_uring/memmap: don't use page_address() on a highmem page

For older/32-bit systems with highmem, don't assume that the pages in
a mapped region are always going to be mapped. If io_region_init_ptr()
finds that the pages are coalescable, also check if the first page is
a HighMem page or not. If it is, fall through to the usual vmap()
mapping rather than attempt to get the unmapped page address.

Cc: [email protected]
Fixes: c4d0ac1c1567 ("io_uring/memmap: optimise single folio regions")
Link: https://lore.kernel.org/all/[email protected]/
Reported-by: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Reported-by: [email protected]
Tested-by: [email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4
# 92ade52f 21-Feb-2025 Bui Quang Minh <[email protected]>

io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap

Allow user to mmap the kernel allocated zerocopy-rx refill queue.

Signed-off-by: Bui Quang Minh <[email protected]>
Reviewed

io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap

Allow user to mmap the kernel allocated zerocopy-rx refill queue.

Signed-off-by: Bui Quang Minh <[email protected]>
Reviewed-by: Pavel Begunkov <[email protected]>
Reviewed-by: Li Zetao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1
# 7cd7b957 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: unify io_uring mmap'ing code

All mapped memory is now backed by regions and we can unify and clean
up io_region_validate_mmap() and io_uring_mmap(). Extract a function
looking up a

io_uring/memmap: unify io_uring mmap'ing code

All mapped memory is now backed by regions and we can unify and clean
up io_region_validate_mmap() and io_uring_mmap(). Extract a function
looking up a region, the rest of the handling should be generic and just
needs the region.

There is one more ring type specific code, i.e. the mmaping size
truncation quirk for IORING_OFF_[S,C]Q_RING, which is left as is.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/f5e1eda1562bfd34276de07465525ae5f10e1e84.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# ef62de3c 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/kbuf: use region api for pbuf rings

Convert internal parts of the provided buffer ring managment to the
region API. It's the last non-region mapped ring we have, so it also
kills a bunch of

io_uring/kbuf: use region api for pbuf rings

Convert internal parts of the provided buffer ring managment to the
region API. It's the last non-region mapped ring we have, so it also
kills a bunch of now unused memmap.c helpers.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/6c40cf7beaa648558acd4d84bc0fb3279a35d74b.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 90175f3f 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/kbuf: remove pbuf ring refcounting

struct io_buffer_list refcounting was needed for RCU based sync with
mmap, now we can kill it.

Signed-off-by: Pavel Begunkov <[email protected]>
Li

io_uring/kbuf: remove pbuf ring refcounting

struct io_buffer_list refcounting was needed for RCU based sync with
mmap, now we can kill it.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/4a9cc54bf0077bb2bf2f3daf917549ddd41080da.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 81a4058e 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: use region api for CQ

Convert internal parts of the CQ/SQ array managment to the region API.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/46fc3c8

io_uring: use region api for CQ

Convert internal parts of the CQ/SQ array managment to the region API.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/46fc3c801290d6b1ac16023d78f6b8e685c87fd6.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 8078486e 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: use region api for SQ

Convert internal parts of the SQ managment to the region API.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1fb73ced6b835cb3

io_uring: use region api for SQ

Convert internal parts of the SQ managment to the region API.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1fb73ced6b835cb319ab0fe1dc0b2e982a9a5650.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 087f9978 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: implement mmap for regions

The patch implements mmap for the param region and enables the kernel
allocation mode. Internally it uses a fixed mmap offset, however the
user has to use

io_uring/memmap: implement mmap for regions

The patch implements mmap for the param region and enables the kernel
allocation mode. Internally it uses a fixed mmap offset, however the
user has to use the offset returned in
struct io_uring_region_desc::mmap_offset.

Note, mmap doesn't and can't take ->uring_lock and the region / ring
lookup is protected by ->mmap_lock, and it's directly peeking at
ctx->param_region. We can't protect io_create_region() with the
mmap_lock as it'd deadlock, which is why io_create_region_mmap_safe()
initialises it for us in a temporary variable and then publishes it
with the lock taken. It's intentionally decoupled from main region
helpers, and in the future we might want to have a list of active
regions, which then could be protected by the ->mmap_lock.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0f1212bd6af7fb39b63514b34fae8948014221d1.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 1e21df69 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: implement kernel allocated regions

Allow the kernel to allocate memory for a region. That's the classical
way SQ/CQ are allocated. It's not yet useful to user space as there
is no w

io_uring/memmap: implement kernel allocated regions

Allow the kernel to allocate memory for a region. That's the classical
way SQ/CQ are allocated. It's not yet useful to user space as there
is no way to mmap it, which is why it's explicitly disabled in
io_register_mem_region().

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/7b8c40e6542546bbf93f4842a9a42a7373b81e0d.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 4b851d20 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: add IO_REGION_F_SINGLE_REF

Kernel allocated compound pages will have just one reference for the
entire page array, add a flag telling io_free_region about that.

Signed-off-by: Pave

io_uring/memmap: add IO_REGION_F_SINGLE_REF

Kernel allocated compound pages will have just one reference for the
entire page array, add a flag telling io_free_region about that.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a7abfa7535e9728d5fcade29a1ea1605ec2c04ce.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# a90558b3 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: helper for pinning region pages

In preparation to adding kernel allocated regions extract a new helper
that pins user pages.

Signed-off-by: Pavel Begunkov <[email protected]>

io_uring/memmap: helper for pinning region pages

In preparation to adding kernel allocated regions extract a new helper
that pins user pages.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a17d7c39c3de4266b66b75b2dcf768150e1fc618.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# c4d0ac1c 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: optimise single folio regions

We don't need to vmap if memory is already physically contiguous. There
are two important cases it covers: PAGE_SIZE regions and huge pages.
Use io_che

io_uring/memmap: optimise single folio regions

We don't need to vmap if memory is already physically contiguous. There
are two important cases it covers: PAGE_SIZE regions and huge pages.
Use io_check_coalesce_buffer() to get the number of contiguous folios.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/d5240af23064a824c29d14d2406f1ae764bf4505.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 226ae1b4 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: reuse io_free_region for failure path

Regions are going to become more complex with allocation options and
optimisations, I want to split initialisation into steps and for that it
n

io_uring/memmap: reuse io_free_region for failure path

Regions are going to become more complex with allocation options and
optimisations, I want to split initialisation into steps and for that it
needs a sane fail path. Reuse io_free_region(), it's smart enough to
undo only what's needed and leaves the structure in a consistent state.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b853b4ec407cc80d033d021bdd2c14e22378fc78.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# fc5f22a6 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: account memory before pinning

Move memory accounting before page pinning. It shouldn't even try to pin
pages if it's not allowed, and accounting is also relatively
inexpensive. It a

io_uring/memmap: account memory before pinning

Move memory accounting before page pinning. It shouldn't even try to pin
pages if it's not allowed, and accounting is also relatively
inexpensive. It also give a better code structure as we do generic
accounting and then can branch for different mapping types.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1e242b8038411a222e8b269d35e021fa5015289f.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 16375af3 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: flag regions with user pages

In preparation to kernel allocated regions add a flag telling if
the region contains user pinned pages or not.

Signed-off-by: Pavel Begunkov <asml.sile

io_uring/memmap: flag regions with user pages

In preparation to kernel allocated regions add a flag telling if
the region contains user pinned pages or not.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0dc91564642654405bab080b7ec911cb4a43ec6e.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# a730d204 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/memmap: flag vmap'ed regions

Add internal flags for struct io_mapped_region. The first flag we need
is IO_REGION_F_VMAPPED, that indicates that the pointer has to be
unmapped on region dest

io_uring/memmap: flag vmap'ed regions

Add internal flags for struct io_mapped_region. The first flag we need
is IO_REGION_F_VMAPPED, that indicates that the pointer has to be
unmapped on region destruction. For now all regions are vmap'ed, so it's
set unconditionally.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/5a3d8046a038da97c0f8a8c8f1733fa3fc689d31.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 943d0609 29-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: rename ->resize_lock

->resize_lock is used for resizing rings, but it's a good idea to reuse
it in other cases as well. Rename it into mmap_lock as it's protects
from races with mmap.

Sig

io_uring: rename ->resize_lock

->resize_lock is used for resizing rings, but it's a good idea to reuse
it in other cases as well. Rename it into mmap_lock as it's protects
from races with mmap.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/68f705306f3ac4d2fb999eb80ea1615015ce9f7f.1732886067.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 43eef70e 25-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: fix corner case forgetting to vunmap

io_pages_unmap() is a bit tricky in trying to figure whether the pages
were previously vmap'ed or not. In particular If there is juts one page
it beliv

io_uring: fix corner case forgetting to vunmap

io_pages_unmap() is a bit tricky in trying to figure whether the pages
were previously vmap'ed or not. In particular If there is juts one page
it belives there is no need to vunmap. Paired io_pages_map(), however,
could've failed io_mem_alloc_compound() and attempted to
io_mem_alloc_single(), which does vmap, and that leads to unpaired vmap.

The solution is to fail if io_mem_alloc_compound() can't allocate a
single page. That's the easiest way to deal with it, and those two
functions are getting removed soon, so no need to overcomplicate it.

Cc: [email protected]
Fixes: 3ab1db3c6039e ("io_uring: get rid of remap_pfn_range() for mapping rings/sqes")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/477e75a3907a2fe83249e49c0a92cd480b2c60e0.1732569842.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 0c0a4eae 26-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: check for overflows in io_pin_pages

WARNING: CPU: 0 PID: 5834 at io_uring/memmap.c:144 io_pin_pages+0x149/0x180 io_uring/memmap.c:144
CPU: 0 UID: 0 PID: 5834 Comm: syz-executor825 Not tain

io_uring: check for overflows in io_pin_pages

WARNING: CPU: 0 PID: 5834 at io_uring/memmap.c:144 io_pin_pages+0x149/0x180 io_uring/memmap.c:144
CPU: 0 UID: 0 PID: 5834 Comm: syz-executor825 Not tainted 6.12.0-next-20241118-syzkaller #0
Call Trace:
<TASK>
__io_uaddr_map+0xfb/0x2d0 io_uring/memmap.c:183
io_rings_map io_uring/io_uring.c:2611 [inline]
io_allocate_scq_urings+0x1c0/0x650 io_uring/io_uring.c:3470
io_uring_create+0x5b5/0xc00 io_uring/io_uring.c:3692
io_uring_setup io_uring/io_uring.c:3781 [inline]
...
</TASK>

io_pin_pages()'s uaddr parameter came directly from the user and can be
garbage. Don't just add size to it as it can overflow.

Cc: [email protected]
Reported-by: [email protected]
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1b7520ddb168e1d537d64be47414a0629d0d8f8f.1732581026.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 2ae6bdb1 20-Nov-2024 Dan Carpenter <[email protected]>

io_uring/region: return negative -E2BIG in io_create_region()

This code accidentally returns positivie E2BIG instead of negative
-E2BIG. The callers treat negatives and positives the same so this
d

io_uring/region: return negative -E2BIG in io_create_region()

This code accidentally returns positivie E2BIG instead of negative
-E2BIG. The callers treat negatives and positives the same so this
doesn't affect the kernel. The error code is returned to userspace via
the system call.

Fixes: dfbbfbf19187 ("io_uring: introduce concept of memory regions")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.12
# a6529588 17-Nov-2024 Pavel Begunkov <[email protected]>

io_uring/region: fix error codes after failed vmap

io_create_region() jumps after a vmap failure without setting the return
code, it could be 0 or just uninitialised.

Fixes: dfbbfbf191878 ("io_urin

io_uring/region: fix error codes after failed vmap

io_create_region() jumps after a vmap failure without setting the return
code, it could be 0 or just uninitialised.

Fixes: dfbbfbf191878 ("io_uring: introduce concept of memory regions")
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0abac19dbf81c061cffaa9534a2471ed5460ad3e.1731803848.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# dfbbfbf1 15-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: introduce concept of memory regions

We've got a good number of mappings we share with the userspace, that
includes the main rings, provided buffer rings, upcoming rings for
zerocopy rx and

io_uring: introduce concept of memory regions

We've got a good number of mappings we share with the userspace, that
includes the main rings, provided buffer rings, upcoming rings for
zerocopy rx and more. All of them duplicate user argument parsing and
some internal details as well (page pinnning, huge page optimisations,
mmap'ing, etc.)

Introduce a notion of regions. For userspace for now it's just a new
structure called struct io_uring_region_desc which is supposed to
parameterise all such mapping / queue creations. A region either
represents a user provided chunk of memory, in which case the user_addr
field should point to it, or a request for the kernel to allocate the
memory, in which case the user would need to mmap it after using the
offset returned in the mmap_offset field. With a uniform userspace API
we can avoid additional boiler plate code and apply future optimisation
to all of them at once.

Internally, there is a new structure struct io_mapped_region holding all
relevant runtime information and some helpers to work with it. This
patch limits it to user provided regions.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/0e6fe25818dfbaebd1bd90b870a6cac503fe1a24.1731689588.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# 68685fa2 15-Nov-2024 Pavel Begunkov <[email protected]>

io_uring: fortify io_pin_pages with a warning

We're a bit too frivolous with types of nr_pages arguments, converting
it to long and back to int, passing an unsigned int pointer as an int
pointer and

io_uring: fortify io_pin_pages with a warning

We're a bit too frivolous with types of nr_pages arguments, converting
it to long and back to int, passing an unsigned int pointer as an int
pointer and so on. Shouldn't cause any problem but should be carefully
reviewed, but until then let's add a WARN_ON_ONCE check to be more
confident callers don't pass poorely checked arguents.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/d48e0c097cbd90fb47acaddb6c247596510d8cfc.1731689588.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.12-rc7, v6.12-rc6, v6.12-rc5
# 79cfe9e5 21-Oct-2024 Jens Axboe <[email protected]>

io_uring/register: add IORING_REGISTER_RESIZE_RINGS

Once a ring has been created, the size of the CQ and SQ rings are fixed.
Usually this isn't a problem on the SQ ring side, as it merely controls
t

io_uring/register: add IORING_REGISTER_RESIZE_RINGS

Once a ring has been created, the size of the CQ and SQ rings are fixed.
Usually this isn't a problem on the SQ ring side, as it merely controls
the available number of requests that can be submitted in a single
system call, and there's rarely a need to change that.

For the CQ ring, it's a different story. For most efficient use of
io_uring, it's important that the CQ ring never overflows. This means
that applications must size it for the worst case scenario, which can
be wasteful.

Add IORING_REGISTER_RESIZE_RINGS, which allows an application to resize
the existing rings. It takes a struct io_uring_params argument, the same
one which is used to setup the ring initially, and resizes rings
according to the sizes given.

Certain properties are always inherited from the original ring setup,
like SQE128/CQE32 and other setup options. The implementation only
allows flag associated with how the CQ ring is sized and clamped.

Existing unconsumed SQE and CQE entries are copied as part of the
process. If either the SQ or CQ resized destination ring cannot hold the
entries already present in the source rings, then the operation is failed
with -EOVERFLOW. Any register op holds ->uring_lock, which prevents new
submissions, and the internal mapping holds the completion lock as well
across moving CQ ring state.

To prevent races between mmap and ring resizing, add a mutex that's
solely used to serialize ring resize and mmap. mmap_sem can't be used
here, as as fork'ed process may be doing mmaps on the ring as well.
The ctx->resize_lock is held across mmap operations, and the resize
will grab it before swapping out the already mapped new data.

Signed-off-by: Jens Axboe <[email protected]>

show more ...


# d090bffa 24-Oct-2024 Jens Axboe <[email protected]>

io_uring/memmap: explicitly return -EFAULT for mmap on NULL rings

The later mapping will actually check this too, but in terms of code
clarify, explicitly check for whether or not the rings and sqes

io_uring/memmap: explicitly return -EFAULT for mmap on NULL rings

The later mapping will actually check this too, but in terms of code
clarify, explicitly check for whether or not the rings and sqes are
valid during validation. That makes it explicit that if they are
non-NULL, they are valid and can get mapped.

Signed-off-by: Jens Axboe <[email protected]>

show more ...


12