|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
ebf695f1 |
| 27-Mar-2025 |
Ming Lei <[email protected]> |
ublk: add segment parameter
IO splitting is usually bad in the io_uring world: it causes -EAGAIN, and IO handling may have to fall back to io-wq, which hurts performance.
ublk recently started to support zero copy. To avoid unnecessary IO splits, the ublk driver's segment limit should be aligned with the backend device's segment limit.
Another reason is that io_buffer_register_bvec() needs to allocate bvecs, whose number is aligned with the ublk request segment number, so a big memory allocation can be avoided by setting a reasonable max_segments limit.
So add a segment parameter to give the ublk server a chance to align the segment limit with the backend, and to keep it reasonable from the implementation viewpoint.
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
e84025d2 |
| 27-Feb-2025 |
Ming Lei <[email protected]> |
ublk: add DMA alignment limit
The in-tree ublk driver doesn't need DMA alignment limit because there is one data copy between request pages and the userspace buffer.
However, ublk is going to support zero copy, then DMA alignment limit is required, because same IO buffer is forwarded to backend which may have specific buffer DMA alignment limit, so the limit has to be exposed from the frontend driver to client application.
Cc: Keith Busch <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
1f6540e2 |
| 27-Feb-2025 |
Keith Busch <[email protected]> |
ublk: zc register/unregister bvec
Provide new operations for the user to request mapping an active request to an io uring instance's buf_table. The user has to provide the index it wants to install the buffer.
A reference count is taken on the request to ensure it can't be completed while it is active in a ring's buf_table.
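The reference-count rule described above can be sketched as follows. This is a single-threaded illustration only, not the driver's code: the `ex_` names are hypothetical, and a real implementation would use atomic reference counting.

```c
/* Hypothetical model of a request that may be mapped into a ring's buf_table. */
struct ex_req {
    int refs;      /* 1 for the in-flight request, +1 per registered buffer */
    int completed; /* set once the last reference is dropped */
};

static void ex_req_init(struct ex_req *r)
{
    r->refs = 1;
    r->completed = 0;
}

/* Drop one reference; the request completes only when none remain. */
static void ex_req_put(struct ex_req *r)
{
    if (--r->refs == 0)
        r->completed = 1;
}

/* Registering the buffer pins the request. */
static void ex_register_buf(struct ex_req *r)
{
    r->refs++;
}

/* Unregistering releases the pin taken at registration. */
static void ex_unregister_buf(struct ex_req *r)
{
    ex_req_put(r);
}

/* The server commits the result; this drops the initial reference. */
static void ex_commit_result(struct ex_req *r)
{
    ex_req_put(r);
}
```

In this model a request whose result is committed while its buffer is still registered stays incomplete until the buffer is unregistered.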
Signed-off-by: Keith Busch <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3 |
|
| #
59eaa01c |
| 07-Oct-2024 |
Uday Shankar <[email protected]> |
ublk: support device recovery without I/O queueing
ublk currently supports the following behaviors on ublk server exit:
A: outstanding I/Os get errors, subsequently issued I/Os get errors
B: outstanding I/Os get errors, subsequently issued I/Os queue
C: outstanding I/Os get reissued, subsequently issued I/Os queue
and the following behaviors for recovery of preexisting block devices by a future incarnation of the ublk server:
1: ublk devices stopped on ublk server exit (no recovery possible)
2: ublk devices are recoverable using start/end_recovery commands
The userspace interface allows selection of combinations of these behaviors using flags specified at device creation time, namely:
default behavior: A + 1
UBLK_F_USER_RECOVERY: B + 2
UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_REISSUE: C + 2
The behavior A + 2 is currently unsupported. Add support for this behavior under the new flag combination UBLK_F_USER_RECOVERY|UBLK_F_USER_RECOVERY_FAIL_IO.
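The flag-to-behavior table above, including the new A + 2 combination, can be written as a small classifier. This is a sketch under stated assumptions: the `EX_` bit positions are illustrative stand-ins for the definitions in the ublk UAPI header, and treating every other combination as unsupported is an assumption of this example rather than something the commit message states.

```c
#include <stdint.h>

/* Illustrative flag bits; the authoritative values live in the ublk UAPI header. */
#define EX_F_USER_RECOVERY         (1ULL << 3)
#define EX_F_USER_RECOVERY_REISSUE (1ULL << 4)
#define EX_F_USER_RECOVERY_FAIL_IO (1ULL << 9)

enum ex_io_policy { EX_IO_ERROR, EX_IO_QUEUE, EX_IO_REISSUE };

struct ex_exit_behavior {
    enum ex_io_policy outstanding; /* I/Os in flight when the server exits */
    enum ex_io_policy subsequent;  /* I/Os issued after the server exits */
    int recoverable;               /* nonzero: start/end_recovery can be used */
};

/* Returns 0 and fills *out for a listed combination, -1 otherwise. */
static int ex_classify(uint64_t flags, struct ex_exit_behavior *out)
{
    const uint64_t mask = EX_F_USER_RECOVERY | EX_F_USER_RECOVERY_REISSUE |
                          EX_F_USER_RECOVERY_FAIL_IO;
    uint64_t rec = flags & mask;

    if (rec == 0)                                                    /* A + 1 */
        *out = (struct ex_exit_behavior){ EX_IO_ERROR, EX_IO_ERROR, 0 };
    else if (rec == EX_F_USER_RECOVERY)                              /* B + 2 */
        *out = (struct ex_exit_behavior){ EX_IO_ERROR, EX_IO_QUEUE, 1 };
    else if (rec == (EX_F_USER_RECOVERY | EX_F_USER_RECOVERY_REISSUE)) /* C + 2 */
        *out = (struct ex_exit_behavior){ EX_IO_REISSUE, EX_IO_QUEUE, 1 };
    else if (rec == (EX_F_USER_RECOVERY | EX_F_USER_RECOVERY_FAIL_IO)) /* A + 2 */
        *out = (struct ex_exit_behavior){ EX_IO_ERROR, EX_IO_ERROR, 1 };
    else
        return -1; /* assumption: anything else is rejected */
    return 0;
}
```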
Signed-off-by: Uday Shankar <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
42aafd8b |
| 16-Oct-2024 |
Ming Lei <[email protected]> |
ublk: don't allow user copy for unprivileged device
UBLK_F_USER_COPY requires userspace to call write() on ublk char device for filling request buffer, and unprivileged device can't be trusted.
So don't allow user copy for unprivileged device.
Cc: [email protected] Fixes: 1172d5b8beca ("ublk: support user copy") Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6 |
|
| #
13fe8e68 |
| 23-Feb-2024 |
Ming Lei <[email protected]> |
ublk: add UBLK_CMD_DEL_DEV_ASYNC
The current command UBLK_CMD_DEL_DEV won't return until the device is released. This looks more reliable, but makes userspace more difficult to implement, especially regarding the order of: unmap command buffer (which holds one ublkc reference), ublkc close, io_uring_file_unregister, ublkb close.
Add UBLK_CMD_DEL_DEV_ASYNC so that device deletion won't wait for release; then userspace needn't worry about the above order. Actually both loop and nbd are deleted in this async way.
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6 |
|
| #
851e0629 |
| 10-Aug-2023 |
Ming Lei <[email protected]> |
ublk: zoned: support REQ_OP_ZONE_RESET_ALL
There isn't any reason not to support REQ_OP_ZONE_RESET_ALL given that everything is actually handled in userspace, not to mention that it is pretty easy to support RESET_ALL.
So enable REQ_OP_ZONE_RESET_ALL and let userspace handle it.
Verified by 'tools/zbc_reset_zone -all /dev/ublkb0' in libzbc[1] with libublk-rs based ublk-zoned target prototype[2], follows command line for creating ublk-zoned:
cargo run --example zoned -- add -1 1024 # add $dev_id $DEV_SIZE
[1] https://github.com/westerndigitalcorporation/libzbc
[2] https://github.com/ming1/libublk-rs/tree/zoned.v2
Cc: Niklas Cassel <[email protected]> Cc: Damien Le Moal <[email protected]> Cc: Andreas Hindborg <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.5-rc5 |
|
| #
29802d7c |
| 04-Aug-2023 |
Andreas Hindborg <[email protected]> |
ublk: enable zoned storage support
Add zoned storage support to ublk: report_zones and operations:
- REQ_OP_ZONE_OPEN
- REQ_OP_ZONE_CLOSE
- REQ_OP_ZONE_FINISH
- REQ_OP_ZONE_RESET
- REQ_OP_ZONE_APPEND
The zone append feature uses the `addr` field of `struct ublksrv_io_cmd` to communicate ALBA back to the kernel. Therefore ublk must be used with the user copy feature (UBLK_F_USER_COPY) for zoned storage support to be available. Without this feature, ublk will not allow zoned storage support.
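The dependency stated above (zoned support requires user copy) amounts to a flag check at device creation time. The sketch below is illustrative only: the `EX_` bit positions and the function name are assumptions. The second check reflects the separate "don't allow user copy for unprivileged device" change listed in this history.

```c
#include <stdint.h>

/* Illustrative flag bits; the authoritative values live in the ublk UAPI header. */
#define EX_F_UNPRIVILEGED_DEV (1ULL << 5)
#define EX_F_USER_COPY        (1ULL << 7)
#define EX_F_ZONED            (1ULL << 8)

/* Returns 0 if the flag combination is acceptable, -1 otherwise. */
static int ex_check_flags(uint64_t flags)
{
    /* Zone append reports the written address via `addr`, so user copy is required. */
    if ((flags & EX_F_ZONED) && !(flags & EX_F_USER_COPY))
        return -1;
    /* User copy is not allowed for unprivileged devices. */
    if ((flags & EX_F_USER_COPY) && (flags & EX_F_UNPRIVILEGED_DEV))
        return -1;
    return 0;
}
```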
Signed-off-by: Andreas Hindborg <[email protected]> Reviewed-by: Ming Lei <[email protected]> Tested-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5 |
|
| #
b5bbc52f |
| 03-Jun-2023 |
Ming Lei <[email protected]> |
ublk: add control command of UBLK_U_CMD_GET_FEATURES
Add control command of UBLK_U_CMD_GET_FEATURES for returning driver's feature set or capability.
This simplifies maintaining compatibility in userspace, because userspace no longer needs to send a command to one device to query the driver's feature set. For example, with the queried feature set, userspace can choose to use:
- UBLK_CMD_GET_DEV_INFO2 or UBLK_CMD_GET_DEV_INFO,
- UBLK_U_CMD_* or UBLK_CMD_*
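A userspace server could turn the queried feature mask into a command choice along these lines. This is a hedged sketch: the `EX_` bit positions are illustrative, and which feature bit drives which choice is an assumption of this example, not something the commit message specifies.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits; the authoritative values live in the ublk UAPI header. */
#define EX_F_UNPRIVILEGED_DEV (1ULL << 5)
#define EX_F_CMD_IOCTL_ENCODE (1ULL << 6)

struct ex_cmd_choice {
    bool use_ioctl_encoded_cmds; /* UBLK_U_CMD_* instead of UBLK_CMD_* */
    bool use_get_dev_info2;      /* UBLK_CMD_GET_DEV_INFO2 instead of UBLK_CMD_GET_DEV_INFO */
};

/* Derive the command set from the mask returned by the feature query. */
static struct ex_cmd_choice ex_choose_cmds(uint64_t features)
{
    struct ex_cmd_choice c;

    c.use_ioctl_encoded_cmds = (features & EX_F_CMD_IOCTL_ENCODE) != 0;
    c.use_get_dev_info2 = (features & EX_F_UNPRIVILEGED_DEV) != 0;
    return c;
}
```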
Userspace code: https://github.com/ming1/ubdsrv/commits/features-cmd
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.4-rc4, v6.4-rc3 |
|
| #
1172d5b8 |
| 19-May-2023 |
Ming Lei <[email protected]> |
ublk: support user copy
Currently copy between io request buffer(pages) and userspace buffer is done inside ublk_map_io() or ublk_unmap_io(). This way performs very well in case of pre-allocated userspace io buffer.
For dynamically allocated or external userspace backend io buffer, UBLK_F_NEED_GET_DATA is added for ublk server to provide buffer by one extra command communication for WRITE request. For READ, userspace simply provides buffer, but can't know when the buffer is done[1].
Add UBLK_F_USER_COPY by moving io data copy out of kernel by providing read()/write() on /dev/ublkcN, and simply let ublk server do the io data copy. This way makes both side cleaner, the cost is that one extra syscall for copy io data between request and backend buffer.
With UBLK_F_USER_COPY, it actually becomes possible to run per-io zero copy now, such as, only do zero copy for big size IO, so it can be thought as one prep patch for supporting zero copy. Meantime zero copy still needs to expose read()/write() buffer for some corner case, such as passthrough IO.
[1] READ buffer in UBLK_F_NEED_GET_DATA https://lore.kernel.org/linux-block/[email protected]/T/#m23bd4b8634c0a054e6797063167b469949a247bb
ublksrv loop usercopy code:
https://github.com/ming1/ubdsrv/commits/usercopy
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
62fe99ce |
| 19-May-2023 |
Ming Lei <[email protected]> |
ublk: add read()/write() support for ublk char device
Support pread()/pwrite() on ublk char device for reading/writing request io buffer, so data copy between io request buffer and userspace buffer can be moved to ublk server from ublk driver. Then UBLK_F_NEED_GET_DATA becomes not necessary, so ublk server can allocate buffer without one extra round uring command communication for userspace to provide buffer.
IO buffer can be located by iocb->ki_pos which encodes buffer offset, io tag and queue id info, and type of iocb->ki_pos is u64, so it is big enough for holding reasonable queue depth, nr_queues and max io buffer size.
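Packing the three fields into a 64-bit file position can be sketched as below. The bit widths and `ex_` names are hypothetical and chosen only to show that a u64 leaves ample room; the driver's real layout may differ (for example by adding a base offset).

```c
#include <stdint.h>

/* Hypothetical field widths, for illustration only. */
#define EX_BUF_BITS 25u /* byte offset inside one request buffer */
#define EX_TAG_BITS 16u /* io tag within a queue */
#define EX_QID_BITS 12u /* queue id */

#define EX_BUF_MASK ((1ULL << EX_BUF_BITS) - 1)
#define EX_TAG_MASK ((1ULL << EX_TAG_BITS) - 1)
#define EX_QID_MASK ((1ULL << EX_QID_BITS) - 1)

/* Build the position passed as the offset argument of pread()/pwrite(). */
static uint64_t ex_pos_encode(uint32_t q_id, uint32_t tag, uint32_t buf_off)
{
    return ((q_id & EX_QID_MASK) << (EX_TAG_BITS + EX_BUF_BITS)) |
           ((tag & EX_TAG_MASK) << EX_BUF_BITS) |
           (buf_off & EX_BUF_MASK);
}

/* Decoders, as the receiving side would apply them. */
static uint32_t ex_pos_qid(uint64_t pos)
{
    return (uint32_t)((pos >> (EX_TAG_BITS + EX_BUF_BITS)) & EX_QID_MASK);
}

static uint32_t ex_pos_tag(uint64_t pos)
{
    return (uint32_t)((pos >> EX_BUF_BITS) & EX_TAG_MASK);
}

static uint32_t ex_pos_off(uint64_t pos)
{
    return (uint32_t)(pos & EX_BUF_MASK);
}
```

With these widths only 53 of the 64 bits are used, which matches the claim that the u64 type is large enough for reasonable queue counts, depths and buffer sizes.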
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.4-rc2, v6.4-rc1, v6.3 |
|
| #
2d786e66 |
| 18-Apr-2023 |
Ming Lei <[email protected]> |
block: ublk: switch to ioctl command encoding
All ublk commands (control, IO) should have taken ioctl command encoding from the beginning, because: 1) ioctl command encoding defines each code uniquely, so the driver can easily figure out a wrong command sent from userspace; 2) it might help the security subsystem audit uring cmd[1].
Unfortunately we didn't do that way, and it could be one lesson for ublk driver.
So switch to ioctl command encoding now, we still support commands encoded in old way, but they become legacy definition. Any new command should take ioctl encoding.
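To show why an ioctl-encoded code is easier to validate than a bare opcode, here is a self-contained sketch of the common (asm-generic) ioctl layout: 8 bits of command number, 8 bits of type, 14 bits of size, 2 bits of direction. The macros are re-declared locally with an `EX_` prefix instead of including kernel headers, and the type character 'u' and the sample opcode are assumptions of this example.

```c
#include <stdint.h>

#define EX_IOC_NRSHIFT   0u
#define EX_IOC_TYPESHIFT 8u
#define EX_IOC_SIZESHIFT 16u
#define EX_IOC_DIRSHIFT  30u
#define EX_IOC_WRITE     1u
#define EX_IOC_READ      2u

/* Compose a command code from direction, type, number and payload size. */
static uint32_t ex_ioc(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    return ((dir & 0x3u) << EX_IOC_DIRSHIFT) |
           ((size & 0x3fffu) << EX_IOC_SIZESHIFT) |
           ((type & 0xffu) << EX_IOC_TYPESHIFT) |
           ((nr & 0xffu) << EX_IOC_NRSHIFT);
}

static uint32_t ex_ioc_dir(uint32_t cmd)  { return cmd >> EX_IOC_DIRSHIFT; }
static uint32_t ex_ioc_size(uint32_t cmd) { return (cmd >> EX_IOC_SIZESHIFT) & 0x3fffu; }
static uint32_t ex_ioc_type(uint32_t cmd) { return (cmd >> EX_IOC_TYPESHIFT) & 0xffu; }
static uint32_t ex_ioc_nr(uint32_t cmd)   { return cmd & 0xffu; }

/* A bare legacy opcode has type 0, so it is distinguishable from an encoded one. */
static int ex_is_ioctl_encoded(uint32_t cmd)
{
    return ex_ioc_type(cmd) == 'u';
}
```

A receiver can therefore reject a code whose type, direction or size does not match what it expects, while still recognizing small legacy opcodes by their zero type field.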
See ublksrv code for switching to ioctl command encoding in [2].
[1] https://lore.kernel.org/io-uring/CAHC9VhSVzujW9LOj5Km80AjU0EfAuukoLrxO6BEfnXeK_s6bAg@mail.gmail.com/
[2] https://github.com/ming1/ubdsrv/commits/ioctl_cmd_encoding
Cc: Christoph Hellwig <[email protected]> Cc: Ken Kurematsu <[email protected]> Signed-off-by: Ming Lei <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3 |
|
| #
4093cb5a |
| 06-Jan-2023 |
Ming Lei <[email protected]> |
ublk_drv: add mechanism for supporting unprivileged ublk device
unprivileged ublk device is helpful for container use case, such as: ublk device created in one unprivileged container can be controlled and accessed by this container only.
Implement this feature by adding the flag UBLK_F_UNPRIVILEGED_DEV. If this flag isn't set, any control command has to be run by a privileged user. Otherwise, a control command can be sent by any unprivileged user, but the user has to be permitted to access the ublk char device to be controlled.
In case of UBLK_F_UNPRIVILEGED_DEV:
1) for command UBLK_CMD_ADD_DEV, it is always allowed, and user needs to provide owner's uid/gid in this command, so that udev can set correct ownership for the created ublk device, since the device owner uid/gid can be queried via command of UBLK_CMD_GET_DEV_INFO.
2) for other control commands, they can only be run successfully if the current user is allowed to access the specified ublk char device, for running the permission check, path of the ublk char device has to be provided by these commands.
Also add one control of command UBLK_CMD_GET_DEV_INFO2 which always include the char dev path in payload since userspace may not have knowledge if this device is created in unprivileged mode.
For applying this mechanism, system administrator needs to take the following policies:
1) chmod 0666 /dev/ublk-control
2) change ownership of ublkcN & ublkbN
- chown owner_uid:owner_gid /dev/ublkcN
- chown owner_uid:owner_gid /dev/ublkbN
Both can be done via one simple udev rule.
Userspace:
https://github.com/ming1/ubdsrv/tree/unprivileged-ublk
'ublk add -t $TYPE --un_privileged=1' is for creating one un-privileged ublk device if the user is un-privileged.
Link: https://lore.kernel.org/linux-block/[email protected]/ Suggested-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
abb864d3 |
| 06-Jan-2023 |
Ming Lei <[email protected]> |
ublk_drv: add device parameter UBLK_PARAM_TYPE_DEVT
Userspace side only knows device ID, but the associated path of ublkc* and ublkb* could be changed by udev, and that depends on userspace's policy, so add parameter of UBLK_PARAM_TYPE_DEVT for retrieving major/minor of the ublkc* and ublkb*, then user may figure out major/minor of the ublk disks he/she owns. With major/minor, it is easy to find the device node path.
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7 |
|
| #
c732a852 |
| 23-Sep-2022 |
ZiyangZhang <[email protected]> |
ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support
START_USER_RECOVERY and END_USER_RECOVERY are two new control commands to support user recovery feature.
After a crash, the user should send START_USER_RECOVERY, which will:
(1) check that (a) the current ublk_device is UBLK_S_DEV_QUIESCED, which was set by quiesce_work, and (b) the chardev is released
(2) reinit all ubqs, including: (a) put the task_struct and reset ->ubq_daemon to NULL; (b) reset all ublk_io
(3) reset ub->mm to NULL
Then, the user should start a new process and send FETCH_REQ on each ubq_daemon.
Finally, the user should send END_USER_RECOVERY, which will:
(1) wait for all new ubq_daemons to get ready
(2) update ublksrv_pid
(3) unquiesce the request queue and expect incoming ublk_queue_rq()
(4) convert ub's state to UBLK_S_DEV_LIVE
Note: we can handle STOP_DEV between START_USER_RECOVERY and END_USER_RECOVERY. This is helpful to users who cannot start new process after sending START_USER_RECOVERY ctrl-cmd.
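The recovery sequence can be modeled as a small state machine. This is a simplified, single-threaded sketch with hypothetical `ex_` names. In particular, END_USER_RECOVERY is described as waiting for the new daemons, whereas this model simply returns an error if they are not ready yet.

```c
enum ex_dev_state { EX_DEV_DEAD, EX_DEV_LIVE, EX_DEV_QUIESCED };

struct ex_dev {
    enum ex_dev_state state;
    int chardev_released; /* old server has closed the char device */
    int recovering;       /* between START and END_USER_RECOVERY */
    int nr_queues;
    int queues_ready;     /* queues whose new daemon has sent FETCH_REQ */
    int server_pid;
};

/* START_USER_RECOVERY: only valid on a quiesced device with a released chardev. */
static int ex_start_user_recovery(struct ex_dev *d)
{
    if (d->state != EX_DEV_QUIESCED || !d->chardev_released)
        return -1;
    d->queues_ready = 0; /* stands in for "reinit all ubqs" */
    d->recovering = 1;
    return 0;
}

/* Called once per queue when the new daemon has fetched on it. */
static void ex_queue_daemon_ready(struct ex_dev *d)
{
    if (d->recovering && d->queues_ready < d->nr_queues)
        d->queues_ready++;
}

/* END_USER_RECOVERY: record the new server and go live again. */
static int ex_end_user_recovery(struct ex_dev *d, int new_pid)
{
    if (!d->recovering || d->queues_ready < d->nr_queues)
        return -1; /* the real command waits instead of failing */
    d->server_pid = new_pid;
    d->recovering = 0;
    d->state = EX_DEV_LIVE;
    return 0;
}

/* STOP_DEV is still accepted while recovery is in progress. */
static void ex_stop_dev(struct ex_dev *d)
{
    d->recovering = 0;
    d->state = EX_DEV_DEAD;
}
```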
Signed-off-by: ZiyangZhang <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
a0d41dc1 |
| 23-Sep-2022 |
ZiyangZhang <[email protected]> |
ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE
UBLK_F_USER_RECOVERY_REISSUE implies that: With a dying ubq_daemon, ublk_drv let monitor_work requeues rq issued to userspace(ublksrv) before the ubq_daemon is dying.
UBLK_F_USER_RECOVERY_REISSUE is designed for backends which: (1) tolerate double-write since ublk_drv may issue the same rq twice. (2) does not let frontend users get I/O error, such as read-only FS and VM backend.
Signed-off-by: ZiyangZhang <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
77a440e2 |
| 23-Sep-2022 |
ZiyangZhang <[email protected]> |
ublk_drv: define macros for recovery feature and check them
Define some macros for recovery feature.
UBLK_S_DEV_QUIESCED implies that ublk_device is quiesced and is ready for recovery. This state can be observed by userspace.
UBLK_F_USER_RECOVERY implies that:
(1) ublk_drv enables recovery feature. It won't let monitor_work to automatically abort rqs and release the device.
(2) With a dying ubq_daemon, ublk_drv ends(aborts) rqs issued to userspace(ublksrv) before crash.
(3) With a dying ubq_daemon, in task work and ublk_queue_rq(), ublk_drv requeues rqs.
Signed-off-by: ZiyangZhang <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19 |
|
| #
4e18403d |
| 28-Jul-2022 |
ZiyangZhang <[email protected]> |
ublk_cmd.h: add one new ublk command: UBLK_IO_NEED_GET_DATA
Add one new ublk command: UBLK_IO_NEED_GET_DATA. It is prepared for a new feature designed for a user application who wants to allocate IO buffer and set IO buffer address only after it receives an IO request from ublksrv.
Reviewed-by: Ming Lei <[email protected]> Signed-off-by: ZiyangZhang <[email protected]> Link: https://lore.kernel.org/r/c8a64b6b51c78340da7daa9e1054608695e79619.1659011443.git.ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <[email protected]>
|
| #
4bf9cbf3 |
| 30-Jul-2022 |
Ming Lei <[email protected]> |
ublk_drv: cleanup ublksrv_ctrl_dev_info
Remove all block device related info from ublksrv_ctrl_dev_info, meantime reduce its size into 64 bytes because:
1) ublksrv_ctrl_dev_info becomes cleaner without including any block related info
2) generic set/get parameter command can be used to set block related setting easily and cleanly
3) generic set/get parameter command can be used for extending ublk without needing more info in ublksrv_ctrl_dev_info
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
0aa73170 |
| 30-Jul-2022 |
Ming Lei <[email protected]> |
ublk_drv: add SET_PARAMS/GET_PARAMS control command
Add two commands to set/get parameters generically.
One important goal of ublk is to provide generic framework for making block device by userspace flexibly.
As one generic block device, there are still lots of block parameters, such as max_sectors, write_cache/fua, discard related limits, zoned parameters, ...., so this patch starts to add generic mechanism for set/get device parameters.
Both generic block parameters(all kinds of queue settings) and ublk feature parameters can be covered with this way, then it becomes quite easy to extend in future.
Two parameter types are used so far: basic (covers basic queue settings and misc settings which can't be grouped easily) and discard. The basic type must be set, and the discard type is optional for now.
This way provides mechanism to simulate any kind of generic block device from userspace easily, from both block queue setting viewpoint or ublk feature viewpoint.
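A typed parameter container of this kind can be sketched as a bitmask plus one sub-structure per type. Everything below is a hypothetical simplification with `ex_` names: the field sets are invented for illustration, and rejecting unknown type bits is an assumption. Only the "basic is mandatory, discard is optional" rule comes from the text above.

```c
#include <stdint.h>

#define EX_PARAM_TYPE_BASIC   (1u << 0)
#define EX_PARAM_TYPE_DISCARD (1u << 1)

struct ex_param_basic {
    uint32_t max_sectors;
    uint64_t dev_sectors;
};

struct ex_param_discard {
    uint32_t max_discard_sectors;
    uint32_t discard_granularity;
};

struct ex_params {
    uint32_t types; /* which of the sub-structures below are valid */
    struct ex_param_basic basic;
    struct ex_param_discard discard;
};

/* Returns 0 if the parameter set is acceptable, -1 otherwise. */
static int ex_validate_params(const struct ex_params *p)
{
    const uint32_t known = EX_PARAM_TYPE_BASIC | EX_PARAM_TYPE_DISCARD;

    if (!(p->types & EX_PARAM_TYPE_BASIC))
        return -1; /* the basic type must always be set */
    if (p->types & ~known)
        return -1; /* assumption: unknown types are rejected */
    return 0;
}
```

Adding a new parameter group then means adding one bit and one sub-structure, without touching the device info structure.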
The style of putting all parameters together is suggested by Christoph.
Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v5.19-rc8 |
|
| #
6d8c5afc |
| 22-Jul-2022 |
Ming Lei <[email protected]> |
ublk_drv: make sure that correct flags(features) returned to userspace
Userspace may support more features or newly added flags, but the driver side can be old, so make sure the correct flags (features) are returned to userspace; then userspace can work as expected.
Also mark the 2nd flags field as reserved and just use the 1st one. When we run out of flags, the reserved one can be handled at that time.
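The negotiation amounts to masking the requested flags with what the running driver knows about. A minimal sketch with hypothetical helper names:

```c
#include <stdint.h>

/* Flags the driver reports back: only those it actually supports. */
static uint64_t ex_granted_flags(uint64_t requested, uint64_t supported)
{
    return requested & supported;
}

/* Flags userspace asked for but did not get; it must work without them. */
static uint64_t ex_missing_flags(uint64_t requested, uint64_t granted)
{
    return requested & ~granted;
}
```

A newer server can compare the two masks after device creation and fall back to an older code path for every missing bit.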
Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: ZiyangZhang <[email protected]> Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
5f8bcc83 |
| 21-Jul-2022 |
Christoph Hellwig <[email protected]> |
ublk: remove UBLK_IO_F_PREFLUSH
REQ_PREFLUSH is turned into REQ_OP_FLUSH by the flush state machine and thus never seen by a blk-mq based driver.
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
d276a223 |
| 18-Jul-2022 |
Christoph Hellwig <[email protected]> |
ublk: remove UBLK_IO_F_INTEGRITY
The ublk protocol has no mechanism to actually transfer the integrity metadata, so don't define this flag, which requires that an integrity payload is attached to a bio.
Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
|
Revision tags: v5.19-rc7 |
|
| #
0edb3696 |
| 13-Jul-2022 |
Ming Lei <[email protected]> |
ublk_drv: support to complete io command via task_work_add
Use task_work_add if it is available, since task_work_add can bring up better performance, especially batching signaling ->ubq_daemon can be done.
It is observed that task_work_add() can boost iops by +4% on random 4k io test. Also except for completing io command, all other code paths are same with completing io command via io_uring_cmd_complete_in_task.
Meantime add one flag of UBLK_F_URING_CMD_COMP_IN_TASK for comparing the mode easily.
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|
| #
71f28f31 |
| 13-Jul-2022 |
Ming Lei <[email protected]> |
ublk_drv: add io_uring based userspace block driver
This is the driver part of userspace block driver(ublk driver), the other part is userspace daemon part(ublksrv)[1].
The two parts communicate by io_uring's IORING_OP_URING_CMD with one shared cmd buffer for storing io command, and the buffer is read only for ublksrv, each io command is indexed by io request tag directly, and is written by ublk driver.
For example, when one READ io request is submitted to ublk block driver, ublk driver stores the io command into cmd buffer first, then completes one IORING_OP_URING_CMD for notifying ublksrv, and the URING_CMD is issued to ublk driver beforehand by ublksrv for getting notification of any new io request, and each URING_CMD is associated with one io request by tag.
After ublksrv gets the io command, it translates and handles the ublk io request, such as, for the ublk-loop target, ublksrv translates the request into same request on another file or disk, like the kernel loop block driver. In ublksrv's implementation, the io is still handled by io_uring, and share same ring with IORING_OP_URING_CMD command. When the target io request is done, the same IORING_OP_URING_CMD is issued to ublk driver for both committing io request result and getting future notification of new io request.
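The fetch, notify, commit-and-fetch cycle over a tag-indexed command buffer can be modeled as below. This is a toy, single-threaded sketch: the structure fields, the `ex_` names and the queue depth are hypothetical, and completing a uring command is represented by a plain state change.

```c
#include <stdint.h>

#define EX_QUEUE_DEPTH 4

/* Hypothetical io command descriptor, one slot per tag. */
struct ex_io_desc {
    uint32_t op;
    uint32_t nr_sectors;
    uint64_t start_sector;
};

enum ex_slot_state {
    EX_SLOT_IDLE,      /* no uring command issued for this tag yet */
    EX_SLOT_ARMED,     /* driver holds a uring command, waiting for a request */
    EX_SLOT_AT_SERVER  /* request handed to the server, result pending */
};

struct ex_queue {
    struct ex_io_desc descs[EX_QUEUE_DEPTH]; /* written by driver, read by server */
    enum ex_slot_state state[EX_QUEUE_DEPTH];
    int result[EX_QUEUE_DEPTH];
};

/* Server: issue the initial command so the driver can notify it later. */
static int ex_server_fetch(struct ex_queue *q, unsigned tag)
{
    if (tag >= EX_QUEUE_DEPTH || q->state[tag] != EX_SLOT_IDLE)
        return -1;
    q->state[tag] = EX_SLOT_ARMED;
    return 0;
}

/* Driver: store the io command at the tag's slot, then "complete" the command. */
static int ex_driver_queue_rq(struct ex_queue *q, unsigned tag, struct ex_io_desc d)
{
    if (tag >= EX_QUEUE_DEPTH || q->state[tag] != EX_SLOT_ARMED)
        return -1;
    q->descs[tag] = d;
    q->state[tag] = EX_SLOT_AT_SERVER;
    return 0;
}

/* Server: one command both commits the result and re-arms the slot. */
static int ex_server_commit_and_fetch(struct ex_queue *q, unsigned tag, int res)
{
    if (tag >= EX_QUEUE_DEPTH || q->state[tag] != EX_SLOT_AT_SERVER)
        return -1;
    q->result[tag] = res;
    q->state[tag] = EX_SLOT_ARMED;
    return 0;
}
```

Because the tag indexes the descriptor array directly, neither side needs a lookup table to match a completed command with its request.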
Another thing done by ublk driver is to copy data between kernel io request and ublksrv's io buffer:
1) before ublksrv handles WRITE request, copy the request's data into ublksrv's userspace io buffer, so that ublksrv can handle the write request
2) after ublksrv handles READ request, copy ublksrv's userspace io buffer into this READ request, then ublk driver can complete the READ request
Zero copy may be switched if mm is ready to support it.
ublk driver doesn't handle any logic of the specific user space driver, so it is small/simple enough.
[1] ublksrv
https://github.com/ming1/ubdsrv
Signed-off-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
|