History log of /linux-6.15/block/bdev.c (Results 1 – 25 of 94)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4
# 5f33b522 23-Apr-2025 Christoph Hellwig <[email protected]>

block: don't autoload drivers on stat

blkdev_get_no_open can trigger the legacy autoload of block drivers. A
simple stat of a block device has not historically done that, so disable
this behavior a

block: don't autoload drivers on stat

blkdev_get_no_open can trigger the legacy autoload of block drivers. A
simple stat of a block device has not historically done that, so disable
this behavior again.

Fixes: 9abcfbd235f5 ("block: Add atomic write support for statx")
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# d13b7090 23-Apr-2025 Christoph Hellwig <[email protected]>

block: remove the backing_inode variable in bdev_statx

backing_inode is only used once, so remove it and update the comment
describing the bdev lookup to be a bit more clear.

Signed-off-by: Christo

block: remove the backing_inode variable in bdev_statx

backing_inode is only used once, so remove it and update the comment
describing the bdev lookup to be a bit more clear.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# e03463d2 23-Apr-2025 Darrick J. Wong <[email protected]>

block: hoist block size validation code to a separate function

Hoist the block size validation code to bdev_validate_blocksize so that
we can call it from filesystems that don't care about the bdev

block: hoist block size validation code to a separate function

Hoist the block size validation code to bdev_validate_blocksize so that
we can call it from filesystems that don't care about the bdev pagecache
manipulations of set_blocksize.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/174543795720.4139148.840349813093799165.stgit@frogsfrogsfrogs
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# c0e473a0 23-Apr-2025 Darrick J. Wong <[email protected]>

block: fix race between set_blocksize and read paths

With the new large sector size support, it's now the case that
set_blocksize can change i_blksize and the folio order in a manner that
conflicts

block: fix race between set_blocksize and read paths

With the new large sector size support, it's now the case that
set_blocksize can change i_blksize and the folio order in a manner that
conflicts with a concurrent reader and causes a kernel crash.

Specifically, let's say that udev-worker calls libblkid to detect the
labels on a block device. The read call can create an order-0 folio to
read the first 4096 bytes from the disk. But then udev is preempted.

Next, someone tries to mount an 8k-sectorsize filesystem from the same
block device. The filesystem calls set_blksize, which sets i_blksize to
8192 and the minimum folio order to 1.

Now udev resumes, still holding the order-0 folio it allocated. It then
tries to schedule a read bio and do_mpage_readahead tries to create
bufferheads for the folio. Unfortunately, blocks_per_folio == 0 because
the page size is 4096 but the blocksize is 8192 so no bufferheads are
attached and the bh walk never sets bdev. We then submit the bio with a
NULL block device and crash.

Therefore, truncate the page cache after flushing but before updating
i_blksize. However, that's not enough -- we also need to lock out file
IO and page faults during the update. Take both the i_rwsem and the
invalidate_lock in exclusive mode for invalidations, and in shared mode
for read/write operations.

I don't know if this is the correct fix, but xfs/259 found it.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Tested-by: Shin'ichiro Kawasaki <[email protected]>
Link: https://lore.kernel.org/r/174543795699.4139148.2086129139322431423.stgit@frogsfrogsfrogs
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.15-rc3
# 777d0961 17-Apr-2025 Christoph Hellwig <[email protected]>

fs: move the bdex_statx call to vfs_getattr_nosec

Currently bdex_statx is only called from the very high-level
vfs_statx_path function, and thus bypassing it for in-kernel calls
to vfs_getattr or vf

fs: move the bdex_statx call to vfs_getattr_nosec

Currently bdex_statx is only called from the very high-level
vfs_statx_path function, and thus bypassing it for in-kernel calls
to vfs_getattr or vfs_getattr_nosec.

This breaks querying the block ѕize of the underlying device in the
loop driver and also is a pitfall for any other new kernel caller.

Move the call into the lowest level helper to ensure all callers get
the right results.

Fixes: 2d985f8c6b91 ("vfs: support STATX_DIOALIGN on block devices")
Fixes: f4774e92aab8 ("loop: take the file system minimum dio alignment into account")
Reported-by: "Darrick J. Wong" <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6
# a64e5a59 07-Mar-2025 Luis Chamberlain <[email protected]>

bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()

The commit titled "block/bdev: lift block size restrictions to 64k"
lifted the block layer's max supported block size to 64k ins

bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()

The commit titled "block/bdev: lift block size restrictions to 64k"
lifted the block layer's max supported block size to 64k inside the
helper blk_validate_block_size() now that we support large folios.
However in lifting the block size we also removed the silly use
cases many filesystems have to use sb_set_blocksize() to *verify*
that the block size <= PAGE_SIZE. The call to sb_set_blocksize() was
used to check the block size <= PAGE_SIZE since historically we've
always supported userspace to create for example 64k block size
filesystems even on 4k page size systems, but what we didn't allow
was mounting them. Older filesystems have been using the check with
sb_set_blocksize() for years.

While, we could argue that such checks should be filesystem specific,
there are much more users of sb_set_blocksize() than LBS enabled
filesystem on upstream, so just do the easier thing and bring back
the PAGE_SIZE check for sb_set_blocksize() users and only skip it
for LBS enabled filesystems.

This will ensure that tests such as generic/466 when run in a loop
against say, ext4, won't try to try to actually mount a filesystem with
a block size larger than your filesystem supports given your PAGE_SIZE
and in the worst case crash.

Cc: Kent Overstreet <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Kent Overstreet <[email protected]>
Reviewed-by: "Darrick J. Wong" <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.14-rc5, v6.14-rc4
# 425fbcd6 21-Feb-2025 Luis Chamberlain <[email protected]>

bdev: use bdev_io_min() for statx block size

You can use lsblk to query for a block device block device block size:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
4096

The min-io is the minimum IO the block

bdev: use bdev_io_min() for statx block size

You can use lsblk to query for a block device block device block size:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
4096

The min-io is the minimum IO the block device prefers for optimal
performance. In turn we map this to the block device block size.
The current block size exposed even for block devices with an
LBA format of 16k is 4k. Likewise devices which support 4k LBA format
but have a larger Indirection Unit of 16k have an exposed block size
of 4k.

This incurs read-modify-writes on direct IO against devices with a
min-io larger than the page size. To fix this, use the block device
min io, which is the minimal optimal IO the device prefers.

With this we now get:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
16384

And so userspace gets the appropriate information it needs for optimal
performance. This is verified with blkalgn against mkfs against a
device with LBA format of 4k but an NPWG of 16k (min io size)

mkfs.xfs -f -b size=16k /dev/nvme3n1
blkalgn -d nvme3n1 --ops Write

Block size : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | |
16384 -> 32767 : 66 |****************************************|
32768 -> 65535 : 0 | |
65536 -> 131071 : 0 | |
131072 -> 262143 : 2 |* |
Block size: 14 - 66
Block size: 17 - 2

Algn size : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 0 | |
16 -> 31 : 0 | |
32 -> 63 : 0 | |
64 -> 127 : 0 | |
128 -> 255 : 0 | |
256 -> 511 : 0 | |
512 -> 1023 : 0 | |
1024 -> 2047 : 0 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | |
16384 -> 32767 : 66 |****************************************|
32768 -> 65535 : 0 | |
65536 -> 131071 : 0 | |
131072 -> 262143 : 2 |* |
Algn size: 14 - 66
Algn size: 17 - 2

Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: John Garry <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 47dd6753 21-Feb-2025 Luis Chamberlain <[email protected]>

block/bdev: lift block size restrictions to 64k

We now can support blocksizes larger than PAGE_SIZE, so in theory
we should be able to lift the restriction up to the max supported page
cache order.

block/bdev: lift block size restrictions to 64k

We now can support blocksizes larger than PAGE_SIZE, so in theory
we should be able to lift the restriction up to the max supported page
cache order. However bound ourselves to what we can currently validate
and test. Through blktests and fstest we can validate up to 64k today.

Reviewed-by: Hannes Reinecke <[email protected]>
Reviewed-by: "Matthew Wilcox (Oracle)" <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 3c209171 21-Feb-2025 Hannes Reinecke <[email protected]>

block/bdev: enable large folio support for large logical block sizes

Call mapping_set_folio_min_order() when modifying the logical block
size to ensure folios are allocated with the correct size.

R

block/bdev: enable large folio support for large logical block sizes

Call mapping_set_folio_min_order() when modifying the logical block
size to ensure folios are allocated with the correct size.

Reviewed-by: Luis Chamberlain <[email protected]>
Reviewed-by: "Matthew Wilcox (Oracle)" <[email protected]>
Signed-off-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>

show more ...


Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4
# 26fff8a4 18-Dec-2024 Luis Chamberlain <[email protected]>

block/bdev: use helper for max block size check

We already have a helper for checking the limits on the block size
both low and high, just use that.

No functional changes.

Reviewed-by: John Garry

block/bdev: use helper for max block size check

We already have a helper for checking the limits on the block size
both low and high, just use that.

No functional changes.

Reviewed-by: John Garry <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6
# aa3d8a36 26-Aug-2024 NeilBrown <[email protected]>

block: change wait on bd_claiming to use a var_waitqueue

bd_prepare_to_claim() waits for a var to change, not for a bit to be
cleared. Change from bit_waitqueue() to __var_waitqueue() and
correspond

block: change wait on bd_claiming to use a var_waitqueue

bd_prepare_to_claim() waits for a var to change, not for a bit to be
cleared. Change from bit_waitqueue() to __var_waitqueue() and
correspondingly use wake_up_var(). This will allow a future patch which
change the "bit" function to expect an "unsigned long *" instead of
"void *".

Signed-off-by: NeilBrown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2
# b55d26bd 03-Aug-2024 Deven Bowers <[email protected]>

block,lsm: add LSM blob and new LSM hooks for block devices

This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data relat

block,lsm: add LSM blob and new LSM hooks for block devices

This patch introduces a new LSM blob to the block_device structure,
enabling the security subsystem to store security-sensitive data related
to block devices. Currently, for a device mapper's mapped device containing
a dm-verity target, critical security information such as the roothash and
its signing state are not readily accessible. Specifically, while the
dm-verity volume creation process passes the dm-verity roothash and its
signature from userspace to the kernel, the roothash is stored privately
within the dm-verity target, and its signature is discarded
post-verification. This makes it extremely hard for the security subsystem
to utilize these data.

With the addition of the LSM blob to the block_device structure, the
security subsystem can now retain and manage important security metadata
such as the roothash and the signing state of a dm-verity by storing them
inside the blob. Access decisions can then be based on these stored data.

The implementation follows the same approach used for security blobs in
other structures like struct file, struct inode, and struct superblock.
The initialization of the security blob occurs after the creation of the
struct block_device, performed by the security subsystem. Similarly, the
security blob is freed by the security subsystem before the struct
block_device is deallocated or freed.

This patch also introduces a new hook security_bdev_setintegrity() to save
block device's integrity data to the new LSM blob. For example, for
dm-verity, it can use this hook to expose its roothash and signing state
to LSMs, then LSMs can save these data into the LSM blob.

Please note that the new hook should be invoked every time the security
information is updated to keep these data current. For example, in
dm-verity, if the mapping table is reloaded and configured to use a
different dm-verity target with a new roothash and signing information,
the previously stored data in the LSM blob will become obsolete. It is
crucial to re-invoke the hook to refresh these data and ensure they are up
to date. This necessity arises from the design of device-mapper, where a
device-mapper device is first created, and then targets are subsequently
loaded into it. These targets can be modified multiple times during the
device's lifetime. Therefore, while the LSM blob is allocated during the
creation of the block device, its actual contents are not initialized at
this stage and can change substantially over time. This includes
alterations from data that the LSM 'trusts' to those it does not, making
it essential to handle these changes correctly. Failure to address this
dynamic aspect could potentially allow for bypassing LSM checks.

Signed-off-by: Deven Bowers <[email protected]>
Signed-off-by: Fan Wu <[email protected]>
[PM: merge fuzz, subject line tweaks]
Signed-off-by: Paul Moore <[email protected]>

show more ...


Revision tags: v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5
# 9abcfbd2 20-Jun-2024 Prasad Singamsetty <[email protected]>

block: Add atomic write support for statx

Extend statx system call to return additional info for atomic write support
support if the specified file is a block device.

Reviewed-by: Martin K. Peterse

block: Add atomic write support for statx

Extend statx system call to return additional info for atomic write support
support if the specified file is a block device.

Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Prasad Singamsetty <[email protected]>
Signed-off-by: John Garry <[email protected]>
Reviewed-by: Keith Busch <[email protected]>
Acked-by: Darrick J. Wong <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.10-rc4
# d9c23321 14-Jun-2024 Jiapeng Chong <[email protected]>

bdev: make blockdev_mnt static

The blockdev_mnt are not used outside the file bdev.c, so the modification
is defined as static.

block/bdev.c:377:17: warning: symbol 'blockdev_mnt' was not declared.

bdev: make blockdev_mnt static

The blockdev_mnt are not used outside the file bdev.c, so the modification
is defined as static.

block/bdev.c:377:17: warning: symbol 'blockdev_mnt' was not declared. Should it be static?

Reported-by: Abaci Robot <[email protected]>
jpg: Remove closes bugzilla link
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jiapeng Chong <[email protected]>
Signed-off-by: John Garry <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Fixes: 8f3a608827d1 ("bdev: open block device as files")
Tested-by: John Garry <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>

show more ...


Revision tags: v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7
# 203c1ce0 29-Apr-2024 Al Viro <[email protected]>

RIP ->bd_inode

Signed-off-by: Al Viro <[email protected]>


# df65f166 28-Apr-2024 Al Viro <[email protected]>

block/bdev.c: use the knowledge of inode/bdev coallocation

Here we know that bdevfs inodes are coallocated with struct block_device
and we can get to ->bd_inode value without any dereferencing. Int

block/bdev.c: use the knowledge of inode/bdev coallocation

Here we know that bdevfs inodes are coallocated with struct block_device
and we can get to ->bd_inode value without any dereferencing. Introduce
an inlined helper (static, *not* exported, purely internal for bdev.c)
that gets an associated inode by block_device - BD_INODE(bdev).

NOTE: leave it static; nobody outside of block/bdev.c has any business
playing with that.

Signed-off-by: Al Viro <[email protected]>

show more ...


Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4
# 224941e8 11-Apr-2024 Al Viro <[email protected]>

use ->bd_mapping instead of ->bd_inode->i_mapping

Just the low-hanging fruit...

Signed-off-by: Al Viro <[email protected]>
Link: https://lore.kernel.org/r/20240411145346.2516848-2-viro@zeniv.

use ->bd_mapping instead of ->bd_inode->i_mapping

Just the low-hanging fruit...

Signed-off-by: Al Viro <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# e33aef2c 11-Apr-2024 Al Viro <[email protected]>

block_device: add a pointer to struct address_space (page cache of bdev)

points to ->i_data of coallocated inode.

Signed-off-by: Al Viro <[email protected]>
Link: https://lore.kernel.org/r/20

block_device: add a pointer to struct address_space (page cache of bdev)

points to ->i_data of coallocated inode.

Signed-off-by: Al Viro <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# 2638c208 28-Apr-2024 Al Viro <[email protected]>

missing helpers: bdev_unhash(), bdev_drop()

bdev_unhash(): make block device invisible to lookups by device number
bdev_drop(): drop reference to associated inode.

Both are internal, for use by gen

missing helpers: bdev_unhash(), bdev_drop()

bdev_unhash(): make block device invisible to lookups by device number
bdev_drop(): drop reference to associated inode.

Both are internal, for use by genhd and partition-related code - similar
to bdev_add(). The logics in there (especially the lifetime-related
parts of it) ought to be cleaned up, but that's a separate story; here
we just encapsulate getting to associated inode.

Signed-off-by: Al Viro <[email protected]>

show more ...


# 186ddac2 11-Apr-2024 Yu Kuai <[email protected]>

block: move two helpers into bdev.c

disk_live() and block_size() access bd_inode directly, prepare to remove
the field bd_inode from block_device, and only access bd_inode in block
layer.

Signed-of

block: move two helpers into bdev.c

disk_live() and block_size() access bd_inode directly, prepare to remove
the field bd_inode from block_device, and only access bd_inode in block
layer.

Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Christian Brauner <[email protected]>

show more ...


# ac2b6f9d 12-Apr-2024 Al Viro <[email protected]>

bdev: move ->bd_has_subit_bio to ->__bd_flags

In bdev_alloc() we have all flags initialized to false, so
assignment to ->bh_has_submit_bio n there is a no-op unless
we have partno != 0 and flag alre

bdev: move ->bd_has_subit_bio to ->__bd_flags

In bdev_alloc() we have all flags initialized to false, so
assignment to ->bh_has_submit_bio n there is a no-op unless
we have partno != 0 and flag already set on entire device.

In device_add_disk() we have just allocated the block_device
in question and it had been a full-device one, so the flag
is guaranteed to be still clear when we get to assignment.

Signed-off-by: Al Viro <[email protected]>

show more ...


# 4c80105e 12-Apr-2024 Al Viro <[email protected]>

bdev: move ->bd_write_holder into ->__bd_flags

Signed-off-by: Al Viro <[email protected]>


# 1116b9fa 12-Apr-2024 Al Viro <[email protected]>

bdev: infrastructure for flags

Replace bd_partno with a 32bit field (__bd_flags). The lower 8 bits
contain the partition number, the upper 24 are for flags.

Helpers: bdev_{test,set,clear}_flag(bde

bdev: infrastructure for flags

Replace bd_partno with a 32bit field (__bd_flags). The lower 8 bits
contain the partition number, the upper 24 are for flags.

Helpers: bdev_{test,set,clear}_flag(bdev, flag), with atomic_or()
and atomic_andnot() used to set/clear.

NOTE: this commit does not actually move any flags over there - they
are still bool fields. As the result, it shifts the fields wrt
cacheline boundaries; that's going to be restored once the first
3 flags are dealt with.

Signed-off-by: Al Viro <[email protected]>

show more ...


# d18a8679 02-May-2024 Al Viro <[email protected]>

make set_blocksize() fail unless block device is opened exclusive

Signed-off-by: Al Viro <[email protected]>


# ead083ae 18-Apr-2024 Al Viro <[email protected]>

set_blocksize(): switch to passing struct file *

Signed-off-by: Al Viro <[email protected]>


1234