|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
d05af90d |
| 25-Mar-2025 |
Yu Kuai <[email protected]> |
md/raid10: fix missing discard IO accounting
md_account_bio() is not called from raid10_handle_discard(), now that we handle bitmap inside md_account_bio(), also fix missing bitmap_startwrite for di
md/raid10: fix missing discard IO accounting
md_account_bio() is not called from raid10_handle_discard(), now that we handle bitmap inside md_account_bio(), also fix missing bitmap_startwrite for discard.
Test whole disk discard for 20G raid10:
Before: Device d/s dMB/s drqm/s %drqm d_await dareq-sz md0 48.00 16.00 0.00 0.00 5.42 341.33
After: Device d/s dMB/s drqm/s %drqm d_await dareq-sz md0 68.00 20462.00 0.00 0.00 2.65 308133.65
Link: https://lore.kernel.org/linux-raid/[email protected] Fixes: 528bc2cf2fcc ("md/raid10: enable io accounting") Signed-off-by: Yu Kuai <[email protected]> Acked-by: Coly Li <[email protected]>
show more ...
|
|
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
d301f164 |
| 27-Feb-2025 |
Zheng Qixing <[email protected]> |
badblocks: use sector_t instead of int to avoid truncation of badblocks length
There is a truncation of badblocks length issue when set badblocks as follow:
echo "2055 4294967299" > bad_blocks cat
badblocks: use sector_t instead of int to avoid truncation of badblocks length
There is a truncation of badblocks length issue when set badblocks as follow:
echo "2055 4294967299" > bad_blocks cat bad_blocks 2055 3
Change 'sectors' argument type from 'int' to 'sector_t'.
This change avoids truncation of badblocks length for large sectors by replacing 'int' with 'sector_t' (u64), enabling proper handling of larger disk sizes and ensuring compatibility with 64-bit sector addressing.
Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Zheng Qixing <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Acked-by: Coly Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
7e5102dd |
| 27-Feb-2025 |
Zheng Qixing <[email protected]> |
md: improve return types of badblocks handling functions
rdev_set_badblocks() only indicates success/failure, so convert its return type from int to boolean for better semantic clarity.
rdev_clear_
md: improve return types of badblocks handling functions
rdev_set_badblocks() only indicates success/failure, so convert its return type from int to boolean for better semantic clarity.
rdev_clear_badblocks() return value is never used by any caller, convert it to void. This removes unnecessary value returns.
Also update narrow_write_error() in both raid1 and raid10 to use boolean return type to match rdev_set_badblocks().
Signed-off-by: Zheng Qixing <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
3db44044 |
| 06-Mar-2025 |
Xiao Ni <[email protected]> |
md/raid10: wait barrier before returning discard request with REQ_NOWAIT
raid10_handle_discard should wait barrier before returning a discard bio which has REQ_NOWAIT. And there is no need to print
md/raid10: wait barrier before returning discard request with REQ_NOWAIT
raid10_handle_discard should wait barrier before returning a discard bio which has REQ_NOWAIT. And there is no need to print warning calltrace if a discard bio has REQ_NOWAIT flag. Quality engineer usually checks dmesg and reports error if dmesg has warning/error calltrace.
Fixes: c9aa889b035f ("md: raid10 add nowait support") Signed-off-by: Xiao Ni <[email protected]> Acked-by: Coly Li <[email protected]> Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
| #
e879a0d9 |
| 27-Feb-2025 |
Yu Kuai <[email protected]> |
md/raid1,raid10: don't ignore IO flags
If blk-wbt is enabled by default, it's found that raid write performance is quite bad because all IO are throttled by wbt of underlying disks, due to flag REQ_
md/raid1,raid10: don't ignore IO flags
If blk-wbt is enabled by default, it's found that raid write performance is quite bad because all IO are throttled by wbt of underlying disks, due to flag REQ_IDLE is ignored. And turns out this behaviour exist since blk-wbt is introduced.
Other than REQ_IDLE, other flags should not be ignored as well, for example REQ_META can be set for filesystems, clearing it can cause priority reverse problems; And REQ_NOWAIT should not be cleared as well, because io will wait instead of failing directly in underlying disks.
Fix those problems by keep IO flags from master bio.
Fises: f51d46d0e7cb ("md: add support for REQ_NOWAIT") Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism") Fixes: 5404bc7a87b9 ("[PATCH] Allow file systems to differentiate between data and meta reads") Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3 |
|
| #
c594de04 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: don't export md_cluster_ops
Add a new field 'cluster_ops' and initialize it md_setup_cluster(), so that the gloable variable 'md_cluter_ops' doesn't need to be exported. Also prepare to switch m
md: don't export md_cluster_ops
Add a new field 'cluster_ops' and initialize it md_setup_cluster(), so that the gloable variable 'md_cluter_ops' doesn't need to be exported. Also prepare to switch md-cluster to use md_submod_head.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Su Yue <[email protected]>
show more ...
|
| #
3d44e1d1 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: switch personalities to use md_submodule_head
Remove the global list 'pers_list', and switch to use md_submodule_head, which is managed by xarry. Prepare to unify registration and unregistration
md: switch personalities to use md_submodule_head
Remove the global list 'pers_list', and switch to use md_submodule_head, which is managed by xarry. Prepare to unify registration and unregistration for all sub modules.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
| #
bf0a7326 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: only include md-cluster.h if necessary
md-cluster is only supportted by raid1 and raid10, there is no need to include md-cluster.h for other personalities.
Also move APIs that is only used in m
md: only include md-cluster.h if necessary
md-cluster is only supportted by raid1 and raid10, there is no need to include md-cluster.h for other personalities.
Also move APIs that is only used in md-cluster.c from md.h to md-cluster.h.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Su Yue <[email protected]>
show more ...
|
| #
fbe8f2fa |
| 12-Feb-2025 |
Bart Van Assche <[email protected]> |
md/raid*: Fix the set_queue_limits implementations
queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() calls from
md/raid*: Fix the set_queue_limits implementations
queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() calls from the raid*_set_limits() functions because there is no corresponding queue_limits_start_update() call.
Cc: Christoph Hellwig <[email protected]> Fixes: c6e56cf6b2e7 ("block: move integrity information into queue_limits") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/linux-raid/[email protected]/ Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
6a7e17b2 |
| 16-Jan-2025 |
John Garry <[email protected]> |
block: Add common atomic writes enable flag
Currently only stacked devices need to explicitly enable atomic writes by setting BLK_FEAT_ATOMIC_WRITES_STACKED flag.
This does not work well for device
block: Add common atomic writes enable flag
Currently only stacked devices need to explicitly enable atomic writes by setting BLK_FEAT_ATOMIC_WRITES_STACKED flag.
This does not work well for device mapper stacking devices, as there many sets of limits are stacked and what is the 'bottom' and 'top' device can swapped. This means that BLK_FEAT_ATOMIC_WRITES_STACKED needs to be set for many queue limits, which is messy.
Generalize enabling atomic writes enabling by ensuring that all devices must explicitly set a flag - that includes NVMe, SCSI sd, and md raid.
Signed-off-by: John Garry <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7 |
|
| #
cd5fc653 |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: move bitmap_{start, end}write to md upper layer
There are two BUG reports that raid5 will hang at bitmap_startwrite([1],[2]), root cause is that bitmap start write and end write is unb
md/md-bitmap: move bitmap_{start, end}write to md upper layer
There are two BUG reports that raid5 will hang at bitmap_startwrite([1],[2]), root cause is that bitmap start write and end write is unbalanced, it's not quite clear where, and while reviewing raid5 code, it's found that bitmap operations can be optimized. For example, for a 4 disks raid5, with chunksize=8k, if user issue a IO (0 + 48k) to the array:
┌────────────────────────────────────────────────────────────┐ │chunk 0 │ │ ┌────────────┬─────────────┬─────────────┬────────────┼ │ sh0 │A0: 0 + 4k │A1: 8k + 4k │A2: 16k + 4k │A3: P │ │ ┼────────────┼─────────────┼─────────────┼────────────┼ │ sh1 │B0: 4k + 4k │B1: 12k + 4k │B2: 20k + 4k │B3: P │ ┼──────┴────────────┴─────────────┴─────────────┴────────────┼ │chunk 1 │ │ ┌────────────┬─────────────┬─────────────┬────────────┤ │ sh2 │C0: 24k + 4k│C1: 32k + 4k │C2: P │C3: 40k + 4k│ │ ┼────────────┼─────────────┼─────────────┼────────────┼ │ sh3 │D0: 28k + 4k│D1: 36k + 4k │D2: P │D3: 44k + 4k│ └──────┴────────────┴─────────────┴─────────────┴────────────┘
Before this patch, 4 stripe head will be used, and each sh will attach bio for 3 disks, and each attached bio will trigger bitmap_startwrite() once, which means total 12 times. - 3 times (0 + 4k), for (A0, A1 and A2) - 3 times (4 + 4k), for (B0, B1 and B2) - 3 times (8 + 4k), for (C0, C1 and C3) - 3 times (12 + 4k), for (D0, D1 and D3)
After this patch, md upper layer will calculate that IO range (0 + 48k) is corresponding to the bitmap (0 + 16k), and call bitmap_startwrite() just once.
Noted that this patch will align bitmap ranges to the chunks, for example, if user issue a IO (0 + 4k) to array:
- Before this patch, 1 time (0 + 4k), for A0; - After this patch, 1 time (0 + 8k) for chunk 0;
Usually, one bitmap bit will represent more than one disk chunk, and this doesn't have any difference. And even if user really created a array that one chunk contain multiple bits, the overhead is that more data will be recovered after power failure.
Also remove STRIPE_BITMAP_PENDING since it's not used anymore.
[1] https://lore.kernel.org/all/CAJpMwyjmHQLvm6zg1cmQErttNNQPDAAXPKM3xgTjMhbfts986Q@mail.gmail.com/ [2] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
4f0e7d0e |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
For the case that IO failed for one rdev, the bit will be mark as NEEDED in following cases:
1) If badblocks is set and rdev is no
md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
For the case that IO failed for one rdev, the bit will be mark as NEEDED in following cases:
1) If badblocks is set and rdev is not faulty; 2) If rdev is faulty;
Case 1) is useless because synchronize data to badblocks make no sense. Case 2) can be replaced with mddev->degraded.
Also remove R1BIO_Degraded, R10BIO_Degraded and STRIPE_DEGRADED since case 2) no longer use them.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
08c50142 |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
behind_write is only used in raid1, prepare to refactor bitmap_{start/end}write(), there are no functional changes.
Sig
md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
behind_write is only used in raid1, prepare to refactor bitmap_{start/end}write(), there are no functional changes.
Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Xiao Ni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1 |
|
| #
a1d9b4fd |
| 18-Nov-2024 |
John Garry <[email protected]> |
md/raid10: Atomic write support
Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes.
For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do thi
md/raid10: Atomic write support
Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes.
For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do this. It is unlikely to find devices which support atomic writes and bad blocks.
Reviewed-by: Yu Kuai <[email protected]> Signed-off-by: John Garry <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.12 |
|
| #
4cf58d95 |
| 11-Nov-2024 |
John Garry <[email protected]> |
md/raid10: Handle bio_split() errors
Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return. Except for discard, where we end the bio directly.
Reviewed-by: Yu Kuai
md/raid10: Handle bio_split() errors
Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return. Except for discard, where we end the bio directly.
Reviewed-by: Yu Kuai <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc7, v6.12-rc6 |
|
| #
d419284c |
| 31-Oct-2024 |
Yu Kuai <[email protected]> |
md/raid10: don't wait for Faulty rdev in wait_blocked_rdev()
Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handlin
md/raid10: don't wait for Faulty rdev in wait_blocked_rdev()
Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling write request.
Signed-off-by: Yu Kuai <[email protected]> Tested-by: Mariusz Tkaczyk <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3 |
|
| #
825711e0 |
| 09-Oct-2024 |
Yu Kuai <[email protected]> |
md/raid10: fix null ptr dereference in raid10_size()
In raid10_run() if raid10_set_queue_limits() succeed, the return value is set to zero, and if following procedures failed raid10_run() will retur
md/raid10: fix null ptr dereference in raid10_size()
In raid10_run() if raid10_set_queue_limits() succeed, the return value is set to zero, and if following procedures failed raid10_run() will return zero while mddev->private is still NULL, causing null ptr dereference in raid10_size().
Fix the problem by only overwrite the return value if raid10_set_queue_limits() failed.
Fixes: 3d8466ba68d4 ("md/raid10: use the atomic queue limit update APIs") Cc: [email protected] Reported-and-tested-by: ValdikSS <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6 |
|
| #
77c09640 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by
md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
e1791dae |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: pass in mddev directly for md_bitmap_resize()
And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as well, on the one hand make code cleaner, on the other hand try not
md/md-bitmap: pass in mddev directly for md_bitmap_resize()
And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as well, on the one hand make code cleaner, on the other hand try not to access bitmap directly.
Since we are here, also change the parameter 'init' from int to bool.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
48eb9581 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
Add a parameter 'bool sync' to distinguish them, and md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations
md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
Add a parameter 'bool sync' to distinguish them, and md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations only need one op to cover them.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
15db1eca |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also c
md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
077b18ab |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also chan
md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
1415f402 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change
md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
9be669bd |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: remove the parameter 'aborted' for md_bitmap_end_sync()
For internal callers, aborted are always set to false, while for external callers, aborted are always set to true.
Hence there
md/md-bitmap: remove the parameter 'aborted' for md_bitmap_end_sync()
For internal callers, aborted are always set to false, while for external callers, aborted are always set to true.
Hence there is no need to always pass in true for exported api.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
fe6a19d4 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_start_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also chan
md/md-bitmap: merge md_bitmap_start_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Also fix lots of code style.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|