|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2 |
|
| #
b7c178d9 |
| 08-Apr-2025 |
Meir Elisha <[email protected]> |
md/raid1: Add check for missing source disk in process_checks()
During recovery/check operations, the process_checks function loops through available disks to find a 'primary' source with successful
md/raid1: Add check for missing source disk in process_checks()
During recovery/check operations, the process_checks function loops through available disks to find a 'primary' source with successfully read data.
If no suitable source disk is found after checking all possibilities, the 'primary' index will reach conf->raid_disks * 2. Add an explicit check for this condition after the loop. If no source disk was found, print an error message and return early to prevent further processing without a valid primary source.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Meir Elisha <[email protected]> Suggested-and-reviewed-by: Yu Kuai <[email protected]> Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
|
Revision tags: v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
d301f164 |
| 27-Feb-2025 |
Zheng Qixing <[email protected]> |
badblocks: use sector_t instead of int to avoid truncation of badblocks length
There is a truncation of badblocks length issue when set badblocks as follow:
echo "2055 4294967299" > bad_blocks cat
badblocks: use sector_t instead of int to avoid truncation of badblocks length
There is a truncation of badblocks length issue when set badblocks as follow:
echo "2055 4294967299" > bad_blocks cat bad_blocks 2055 3
Change 'sectors' argument type from 'int' to 'sector_t'.
This change avoids truncation of badblocks length for large sectors by replacing 'int' with 'sector_t' (u64), enabling proper handling of larger disk sizes and ensuring compatibility with 64-bit sector addressing.
Fixes: 9e0e252a048b ("badblocks: Add core badblock management code") Signed-off-by: Zheng Qixing <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Acked-by: Coly Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
7e5102dd |
| 27-Feb-2025 |
Zheng Qixing <[email protected]> |
md: improve return types of badblocks handling functions
rdev_set_badblocks() only indicates success/failure, so convert its return type from int to boolean for better semantic clarity.
rdev_clear_
md: improve return types of badblocks handling functions
rdev_set_badblocks() only indicates success/failure, so convert its return type from int to boolean for better semantic clarity.
rdev_clear_badblocks() return value is never used by any caller, convert it to void. This removes unnecessary value returns.
Also update narrow_write_error() in both raid1 and raid10 to use boolean return type to match rdev_set_badblocks().
Signed-off-by: Zheng Qixing <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
| #
e879a0d9 |
| 27-Feb-2025 |
Yu Kuai <[email protected]> |
md/raid1,raid10: don't ignore IO flags
If blk-wbt is enabled by default, it's found that raid write performance is quite bad because all IO are throttled by wbt of underlying disks, due to flag REQ_
md/raid1,raid10: don't ignore IO flags
If blk-wbt is enabled by default, it's found that raid write performance is quite bad because all IO are throttled by wbt of underlying disks, due to flag REQ_IDLE is ignored. And turns out this behaviour exist since blk-wbt is introduced.
Other than REQ_IDLE, other flags should not be ignored as well, for example REQ_META can be set for filesystems, clearing it can cause priority reverse problems; And REQ_NOWAIT should not be cleared as well, because io will wait instead of failing directly in underlying disks.
Fix those problems by keep IO flags from master bio.
Fises: f51d46d0e7cb ("md: add support for REQ_NOWAIT") Fixes: e34cbd307477 ("blk-wbt: add general throttling mechanism") Fixes: 5404bc7a87b9 ("[PATCH] Allow file systems to differentiate between data and meta reads") Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3 |
|
| #
c594de04 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: don't export md_cluster_ops
Add a new field 'cluster_ops' and initialize it md_setup_cluster(), so that the gloable variable 'md_cluter_ops' doesn't need to be exported. Also prepare to switch m
md: don't export md_cluster_ops
Add a new field 'cluster_ops' and initialize it md_setup_cluster(), so that the gloable variable 'md_cluter_ops' doesn't need to be exported. Also prepare to switch md-cluster to use md_submod_head.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Su Yue <[email protected]>
show more ...
|
| #
3d44e1d1 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: switch personalities to use md_submodule_head
Remove the global list 'pers_list', and switch to use md_submodule_head, which is managed by xarry. Prepare to unify registration and unregistration
md: switch personalities to use md_submodule_head
Remove the global list 'pers_list', and switch to use md_submodule_head, which is managed by xarry. Prepare to unify registration and unregistration for all sub modules.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
| #
bf0a7326 |
| 15-Feb-2025 |
Yu Kuai <[email protected]> |
md: only include md-cluster.h if necessary
md-cluster is only supportted by raid1 and raid10, there is no need to include md-cluster.h for other personalities.
Also move APIs that is only used in m
md: only include md-cluster.h if necessary
md-cluster is only supportted by raid1 and raid10, there is no need to include md-cluster.h for other personalities.
Also move APIs that is only used in md-cluster.c from md.h to md-cluster.h.
Link: https://lore.kernel.org/linux-raid/[email protected] Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Su Yue <[email protected]>
show more ...
|
| #
5fbcf76e |
| 15-Feb-2025 |
Zheng Qixing <[email protected]> |
md/raid1: fix memory leak in raid1_run() if no active rdev
When `raid1_set_limits()` fails or when the array has no active `rdev`, the allocated memory for `conf` is not properly freed.
Add raid1_f
md/raid1: fix memory leak in raid1_run() if no active rdev
When `raid1_set_limits()` fails or when the array has no active `rdev`, the allocated memory for `conf` is not properly freed.
Add raid1_free() call to properly free the conf in error path.
Fixes: 799af947ed13 ("md/raid1: don't free conf on raid0_run failure") Signed-off-by: Zheng Qixing <[email protected]> Link: https://lore.kernel.org/linux-raid/[email protected] Singed-off-by: Yu Kuai <[email protected]>
show more ...
|
| #
fbe8f2fa |
| 12-Feb-2025 |
Bart Van Assche <[email protected]> |
md/raid*: Fix the set_queue_limits implementations
queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() calls from
md/raid*: Fix the set_queue_limits implementations
queue_limits_cancel_update() must only be called if queue_limits_start_update() is called first. Remove the queue_limits_cancel_update() calls from the raid*_set_limits() functions because there is no corresponding queue_limits_start_update() call.
Cc: Christoph Hellwig <[email protected]> Fixes: c6e56cf6b2e7 ("block: move integrity information into queue_limits") Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/linux-raid/[email protected]/ Signed-off-by: Yu Kuai <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
6a7e17b2 |
| 16-Jan-2025 |
John Garry <[email protected]> |
block: Add common atomic writes enable flag
Currently only stacked devices need to explicitly enable atomic writes by setting BLK_FEAT_ATOMIC_WRITES_STACKED flag.
This does not work well for device
block: Add common atomic writes enable flag
Currently only stacked devices need to explicitly enable atomic writes by setting BLK_FEAT_ATOMIC_WRITES_STACKED flag.
This does not work well for device mapper stacking devices, as there many sets of limits are stacked and what is the 'bottom' and 'top' device can swapped. This means that BLK_FEAT_ATOMIC_WRITES_STACKED needs to be set for many queue limits, which is messy.
Generalize enabling atomic writes enabling by ensuring that all devices must explicitly set a flag - that includes NVMe, SCSI sd, and md raid.
Signed-off-by: John Garry <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7 |
|
| #
cd5fc653 |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: move bitmap_{start, end}write to md upper layer
There are two BUG reports that raid5 will hang at bitmap_startwrite([1],[2]), root cause is that bitmap start write and end write is unb
md/md-bitmap: move bitmap_{start, end}write to md upper layer
There are two BUG reports that raid5 will hang at bitmap_startwrite([1],[2]), root cause is that bitmap start write and end write is unbalanced, it's not quite clear where, and while reviewing raid5 code, it's found that bitmap operations can be optimized. For example, for a 4 disks raid5, with chunksize=8k, if user issue a IO (0 + 48k) to the array:
┌────────────────────────────────────────────────────────────┐ │chunk 0 │ │ ┌────────────┬─────────────┬─────────────┬────────────┼ │ sh0 │A0: 0 + 4k │A1: 8k + 4k │A2: 16k + 4k │A3: P │ │ ┼────────────┼─────────────┼─────────────┼────────────┼ │ sh1 │B0: 4k + 4k │B1: 12k + 4k │B2: 20k + 4k │B3: P │ ┼──────┴────────────┴─────────────┴─────────────┴────────────┼ │chunk 1 │ │ ┌────────────┬─────────────┬─────────────┬────────────┤ │ sh2 │C0: 24k + 4k│C1: 32k + 4k │C2: P │C3: 40k + 4k│ │ ┼────────────┼─────────────┼─────────────┼────────────┼ │ sh3 │D0: 28k + 4k│D1: 36k + 4k │D2: P │D3: 44k + 4k│ └──────┴────────────┴─────────────┴─────────────┴────────────┘
Before this patch, 4 stripe head will be used, and each sh will attach bio for 3 disks, and each attached bio will trigger bitmap_startwrite() once, which means total 12 times. - 3 times (0 + 4k), for (A0, A1 and A2) - 3 times (4 + 4k), for (B0, B1 and B2) - 3 times (8 + 4k), for (C0, C1 and C3) - 3 times (12 + 4k), for (D0, D1 and D3)
After this patch, md upper layer will calculate that IO range (0 + 48k) is corresponding to the bitmap (0 + 16k), and call bitmap_startwrite() just once.
Noted that this patch will align bitmap ranges to the chunks, for example, if user issue a IO (0 + 4k) to array:
- Before this patch, 1 time (0 + 4k), for A0; - After this patch, 1 time (0 + 8k) for chunk 0;
Usually, one bitmap bit will represent more than one disk chunk, and this doesn't have any difference. And even if user really created a array that one chunk contain multiple bits, the overhead is that more data will be recovered after power failure.
Also remove STRIPE_BITMAP_PENDING since it's not used anymore.
[1] https://lore.kernel.org/all/CAJpMwyjmHQLvm6zg1cmQErttNNQPDAAXPKM3xgTjMhbfts986Q@mail.gmail.com/ [2] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
4f0e7d0e |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
For the case that IO failed for one rdev, the bit will be mark as NEEDED in following cases:
1) If badblocks is set and rdev is no
md/md-bitmap: remove the last parameter for bimtap_ops->endwrite()
For the case that IO failed for one rdev, the bit will be mark as NEEDED in following cases:
1) If badblocks is set and rdev is not faulty; 2) If rdev is faulty;
Case 1) is useless because synchronize data to badblocks make no sense. Case 2) can be replaced with mddev->degraded.
Also remove R1BIO_Degraded, R10BIO_Degraded and STRIPE_DEGRADED since case 2) no longer use them.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
08c50142 |
| 09-Jan-2025 |
Yu Kuai <[email protected]> |
md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
behind_write is only used in raid1, prepare to refactor bitmap_{start/end}write(), there are no functional changes.
Sig
md/md-bitmap: factor behind write counters out from bitmap_{start/end}write()
behind_write is only used in raid1, prepare to refactor bitmap_{start/end}write(), there are no functional changes.
Signed-off-by: Yu Kuai <[email protected]> Reviewed-by: Xiao Ni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1 |
|
| #
f2a38abf |
| 18-Nov-2024 |
John Garry <[email protected]> |
md/raid1: Atomic write support
Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes.
For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do this
md/raid1: Atomic write support
Set BLK_FEAT_ATOMIC_WRITES_STACKED to enable atomic writes.
For an attempt to atomic write to a region which has bad blocks, error the write as we just cannot do this. It is unlikely to find devices which support atomic writes and bad blocks.
Reviewed-by: Yu Kuai <[email protected]> Signed-off-by: John Garry <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.12 |
|
| #
b1a7ad8b |
| 11-Nov-2024 |
John Garry <[email protected]> |
md/raid1: Handle bio_split() errors
Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return.
For the case of an in the write path, we need to undo the increment in t
md/raid1: Handle bio_split() errors
Add proper bio_split() error handling. For any error, call raid_end_bio_io() and return.
For the case of an in the write path, we need to undo the increment in the rdev pending count and NULLify the r1_bio->bios[] pointers.
For read path failure, we need to undo rdev pending count increment from the earlier read_balance() call.
Reviewed-by: Yu Kuai <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc7, v6.12-rc6 |
|
| #
ff31a7ef |
| 31-Oct-2024 |
Yu Kuai <[email protected]> |
md/raid1: don't wait for Faulty rdev in wait_blocked_rdev()
Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling
md/raid1: don't wait for Faulty rdev in wait_blocked_rdev()
Faulty rdev should never be accessed anymore, hence there is no point to wait for bad block to be acknowledged in this case while handling write request.
Signed-off-by: Yu Kuai <[email protected]> Tested-by: Mariusz Tkaczyk <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
88ed59c4 |
| 31-Oct-2024 |
Yu Kuai <[email protected]> |
md/raid1: factor out helper to handle blocked rdev from raid1_write_request()
Currently raid1 is preparing IO for underlying disks while checking if any disk is blocked, if so allocated resources mu
md/raid1: factor out helper to handle blocked rdev from raid1_write_request()
Currently raid1 is preparing IO for underlying disks while checking if any disk is blocked, if so allocated resources must be released, then waiting for rdev to be unblocked and try to prepare IO again.
Make code cleaner by checking blocked rdev first, it doesn't matter if rdev is blocked while issuing IO, the IO will wait for rdev to be unblocked or not.
Signed-off-by: Yu Kuai <[email protected]> Tested-by: Mariusz Tkaczyk <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6 |
|
| #
59fdd433 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: make in memory structure internal
Now that struct bitmap_page and bitmap is not used externally anymore, move them from md-bitmap.h to md-bitmap.c (expect that dm-raid is still using d
md/md-bitmap: make in memory structure internal
Now that struct bitmap_page and bitmap is not used externally anymore, move them from md-bitmap.h to md-bitmap.c (expect that dm-raid is still using define marco 'COUNTER_MAX').
Also fix some checkpatch warnings.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
49f5f5e3 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
S
md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
77c09640 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by
md/md-bitmap: merge md_bitmap_resize() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
e1791dae |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: pass in mddev directly for md_bitmap_resize()
And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as well, on the one hand make code cleaner, on the other hand try not
md/md-bitmap: pass in mddev directly for md_bitmap_resize()
And move the condition "if (mddev->bitmap)" into md_bitmap_resize() as well, on the one hand make code cleaner, on the other hand try not to access bitmap directly.
Since we are here, also change the parameter 'init' from int to bool.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
48eb9581 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
Add a parameter 'bool sync' to distinguish them, and md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations
md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
Add a parameter 'bool sync' to distinguish them, and md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations only need one op to cover them.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
15db1eca |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also c
md/md-bitmap: merge md_bitmap_cond_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
077b18ab |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also chan
md/md-bitmap: merge md_bitmap_close_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|
| #
1415f402 |
| 26-Aug-2024 |
Yu Kuai <[email protected]> |
md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change
md/md-bitmap: merge md_bitmap_end_sync() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Also change the parameter from bitmap to mddev, to avoid access bitmap outside md-bitmap.c as much as possible.
Signed-off-by: Yu Kuai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Song Liu <[email protected]>
show more ...
|