Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1

# 8fa7292f | 05-Apr-2025 | Thomas Gleixner <[email protected]>
treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
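The removed wrappers were thin inlines, so the conversion is a pure rename with identical semantics; roughly (a from-memory sketch of the old include/linux/timer.h helpers, not text from this commit):

    static inline int del_timer(struct timer_list *timer)
    {
            return timer_delete(timer);
    }

    static inline int del_timer_sync(struct timer_list *timer)
    {
            return timer_delete_sync(timer);
    }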
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5

# 7e5102dd | 27-Feb-2025 | Zheng Qixing <[email protected]>
md: improve return types of badblocks handling functions
rdev_set_badblocks() only indicates success/failure, so convert its return type from int to boolean for better semantic clarity.
rdev_clear_badblocks()'s return value is never used by any caller; convert it to void. This removes unnecessary value returns.
Also update narrow_write_error() in both raid1 and raid10 to use boolean return type to match rdev_set_badblocks().
Signed-off-by: Zheng Qixing <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
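A sketch of the shape of the change (signatures abridged from drivers/md; treat as illustrative rather than the exact prototypes):

    /* before: returned int, nonzero meaning the range was recorded */
    int rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
                           int is_new);

    /* after: the same success/failure information, as a bool */
    bool rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
                            int is_new);

    /* rdev_clear_badblocks(): return value was never checked, so void */
    void rdev_clear_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
                              int is_new);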
# c8775aef | 27-Feb-2025 | Zheng Qixing <[email protected]>
badblocks: return boolean from badblocks_set() and badblocks_clear()
Change the return type of badblocks_set() and badblocks_clear() from int to bool, indicating success or failure. Specifically:
- _badblocks_set() and _badblocks_clear() functions now return true for success and false for failure.
- All calls to these functions are updated to handle the new boolean return type.
- This change improves code clarity and ensures a more consistent handling of success and failure states.
Signed-off-by: Zheng Qixing <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Acked-by: Coly Li <[email protected]>
Acked-by: Ira Weiny <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
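With the boolean return, a caller-side check reads naturally; a hedged fragment (the fallback helper is hypothetical, for illustration only):

    if (!badblocks_set(&rdev->badblocks, s, sectors, 0)) {
            /* the range could not be recorded (e.g. the table is full);
             * md_handle_badblocks_overflow() is hypothetical */
            md_handle_badblocks_overflow(rdev);
    }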
Revision tags: v6.14-rc4

# 85428702 | 20-Feb-2025 | Yu Kuai <[email protected]>
md: fix mddev uaf while iterating all_mddevs list
While iterating the all_mddevs list from md_notify_reboot() and md_exit(), list_for_each_entry_safe is used, and this can race with deleting the next mddev, causing UAF:
t1:
 spin_lock
 //list_for_each_entry_safe(mddev, n, ...)
 mddev_get(mddev1)
 // assume mddev2 is the next entry
 spin_unlock
                t2:
                //remove mddev2
                ...
                mddev_free
                spin_lock
                list_del
                spin_unlock
                kfree(mddev2)
 mddev_put(mddev1)
 spin_lock
 //continue dereference mddev2->all_mddevs
The old helper for_each_mddev() actually grabs the reference of mddev2 while holding the lock, to prevent it from being freed. This problem can be fixed the same way; however, the code would be complex.
Hence switch to list_for_each_entry. In this case mddev_put() can free mddev1, which is not safe either; referring to md_seq_show(), factor out a helper mddev_put_locked() to fix this problem.
Cc: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/linux-raid/[email protected]
Fixes: f26514342255 ("md: stop using for_each_mddev in md_notify_reboot")
Fixes: 16648bac862f ("md: stop using for_each_mddev in md_exit")
Reported-and-tested-by: Guillaume Morin <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
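A simplified sketch of the fixed iteration pattern described above (details trimmed):

    spin_lock(&all_mddevs_lock);
    list_for_each_entry(mddev, &all_mddevs, all_mddevs) {
            if (!mddev_get(mddev))
                    continue;       /* being deleted, skip it */
            spin_unlock(&all_mddevs_lock);

            /* ... act on this mddev without holding the lock ... */

            spin_lock(&all_mddevs_lock);
            /* drop the reference with the lock held, so a final put
             * cannot free an entry out from under the list walk */
            mddev_put_locked(mddev);
    }
    spin_unlock(&all_mddevs_lock);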
Revision tags: v6.14-rc3

# 87a86277 | 15-Feb-2025 | Yu Kuai <[email protected]>
md: switch md-cluster to use md_submodule_head
To make the code cleaner, and to prepare for adding kconfig for bitmap.
Also remove the unused global variables pers_lock, md_cluster_ops and md_cluster_mod, and the exported symbols register_md_cluster_operations(), unregister_md_cluster_operations() and md_cluster_ops.
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Su Yue <[email protected]>
# c594de04 | 15-Feb-2025 | Yu Kuai <[email protected]>
md: don't export md_cluster_ops
Add a new field 'cluster_ops' and initialize it in md_setup_cluster(), so that the global variable 'md_cluster_ops' doesn't need to be exported. Also prepare to switch md-cluster to use md_submodule_head.
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Su Yue <[email protected]>
# 3d44e1d1 | 15-Feb-2025 | Yu Kuai <[email protected]>
md: switch personalities to use md_submodule_head
Remove the global list 'pers_list', and switch to md_submodule_head, which is managed by an xarray. Prepare to unify registration and unregistration for all sub modules.
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
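A hedged sketch of what a lookup becomes once submodules live in an xarray keyed by id (the embedded 'head' member and names are assumptions for illustration):

    static struct md_personality *find_pers_by_level(int level)
    {
            struct md_submodule_head *head = xa_load(&md_submodule, level);

            if (!head || !try_module_get(head->owner))
                    return NULL;
            return container_of(head, struct md_personality, head);
    }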
# d3beb7c9 | 15-Feb-2025 | Yu Kuai <[email protected]>
md: introduce struct md_submodule_head and APIs
Prepare to unify registration and unregistration of md personalities and md-cluster, and also prepare to add kconfig for md-bitmap.
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
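A hedged reconstruction of the header's shape and API (field names are plausible guesses, not copied from the patch):

    enum md_submodule_type {
            MD_PERSONALITY,
            MD_CLUSTER,
            MD_BITMAP,              /* prepared for a configurable bitmap */
    };

    struct md_submodule_head {
            enum md_submodule_type  type;
            int                     id;     /* e.g. the RAID level */
            const char              *name;
            struct module           *owner;
    };

    int register_md_submodule(struct md_submodule_head *head);
    void unregister_md_submodule(struct md_submodule_head *head);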
# 9faab548 | 15-Feb-2025 | Yu Kuai <[email protected]>
md: merge common code into find_pers()
- pers_lock is held and released by the caller
- try_module_get() is called by the caller
- the error message is printed by the caller

Merge the above code into find_pers() and rename it to get_pers(); also add put_pers() as a wrapper around module_put().
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Su Yue <[email protected]>
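A hedged sketch of the merged helper pair (simplified; the name/level matching of the real find_pers() is elided):

    static struct md_personality *get_pers(int level, char *clevel)
    {
            struct md_personality *ret = NULL, *pers;

            spin_lock(&pers_lock);
            list_for_each_entry(pers, &pers_list, list) {
                    if (pers->level == level &&
                        try_module_get(pers->owner)) {
                            ret = pers;
                            break;
                    }
            }
            spin_unlock(&pers_lock);

            if (!ret)
                    pr_warn("md: personality for level %s is not loaded!\n",
                            clevel);
            return ret;
    }

    static void put_pers(struct md_personality *pers)
    {
            module_put(pers->owner);
    }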
# 105ca2a2 | 25-Feb-2025 | Christoph Hellwig <[email protected]>
block: split struct bio_integrity_payload
Many of the fields in struct bio_integrity_payload are only needed for the default integrity buffer in the block layer, and the variable sized array at the end of the structure makes it very hard to embed into caller allocated structures.
Reduce struct bio_integrity_payload to the minimal structure needed in common code and create two separate containing structures for the automatically generated payload and the caller allocated payload. The latter is a simple wrapper for struct bio_integrity_payload and the bvecs, while the former contains the additional fields moved out of struct bio_integrity_payload.
Always use a dedicated mempool for automatic integrity metadata instead of depending on the bio_set, which is submitter controlled and thus often doesn't have the mempool initialized, and stop using mempools for the submitter buffers, as they aren't in the NOIO I/O submission path where we need to guarantee forward progress.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Martin K. Petersen <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Tested-by: Anuj Gupta <[email protected]>
Reviewed-by: Anuj Gupta <[email protected]>
Reviewed-by: Kanchan Joshi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
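A hedged sketch of the split's shape (the two containing struct names are illustrative, not the patch's):

    /* reduced common payload kept in shared code */
    struct bio_integrity_payload {
            struct bio              *bip_bio;
            struct bvec_iter        bip_iter;
            unsigned short          bip_vcnt;
            unsigned short          bip_flags;
            struct bio_vec          *bip_vec;
    };

    /* caller-allocated: just the payload plus its bvecs */
    struct bio_integrity_user {                     /* illustrative name */
            struct bio_integrity_payload    bip;
            struct bio_vec                  bvecs[];
    };

    /* block-layer generated: the fields moved out of the payload */
    struct bio_integrity_auto {                     /* illustrative name */
            struct bio_integrity_payload    bip;
            struct work_struct              work;   /* deferred verify */
            struct bio_vec                  bvecs[];
    };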
# 4b10a3bc | 13-Feb-2025 | Li Nan <[email protected]>
md: ensure resync is prioritized over recovery
If a new disk is added during resync, the resync process is interrupted, and recovery is triggered, causing the previous resync to be lost. In reality, disk addition should not terminate resync; fix it.
Steps to reproduce the issue:
    mdadm -CR /dev/md0 -l1 -n3 -x1 /dev/sd[abcd]
    mdadm --fail /dev/md0 /dev/sdc
Fixes: 24dd469d728d ("[PATCH] md: allow a manual resync with md")
Signed-off-by: Li Nan <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/linux-raid/[email protected]
Signed-off-by: Yu Kuai <[email protected]>
Revision tags: v6.14-rc2, v6.14-rc1

# 1751f872 | 28-Jan-2025 | Joel Granados <[email protected]>
treewide: const qualify ctl_tables where applicable
Add the const qualifier to all the ctl_tables in the tree except for watchdog_hardlockup_sysctl, memory_allocation_profiling_sysctls, loadpin_sysctl_table and the ones calling register_net_sysctl (./net, drivers/infiniband dirs). These are special cases as they use a registration function with a non-const qualified ctl_table argument or modify the arrays before passing them on to the registration function.
Constifying ctl_table structs will prevent the modification of proc_handler function pointers as the arrays would reside in .rodata. This is made possible after commit 78eb4ea25cd5 ("sysctl: treewide: constify the ctl_table argument of proc_handlers") constified all the proc_handlers.
Created this by running an spatch followed by a sed command:

Spatch:
    virtual patch

    @ depends on !(file in "net") disable optional_qualifier @
    identifier table_name != {
        watchdog_hardlockup_sysctl,
        iwcm_ctl_table,
        ucma_ctl_table,
        memory_allocation_profiling_sysctls,
        loadpin_sysctl_table
    };
    @@

    + const struct ctl_table table_name [] = { ... };

sed:
    sed --in-place \
        -e "s/struct ctl_table .table = &uts_kern/const struct ctl_table *table = \&uts_kern/" \
        kernel/utsname_sysctl.c
Reviewed-by: Song Liu <[email protected]>
Acked-by: Steven Rostedt (Google) <[email protected]> # for kernel/trace/
Reviewed-by: Martin K. Petersen <[email protected]> # SCSI
Reviewed-by: Darrick J. Wong <[email protected]> # xfs
Acked-by: Jani Nikula <[email protected]>
Acked-by: Corey Minyard <[email protected]>
Acked-by: Wei Liu <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Reviewed-by: Bill O'Donnell <[email protected]>
Acked-by: Baoquan He <[email protected]>
Acked-by: Ashutosh Dixit <[email protected]>
Acked-by: Anna Schumaker <[email protected]>
Signed-off-by: Joel Granados <[email protected]>
# 8d28d0dd | 24-Jan-2025 | Yu Kuai <[email protected]>
md/md-bitmap: Synchronize bitmap_get_stats() with bitmap lifetime
After commit ec6bb299c7c3 ("md/md-bitmap: add 'sync_size' into struct md_bitmap_stats"), following panic is reported:
Oops: general protection fault, probably for non-canonical address
RIP: 0010:bitmap_get_stats+0x2b/0xa0
Call Trace:
 <TASK>
 md_seq_show+0x2d2/0x5b0
 seq_read_iter+0x2b9/0x470
 seq_read+0x12f/0x180
 proc_reg_read+0x57/0xb0
 vfs_read+0xf6/0x380
 ksys_read+0x6c/0xf0
 do_syscall_64+0x82/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Root cause is that bitmap_get_stats() can be called at any time if mddev is still there, even if the bitmap is destroyed or not fully initialized. Dereferencing the bitmap in this case can crash the kernel. Meanwhile, the above commit started dereferencing bitmap->storage, making the problem easier to trigger.
Fix the problem by protecting bitmap_get_stats() with bitmap_info.mutex.
Cc: [email protected] # v6.12+
Fixes: 32a7627cf3a3 ("[PATCH] md: optimised resync using Bitmap based intent logging")
Reported-and-tested-by: Harshit Mogalapalli <[email protected]>
Closes: https://lore.kernel.org/linux-raid/[email protected]/T/#m6e5086c95201135e4941fe38f9efa76daf9666c5
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
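A hedged fragment of the reader side after the fix, modeled on md_seq_show(); the mutex spans the whole access so the bitmap can be neither freed nor left half-initialized underneath it:

    mutex_lock(&mddev->bitmap_info.mutex);
    err = bitmap_get_stats(mddev->bitmap, &stats);
    mutex_unlock(&mddev->bitmap_info.mutex);
    if (!err) {
            /* safe to print stats.sync_size and friends */
    }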
Revision tags: v6.13, v6.13-rc7

# cd5fc653 | 09-Jan-2025 | Yu Kuai <[email protected]>
md/md-bitmap: move bitmap_{start, end}write to md upper layer
There are two BUG reports that raid5 will hang at bitmap_startwrite ([1],[2]); root cause is that bitmap start write and end write are unbalanced. It's not quite clear where, and while reviewing raid5 code, it was found that bitmap operations can be optimized. For example, for a 4-disk raid5 with chunksize=8k, if the user issues an IO (0 + 48k) to the array:
┌────────────────────────────────────────────────────────────┐
│chunk 0                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┼
│ sh0  │A0: 0 + 4k  │A1: 8k + 4k  │A2: 16k + 4k │A3: P       │
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│ sh1  │B0: 4k + 4k │B1: 12k + 4k │B2: 20k + 4k │B3: P       │
┼──────┴────────────┴─────────────┴─────────────┴────────────┼
│chunk 1                                                     │
│      ┌────────────┬─────────────┬─────────────┬────────────┤
│ sh2  │C0: 24k + 4k│C1: 32k + 4k │C2: P        │C3: 40k + 4k│
│      ┼────────────┼─────────────┼─────────────┼────────────┼
│ sh3  │D0: 28k + 4k│D1: 36k + 4k │D2: P        │D3: 44k + 4k│
└──────┴────────────┴─────────────┴─────────────┴────────────┘
Before this patch, 4 stripe heads will be used, and each sh will attach a bio for 3 disks, with each attached bio triggering bitmap_startwrite() once, which means 12 times in total:
- 3 times (0 + 4k), for (A0, A1 and A2)
- 3 times (4 + 4k), for (B0, B1 and B2)
- 3 times (8 + 4k), for (C0, C1 and C3)
- 3 times (12 + 4k), for (D0, D1 and D3)

After this patch, the md upper layer will calculate that IO range (0 + 48k) corresponds to the bitmap range (0 + 16k), and call bitmap_startwrite() just once.

Note that this patch aligns bitmap ranges to the chunks; for example, if the user issues an IO (0 + 4k) to the array:
- Before this patch: 1 time (0 + 4k), for A0;
- After this patch: 1 time (0 + 8k), for chunk 0.

Usually, one bitmap bit will represent more than one disk chunk, and this makes no difference. Even if the user really created an array where one chunk contains multiple bits, the overhead is only that more data will be recovered after a power failure.
Also remove STRIPE_BITMAP_PENDING since it's not used anymore.
[1] https://lore.kernel.org/all/CAJpMwyjmHQLvm6zg1cmQErttNNQPDAAXPKM3xgTjMhbfts986Q@mail.gmail.com/ [2] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
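A hedged sketch of only the alignment arithmetic described above (the raid5 mapping from array sectors to per-disk ranges is elided, and the op name is illustrative):

    static void md_bitmap_start_io(struct mddev *mddev, sector_t offset,
                                   sector_t sectors)
    {
            sector_t mask = mddev->chunk_sectors - 1;
            sector_t start = offset & ~mask;                  /* round down */
            sector_t end = (offset + sectors + mask) & ~mask; /* round up  */

            /* one call covers the whole IO, e.g. (0 + 4k) -> (0 + 8k) */
            mddev->bitmap_ops->startwrite(mddev, start, end - start);
    }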
Revision tags: v6.13-rc6

# 127186cf | 02-Jan-2025 | Yu Kuai <[email protected]>
md: reintroduce md-linear
The md-linear personality was removed by commit 849d18e27be9 ("md: Remove deprecated CONFIG_MD_LINEAR") because it had been marked as deprecated for a long time.
However, md-linear is used widely for underlying disks with different sizes; sadly we didn't know this until now, and it's truly useful to create partitions, assemble multiple raids, and then append one to the other.
People have to use dm-linear in this case now; however, they would prefer to minimize the number of involved modules.
Fixes: 849d18e27be9 ("md: Remove deprecated CONFIG_MD_LINEAR")
Cc: [email protected]
Signed-off-by: Yu Kuai <[email protected]>
Acked-by: Coly Li <[email protected]>
Acked-by: Mike Snitzer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
Revision tags: v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6

# 29967332 | 31-Oct-2024 | Yu Kuai <[email protected]>
md: don't record new badblocks for faulty rdev
Faulty will be checked before issuing IO to the rdev; however, an rdev can become faulty at any time, hence it's possible that rdev_set_badblocks() will be called for a faulty rdev. In this case, mddev->sb_flags will be set and some other path can be blocked by the super block update.
Since a faulty rdev will not be accessed anymore, there is no need to record new badblocks for it and force a super block update.
Note this is not a bugfix; it just prevents updating the superblock in some corner cases, and it will help to narrow down a bug related to external metadata [1]. Testing also shows that devices are removed faster in the case of IO errors.
[1] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Yu Kuai <[email protected]>
Tested-by: Mariusz Tkaczyk <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
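A hedged sketch of the early exit (fragment; at the time of this patch the function still returned int, with 1 meaning the error was handled):

    int rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
                           int is_new)
    {
            /* a faulty rdev is never accessed again: don't record new
             * badblocks for it, and don't force a superblock update */
            if (test_bit(Faulty, &rdev->flags))
                    return 1;

            /* ... existing badblocks_set() + sb_flags update path ... */
    }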
# 50e82748 | 31-Oct-2024 | Yu Kuai <[email protected]>
md: don't wait faulty rdev in md_wait_for_blocked_rdev()
md_wait_for_blocked_rdev() is called for write IO while the rdev is blocked; however, the rdev can become faulty after choosing this rdev to write to, and a faulty rdev should never be accessed anymore, hence there is no point in waiting for a faulty rdev to be unblocked.
Signed-off-by: Yu Kuai <[email protected]>
Tested-by: Mariusz Tkaczyk <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
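A hedged sketch of the adjusted wait condition (simplified; the real condition also covers BlockedBadBlocks):

    wait_event(rdev->blocked_wait,
               !test_bit(Blocked, &rdev->flags) ||
               test_bit(Faulty, &rdev->flags));
    rdev_dec_pending(rdev, mddev);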
Revision tags: v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1

# 62ce0782 | 19-Sep-2024 | Li Nan <[email protected]>
md: ensure child flush IO does not affect origin bio->bi_status
When a flush is issued to a RAID array, a child flush IO is created and issued for each member disk in the RAID array. Since commit b75197e86e6d ("md: Remove flush handling"), each child flush IO has been chained with the original bio. As a result, the failure of any child IO could modify the bi_status of the original bio, potentially impacting the upper-layer filesystem.
Fix the issue by preventing child flush IO from altering the original bio->bi_status, as before. However, this design has a known issue: in the event of a power failure, if a flush IO on a member disk fails, the upper layers may not be informed. This is not easy to fix and will not be addressed for the time being.
Fixes: b75197e86e6d ("md: Remove flush handling")
Signed-off-by: Li Nan <[email protected]>
Reviewed-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
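A hedged sketch of the idea: give each child flush bio its own end_io instead of chaining it to the parent, so a child error is never copied into the parent's bi_status (names illustrative):

    struct flush_info {
            struct bio      *orig_bio;
            atomic_t        pending;
    };

    static void md_child_flush_end(struct bio *child)
    {
            struct flush_info *fi = child->bi_private;

            /* deliberately ignore child->bi_status, restoring the old
             * behaviour (with the known power-failure caveat above) */
            bio_put(child);
            if (atomic_dec_and_test(&fi->pending)) {
                    bio_endio(fi->orig_bio);
                    kfree(fi);
            }
    }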
Revision tags: v6.11, v6.11-rc7

# d981ed84 | 04-Sep-2024 | Xiao Ni <[email protected]>
md: Add new_level sysfs interface
Now reshape supports two ways: with a backup file or without a backup file. For the situation without a backup file, it needs to change the data offset. It doesn't need the systemd service mdadm-grow-continue, so it can finish the reshape job in one process environment. It can learn the new level from the mdadm --grow command and change to the new level after the reshape finishes.
For the situation with a backup file, it needs the systemd service mdadm-grow-continue to monitor reshape progress, so there are two processes involved. One is the mdadm --grow command, which kicks off the reshape and wakes up the mdadm-grow-continue service. The second process is the service, which doesn't know the new level from the first process.
In kernel space mddev->new_level is used to record the new level when doing reshape. This patch adds a new interface to help mdadm update new_level and sync it to the metadata. Then mdadm-grow-continue can read the right new_level.
Commit log revised by Song Liu. Please refer to the link for more details.
Signed-off-by: Xiao Ni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
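A hedged userspace sketch of how the second process could consume the interface (path per the usual md sysfs layout; illustrative, not mdadm source):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
            char buf[16];
            int fd = open("/sys/block/md0/md/new_level", O_RDONLY);

            if (fd < 0)
                    return 1;
            ssize_t n = read(fd, buf, sizeof(buf) - 1);
            if (n > 0) {
                    buf[n] = '\0';
                    printf("reshape target level: %s", buf);
            }
            close(fd);
            return n > 0 ? 0 : 1;
    }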
# 2d2b3bc1 | 03-Sep-2024 | Mateusz Kusiak <[email protected]>
md: Report failed arrays as broken in mdstat
Depending on whether the array has a personality, it is reported as either active or inactive. This patch adds a third status, "broken", for arrays with a personality that became inoperative. The reason is that end users tend to assume "active" indicates the array is operational.
Add "broken" state for inoperative arrays with personality and refactor the code.
Signed-off-by: Mateusz Kusiak <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
Revision tags: v6.11-rc6

# b75197e8 | 27-Aug-2024 | Yu Kuai <[email protected]>
md: Remove flush handling
For flush requests, md has special flush handling to merge concurrent flush requests into a single one; however, the whole mechanism is based on a disk-level spin_lock, 'mddev->lock'. And fsync can be called quite often in some use cases; as a consequence, the spin lock in the IO fast path can cause performance degradation.
Fortunately, the block layer already has flush handling to merge concurrent flush request, and it only acquires hctx level spin lock. (see details in blk-flush.c)
This patch removes the flush handling in md, and converts to use general block layer flush handling in underlying disks.
Flush test for 4 nvme raid10: start 128 threads to do fsync 100000 times, on arm64, see how long it takes.
Test script (headers and the THREADS/FSYNC_COUNT definitions, per the description above, filled in so the snippet compiles):

    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <pthread.h>
    #include <sys/time.h>

    #define THREADS     128
    #define FSYNC_COUNT 100000

    void *thread_func(void *arg)
    {
        int fd = *(int *)arg;

        for (int i = 0; i < FSYNC_COUNT; i++)
            fsync(fd);
        return NULL;
    }

    int main(void)
    {
        int fd = open("/dev/md0", O_RDWR);
        if (fd < 0) {
            perror("open");
            exit(1);
        }

        pthread_t threads[THREADS];
        struct timeval start, end;

        gettimeofday(&start, NULL);

        for (int i = 0; i < THREADS; i++)
            pthread_create(&threads[i], NULL, thread_func, &fd);

        for (int i = 0; i < THREADS; i++)
            pthread_join(threads[i], NULL);

        gettimeofday(&end, NULL);

        close(fd);

        long long elapsed = (end.tv_sec - start.tv_sec) * 1000000LL +
                            (end.tv_usec - start.tv_usec);
        printf("Elapsed time: %lld microseconds\n", elapsed);

        return 0;
    }
Test result: about 10 times faster:

    Before this patch: 50943374 microseconds
    After this patch:   5096347 microseconds
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
# 49f5f5e3 | 26-Aug-2024 | Yu Kuai <[email protected]>
md/md-bitmap: merge md_bitmap_wait_behind_writes() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
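A hedged sketch of where this series is headed: callers go through an ops table instead of exported md_bitmap_* symbols, so an alternative bitmap could be swapped in behind the same ops (subset of fields, names modeled on the commit subjects):

    struct bitmap_operations {
            int  (*create)(struct mddev *mddev, int slot);
            void (*destroy)(struct mddev *mddev);
            void (*daemon_work)(struct mddev *mddev);
            void (*unplug)(struct mddev *mddev, bool sync);
            void (*wait_behind_writes)(struct mddev *mddev);
    };

    /* callers: mddev->bitmap_ops->wait_behind_writes(mddev); */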
# 18db2a9c | 26-Aug-2024 | Yu Kuai <[email protected]>
md/md-bitmap: merge md_bitmap_daemon_work() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
# 3c9883e7 | 26-Aug-2024 | Yu Kuai <[email protected]>
md/md-bitmap: merge bitmap_unplug() into bitmap_operations
So that the implementation won't be exposed, and it'll be possible to invent a new bitmap by replacing bitmap_operations.
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
# 48eb9581 | 26-Aug-2024 | Yu Kuai <[email protected]>
md/md-bitmap: merge md_bitmap_unplug_async() into md_bitmap_unplug()
Add a parameter 'bool sync' to distinguish them; md_bitmap_unplug_async() won't be exported anymore, hence bitmap_operations only needs one op to cover both.
Signed-off-by: Yu Kuai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Song Liu <[email protected]>
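A hedged sketch of the merge: one entry point with a 'sync' flag replaces the separate async export (internal names illustrative):

    static void md_bitmap_unplug(struct bitmap *bitmap, bool sync)
    {
            if (sync)
                    __md_bitmap_unplug(bitmap);     /* write out and wait */
            else
                    queue_work(md_bitmap_wq, &bitmap->unplug_work);
    }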