|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1 |
|
| #
999cc1bb |
| 22-Jan-2025 |
Kent Overstreet <[email protected]> |
bcachefs: Separate running/runnable in wp stats
We've got per-writepoint statistics to see how well the writepoint index update threads are pipelining; this separates running vs. runnable so we can see at a glance if they're blocking.
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
9e903352 |
| 27-Jan-2025 |
Kent Overstreet <[email protected]> |
bcachefs: Fix discard path journal flushing
The discard path is supposed to issue journal flushes when there are too many empty buckets that need a journal commit before they can be written to again, but at some point this code seems to have been lost.
Bring it back with a new optimization to make sure we don't issue too many journal flushes: the journal now tracks the sequence number of the most recent flush in progress, which the discard path uses when deciding which buckets need a journal flush.
Signed-off-by: Kent Overstreet <[email protected]>
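As a rough illustration of the check described in this commit, the sketch below counts empty buckets whose journal sequence number is newer than the most recent flush in progress, and asks for another flush only when that count passes a threshold. The names journal_sketch, empty_bucket, buckets_needing_flush and discard_should_flush_journal, and the threshold parameter, are assumptions made for this example and are not taken from the bcachefs source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified journal state; not the real bcachefs struct. */
struct journal_sketch {
	uint64_t flushed_seq;	/* newest sequence number known to be flushed */
	uint64_t flushing_seq;	/* newest sequence number with a flush in progress */
};

/* An empty bucket that cannot be reused until journal_seq has been flushed. */
struct empty_bucket {
	uint64_t journal_seq;
};

/* Count buckets not covered by a completed or in-progress flush. */
static inline size_t buckets_needing_flush(const struct journal_sketch *j,
					   const struct empty_bucket *b,
					   size_t nr)
{
	size_t i, ret = 0;

	assert(j->flushing_seq >= j->flushed_seq);

	for (i = 0; i < nr; i++)
		if (b[i].journal_seq > j->flushing_seq)
			ret++;
	return ret;
}

/* Only request a new flush when too many buckets are still waiting. */
static inline bool discard_should_flush_journal(const struct journal_sketch *j,
						const struct empty_bucket *b,
						size_t nr, size_t threshold)
{
	return buckets_needing_flush(j, b, nr) > threshold;
}
```

Comparing against the in-progress sequence number rather than the last completed one is what keeps the sketch from requesting a second flush for buckets that an outstanding flush will already cover.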
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5 |
|
| #
c6705091 |
| 20-Apr-2024 |
Kent Overstreet <[email protected]> |
bcachefs: Allocator prefers not to expand mi.btree_allocated bitmap
We now have a small bitmap in the member info section of the superblock for "regions that have btree nodes", so that if we ever have to scan for btree nodes in repair we don't have to scan the whole device(s).
This tweaks the allocator to prefer allocating from regions that are already marked in this bitmap.
Signed-off-by: Kent Overstreet <[email protected]>
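A minimal sketch of the preference described in this commit, assuming a 64-bit bitmap in which each bit covers a fixed-size region of the device. The struct member_sketch, the region_bits field and the function pick_btree_bucket are hypothetical names for illustration; the real superblock layout and allocator loop are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device info: one bit per region that holds btree nodes. */
struct member_sketch {
	uint64_t btree_allocated_bitmap;
	unsigned region_bits;	/* log2(sectors per region); an assumption */
};

static inline bool region_marked(const struct member_sketch *m, uint64_t sector)
{
	uint64_t region = sector >> m->region_bits;

	assert(region < 64);
	return (m->btree_allocated_bitmap >> region) & 1;
}

/*
 * Prefer a candidate in a region that is already marked; otherwise fall
 * back to the first candidate, which would expand the bitmap.
 * Returns the index of the chosen candidate, or -1 if there are none.
 */
static inline int pick_btree_bucket(const struct member_sketch *m,
				    const uint64_t *candidate_sectors, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (region_marked(m, candidate_sectors[i]))
			return i;
	return nr ? 0 : -1;
}
```

Keeping the set of marked regions small is what limits how much of the device a repair scan for btree nodes has to read.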
|
|
Revision tags: v6.9-rc4, v6.9-rc3 |
|
| #
e2a316b3 |
| 01-Apr-2024 |
Kent Overstreet <[email protected]> |
bcachefs: BCH_WATERMARK_interior_updates
This adds a new watermark, higher priority than BCH_WATERMARK_reclaim, for interior btree updates. We've seen a deadlock where journal replay triggers a ton of btree node merges, and these use up all available open buckets and then interior updates get stuck.
One cause of this is that we're currently lacking btree node merging on write buffer btrees - that needs to be fixed as well.
Signed-off-by: Kent Overstreet <[email protected]>
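To make the idea of watermark priority concrete, here is a simplified sketch in which each watermark holds back fewer buckets than the one below it, so a higher-priority allocation can still succeed when lower-priority ones are refused. The enum watermark_sketch, the reserve sizes and the function names are illustrative assumptions; only the relative ordering of reclaim and interior updates comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Lowest priority first; the reserve sizes below are illustrative only. */
enum watermark_sketch {
	WM_normal,
	WM_copygc,
	WM_btree,
	WM_reclaim,
	WM_interior_updates,
};

/* Number of buckets an allocation at watermark w is not allowed to touch. */
static inline uint64_t buckets_reserved(enum watermark_sketch w,
					uint64_t nr_buckets)
{
	uint64_t reserved = 0;

	assert(w <= WM_interior_updates);

	switch (w) {
	case WM_normal:
		reserved += nr_buckets >> 6;
		/* fall through */
	case WM_copygc:
		reserved += nr_buckets >> 7;
		/* fall through */
	case WM_btree:
		reserved += nr_buckets >> 7;
		/* fall through */
	case WM_reclaim:
		reserved += 2;
		/* fall through */
	case WM_interior_updates:
		break;
	}
	return reserved;
}

static inline bool may_allocate(enum watermark_sketch w, uint64_t nr_free,
				uint64_t nr_buckets)
{
	return nr_free > buckets_reserved(w, nr_buckets);
}
```

With only two free buckets on a 1024-bucket device, this sketch refuses a WM_reclaim allocation but still allows WM_interior_updates, which is the kind of headroom the commit adds for interior btree updates.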
|
|
Revision tags: v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6 |
|
| #
1e81f89b |
| 07-Aug-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Fix assorted checkpatch nits
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.5-rc5 |
|
| #
bf5a261c |
| 02-Aug-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Assorted fixes for clang
clang had a few more warnings about enum conversion, and also didn't like the opts.c initializer.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
ec14fc60 |
| 27-Jun-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Kill JOURNAL_WATERMARK
This unifies JOURNAL_WATERMARK with BCH_WATERMARK; we're working towards specifying watermarks once in the transaction commit path.
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
494036d8 |
| 27-Jun-2023 |
Kent Overstreet <[email protected]> |
bcachefs: BCH_WATERMARK_reclaim
Add another watermark for journal reclaim - this is needed for the next patches, that unify BCH_WATERMARK with JOURNAL_WATERMARK.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.4 |
|
| #
e53a961c |
| 24-Jun-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Rename enum alloc_reserve -> bch_watermark
This is prep work for consolidating with JOURNAL_WATERMARK.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1 |
|
| #
7635e1a6 |
| 25-Feb-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Rework open bucket partial list allocation
Now, any open_bucket can go on the partial list: allocating from the partial list has been moved to its own dedicated function, open_bucket_add_buckets() -> bucket_alloc_set_partial().
In particular, this means that erasure coded buckets can safely go on the partial list; the new location works with the "allocate an ec bucket first, then the rest" logic.
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
e84face6 |
| 02-Mar-2023 |
Kent Overstreet <[email protected]> |
bcachefs: RESERVE_stripe
Rework stripe creation path - new algorithm for deciding when to create new stripes or reuse existing stripes.
We add a new allocation watermark, RESERVE_stripe, above RESERVE_none. Then we always try to create a new stripe by doing RESERVE_stripe allocations; if this fails, we reuse an existing stripe and allocate buckets for it with the reserve watermark for the given write (RESERVE_none or RESERVE_movinggc).
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
b1cfe5ed |
| 02-Mar-2023 |
Kent Overstreet <[email protected]> |
bcachefs: Improve dev_alloc_debug_to_text()
Now we also print the number of buckets reserved for each watermark.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4 |
|
| #
a8b3a677 |
| 02-Nov-2022 |
Kent Overstreet <[email protected]> |
bcachefs: Nocow support
This adds support for nocow mode, where we do writes in-place when possible. Patch components:
- New boolean filesystem and inode option, nocow: note that when nocow is enabled, data checksumming and compression are implicitly disabled
- To prevent in-place writes from racing with data moves (data_update.c) or bucket reuse (i.e. a bucket being reused and re-allocated while a nocow write is in flight), we have a new locking mechanism.
Buckets can be locked for either data update or data move, using a fixed size hash table of two_state_shared locks. We don't have any chaining, meaning updates and moves to different buckets that hash to the same lock will wait unnecessarily - we'll want to watch for this becoming an issue.
- The allocator path also needs to check for in-place writes in flight to a given bucket before giving it out: thus we add another counter to bucket_alloc_state so we can track this.
- Fsync now may need to issue cache flushes to block devices instead of flushing the journal. We add a device bitmask to bch_inode_info, ei_devs_need_flush, which tracks devices that need to have flushes issued. Note that this will lead to unnecessary flushes when other codepaths have already issued flushes; we may want to replace this with a sequence number.
- New nocow write path: look up extents, and if they're writable write to them - otherwise fall back to the normal COW write path.
XXX: switch to sequence numbers instead of bitmask for devs needing journal flush
XXX: ei_quota_lock being a mutex means bch2_nocow_write_done() needs to run in process context - see if we can improve this
Signed-off-by: Kent Overstreet <[email protected]>
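The bucket locking scheme in this commit can be pictured with the single-threaded sketch below: a fixed-size table of locks indexed by a hash of the bucket, where each lock may be held by many holders as long as they are all in the same state (update or move). The table size, the hash, and the names two_state_lock_sketch, bucket_lock, two_state_trylock and two_state_unlock are assumptions for illustration; a real implementation would need atomics and a way to wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NOCOW_LOCKS_SKETCH 64	/* illustrative table size */

/* v > 0: held for updates, v < 0: held for moves, v == 0: unlocked. */
struct two_state_lock_sketch {
	int64_t v;
};

static struct two_state_lock_sketch nocow_locks_sketch[NOCOW_LOCKS_SKETCH];

/* No chaining: different buckets can map to the same lock. */
static inline struct two_state_lock_sketch *bucket_lock(uint64_t dev,
							uint64_t bucket)
{
	return &nocow_locks_sketch[(dev * 31 + bucket) % NOCOW_LOCKS_SKETCH];
}

/* Succeeds unless the lock is currently held in the other state. */
static inline bool two_state_trylock(struct two_state_lock_sketch *l,
				     bool for_move)
{
	int64_t dir = for_move ? -1 : 1;

	if (l->v * dir < 0)
		return false;
	l->v += dir;
	return true;
}

static inline void two_state_unlock(struct two_state_lock_sketch *l,
				    bool for_move)
{
	int64_t dir = for_move ? -1 : 1;

	assert(l->v * dir > 0);	/* must be held in this state */
	l->v -= dir;
}
```

In this sketch buckets 1 and 65 on device 0 share a slot, so a move of bucket 65 is refused while bucket 1 is locked for update. That is the unnecessary waiting the commit message says to watch for.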
|
| #
ae10fe01 |
| 04-Nov-2022 |
Kent Overstreet <[email protected]> |
bcachefs: bucket_alloc_state
This refactoring puts our various allocation path counters into a dedicated struct - the upcoming nocow patch is going to add another counter.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1 |
|
| #
822835ff |
| 01-Apr-2022 |
Kent Overstreet <[email protected]> |
bcachefs: Fold bucket_state in to BCH_DATA_TYPES()
Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be counted when checking the current number of free buckets against the allocation watermark.
Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets.
This is a new on disk format version, with upgrade and fsck required for the accounting changes.
Signed-off-by: Kent Overstreet <[email protected]>
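The accounting change can be sketched as a per-data-type bucket count in which free buckets have their own slot at index 0, so the watermark check looks only at that slot and ignores buckets that still need a discard or a gc_gens update. The enum data_type_sketch (apart from free being 0, which the commit states), the struct dev_usage_sketch and the function buckets_free_above are hypothetical names for this example.

```c
#include <assert.h>
#include <stdint.h>

enum data_type_sketch {
	DATA_free = 0,		/* from the commit: data type 0 is "free" */
	DATA_user,		/* remaining order is illustrative */
	DATA_cached,
	DATA_need_gc_gens,
	DATA_need_discard,
	DATA_NR,
};

struct dev_usage_sketch {
	uint64_t buckets[DATA_NR];
};

/*
 * Free buckets usable at a given reserve level. Buckets in need_gc_gens
 * or need_discard are deliberately not counted: they are not usable yet.
 */
static inline uint64_t buckets_free_above(const struct dev_usage_sketch *u,
					  uint64_t reserved)
{
	uint64_t nr_free;

	assert(u);
	nr_free = u->buckets[DATA_free];
	return nr_free > reserved ? nr_free - reserved : 0;
}
```

Because every bucket falls into exactly one slot, no separate count of unavailable buckets is needed in this scheme.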
|
|
Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1 |
|
| #
f25d8215 |
| 10-Jan-2022 |
Kent Overstreet <[email protected]> |
bcachefs: Kill allocator threads & freelists
Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them.
Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist.
The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards.
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
b17d3cec |
| 31-Oct-2022 |
Kent Overstreet <[email protected]> |
bcachefs: Run btree updates after write out of write_point
In the write path, after the write to the block device(s) completes, we have to punt to process context to do the btree update.
Instead of using the work item embedded in op->cl, this patch switches to a per write-point work item. This helps with two different issues:
- lock contention: btree updates to the same writepoint will (usually) be updating the same alloc keys
- context switch overhead: when we're bottlenecked on btree updates, having a thread (running out of a work item) checking the write point for completed ops is cheaper than queueing up a new work item and waking up a kworker.
In an arbitrary benchmark, 4k random writes with fio running inside a VM, this patch resulted in a 10% improvement in total iops.
Signed-off-by: Kent Overstreet <[email protected]>
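The per-write-point work item can be pictured with the single-threaded sketch below: completed ops are appended to a list on the write point, the work item is queued only if it is not already pending, and one run of the worker drains every op that has completed so far. The types write_op_sketch and write_point_sketch and both function names are assumptions for illustration; locking and the actual btree update are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct write_op_sketch {
	struct write_op_sketch *next;
	bool index_updated;
};

struct write_point_sketch {
	struct write_op_sketch *done_head;	/* ops whose data write finished */
	struct write_op_sketch *done_tail;
	bool work_queued;
};

/*
 * Called when an op's data write completes. Returns true if the caller
 * needs to queue the write point's work item, false if it is already queued.
 */
static inline bool wp_op_completed(struct write_point_sketch *wp,
				   struct write_op_sketch *op)
{
	op->next = NULL;
	if (wp->done_tail)
		wp->done_tail->next = op;
	else
		wp->done_head = op;
	wp->done_tail = op;

	if (wp->work_queued)
		return false;
	wp->work_queued = true;
	return true;
}

/* The work item: handle every completed op in one pass. */
static inline unsigned wp_index_update_work(struct write_point_sketch *wp)
{
	struct write_op_sketch *op;
	unsigned nr = 0;

	assert(wp->work_queued);

	while ((op = wp->done_head)) {
		wp->done_head = op->next;
		op->index_updated = true;	/* stand-in for the btree update */
		nr++;
	}
	wp->done_tail = NULL;
	wp->work_queued = false;
	return nr;
}
```

When three ops complete back to back, only the first completion queues work in this sketch and a single worker run handles all three, which is where the saved wakeups come from.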
|
| #
3e154711 |
| 13-Mar-2022 |
Kent Overstreet <[email protected]> |
bcachefs: x-macroize alloc_reserve enum
This makes an array of strings available, like our other enums.
Signed-off-by: Kent Overstreet <[email protected]>
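For readers unfamiliar with the x-macro pattern, the sketch below generates an enum and a matching array of strings from a single list, so the two cannot drift apart. The macro ALLOC_RESERVES_SKETCH, the list of entries and the other identifiers are illustrative assumptions, not the definitions from the bcachefs headers.

```c
#include <assert.h>
#include <stddef.h>

/* Single source of truth; the entries here are illustrative. */
#define ALLOC_RESERVES_SKETCH()	\
	x(btree)		\
	x(movinggc)		\
	x(none)

/* Expand the list once into enum constants... */
enum alloc_reserve_sketch {
#define x(name)	RESERVE_SKETCH_##name,
	ALLOC_RESERVES_SKETCH()
#undef x
	RESERVE_SKETCH_NR,
};

/* ...and once more into a NULL-terminated array of strings. */
static const char * const alloc_reserve_sketch_names[] = {
#define x(name)	#name,
	ALLOC_RESERVES_SKETCH()
#undef x
	NULL,
};

static inline const char *alloc_reserve_sketch_name(enum alloc_reserve_sketch r)
{
	assert(r < RESERVE_SKETCH_NR);
	return alloc_reserve_sketch_names[r];
}
```

Adding a new reserve then means adding one x(...) line, and both the enum and the name table pick it up.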
|
|
Revision tags: v5.16, v5.16-rc8, v5.16-rc7 |
|
| #
9ddffaf8 |
| 26-Dec-2021 |
Kent Overstreet <[email protected]> |
bcachefs: Put open_buckets in a hashtable
This is so that the copygc code doesn't have to refer to bucket_mark.owned_by_allocator - assisting in getting rid of the in memory bucket array.
Signed-off-by: Kent Overstreet <[email protected]>
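A minimal sketch of looking up open buckets through a hash table keyed on device and bucket number, so that a caller such as copygc can ask "is this bucket open?" without consulting a per-bucket in-memory array. The table size, the hash function and the names open_bucket_sketch, ob_hash_add and bucket_is_open are assumptions for illustration, and locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define OB_HASH_SIZE 16	/* power of two; illustrative */

struct open_bucket_sketch {
	unsigned dev;
	uint64_t bucket;
	struct open_bucket_sketch *hash_next;	/* chain within one slot */
};

static struct open_bucket_sketch *ob_hash[OB_HASH_SIZE];

static inline unsigned ob_hash_slot(unsigned dev, uint64_t bucket)
{
	return (unsigned)((bucket * 31 + dev) & (OB_HASH_SIZE - 1));
}

static inline bool bucket_is_open(unsigned dev, uint64_t bucket)
{
	struct open_bucket_sketch *ob;

	for (ob = ob_hash[ob_hash_slot(dev, bucket)]; ob; ob = ob->hash_next)
		if (ob->dev == dev && ob->bucket == bucket)
			return true;
	return false;
}

static inline void ob_hash_add(struct open_bucket_sketch *ob)
{
	unsigned slot = ob_hash_slot(ob->dev, ob->bucket);

	assert(!bucket_is_open(ob->dev, ob->bucket));	/* no duplicates */
	ob->hash_next = ob_hash[slot];
	ob_hash[slot] = ob;
}
```

The key is the bucket number itself rather than a sector address, so the lookup never has to divide by the bucket size.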
|
| #
abe19d45 |
| 26-Dec-2021 |
Kent Overstreet <[email protected]> |
bcachefs: Refactor open_bucket code
Prep work for adding a hash table of open buckets - instead of embedding a bch_extent_ptr, we need to refer to the bucket directly so that we're not calling sector_to_bucket() in the hash table lookup code, which has an expensive divide.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12 |
|
| #
bae895a5 |
| 18-Apr-2021 |
Kent Overstreet <[email protected]> |
bcachefs: Add allocator thread state to sysfs
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5 |
|
| #
2abe5420 |
| 21-Jan-2021 |
Kent Overstreet <[email protected]> |
bcachefs: Persist 64 bit io clocks
Originally, bcachefs - going back to bcache - stored, for each bucket, a 16 bit counter corresponding to how long it had been since the bucket was read from. But, this required periodically rescaling counters on every bucket to avoid wraparound. That wasn't an issue in bcache, where we'd periodically rewrite the per bucket metadata all at once, but in bcachefs we're trying to avoid having to walk every single bucket.
This patch switches to persisting 64 bit io clocks, corresponding to the 64 bit bucket timestamps introduced in the previous patch with KEY_TYPE_alloc_v2.
Signed-off-by: Kent Overstreet <[email protected]>
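The wraparound problem can be shown with a small, simplified example: if a bucket's last-read time and the clock are both kept in 16 bits, any interval longer than 65535 ticks is reported incorrectly unless every bucket is rescaled first, while a 64 bit clock has no practical limit. The functions age_16bit and age_64bit are hypothetical helpers for this illustration and do not reproduce how bcache or bcachefs actually encode bucket ages.

```c
#include <assert.h>
#include <stdint.h>

/* Age computed from 16 bit values: silently wraps modulo 65536. */
static inline uint16_t age_16bit(uint16_t now, uint16_t last_io)
{
	return (uint16_t)(now - last_io);
}

/* Age computed from 64 bit clocks: exact for any realistic interval. */
static inline uint64_t age_64bit(uint64_t now, uint64_t last_io)
{
	assert(now >= last_io);
	return now - last_io;
}
```

For a bucket last read at tick 100 and a clock now at tick 70100, the 64 bit version reports 70000 ticks, while the truncated 16 bit version reports 4464, making a cold bucket look recently used.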
|
|
Revision tags: v5.11-rc4, v5.11-rc3 |
|
| #
890e3f5b |
| 07-Jan-2021 |
Kent Overstreet <[email protected]> |
bcachefs: Reserve some open buckets for btree allocations
This reverts part of the change from "bcachefs: Don't use BTREE_INSERT_USE_RESERVE so much" - it turns out we still should be reserving open buckets for btree node allocations, because otherwise data bucket allocations (especially with erasure coding enabled) can use up all our open buckets and we won't be able to do the metadata update that lets us release those open bucket references. Oops.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Revision tags: v5.11-rc2, v5.11-rc1 |
|
| #
8deed5f4 |
| 15-Dec-2020 |
Kent Overstreet <[email protected]> |
bcachefs: Use separate new stripes for copygc and non-copygc
Allocations for copygc have to be kept separate from everything else, so that copygc doesn't get starved.
Signed-off-by: Kent Overstreet <[email protected]>
|
| #
3187aa8d |
| 21-Dec-2020 |
Kent Overstreet <[email protected]> |
bcachefs: Don't use BTREE_INSERT_USE_RESERVE so much
Previously, we were using BTREE_INSERT_RESERVE in a lot of places where it no longer makes sense.
- we now have more open_buckets than we used to, and the reserves work better, so we shouldn't need to use BTREE_INSERT_RESERVE just because we're holding open_buckets pinned anymore.
- We have the btree key cache for updates to the alloc btree, meaning we no longer need the btree reserve to ensure the allocator can make forward progress.
This means that we should only need a reserve for btree updates to ensure that copygc can make forward progress.
Since it's now just for copygc, we can also fold RESERVE_BTREE into RESERVE_MOVINGGC (the allocator's freelist reserve).
Signed-off-by: Kent Overstreet <[email protected]>
|