Commit graph

207 commits

Author SHA1 Message Date
Kent Overstreet
60558d55f7 bcachefs: Plumb bkey_validate_context to journal_entry_validate
This lets us print the exact location in the journal if it was found in
the journal, or correctly print if it was found in the superblock.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:22 -05:00
Kent Overstreet
9bdb3b73e7 bcachefs: Check for extent crc uncompressed/compressed size mismatch
When not compressed, these must be equal - this fixes an assertion pop
in bch2_rechecksum_bio().

Reported-by: syzbot+50d3544c9b8db9c99fd2@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:21 -05:00
Kent Overstreet
a6f4794fcd bcachefs: struct bkey_validate_context
Add a new parameter to bkey validate functions, and use it to improve
invalid bkey error messages: we can now print the btree and depth it
came from, or if it came from the journal, or is a btree root.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:20 -05:00
Kent Overstreet
e1cb4f56dc bcachefs: Bias reads more in favor of faster device
Per reports of performance issues on mixed multi device filesystems
where we're issuing too much IO to the spinning rust - tweak this
algorithm.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:19 -05:00
Kent Overstreet
61f854da4c bcachefs: Reserve 8 bits in bch_reflink_p
Better repair for reflink pointers, as well as propagating new inode
options to indirect extents, are going to require a few extra bits
bch_reflink_p: so claim a few from the high end of the destination
index.

Also add some missing bounds checking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:17 -05:00
Kent Overstreet
161d13835e bcachefs: Move bch_extent_rebalance code to rebalance.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
a652c56590 bcachefs: Improve trace_rebalance_extent
We now say explicitly which pointers are being moved or compressed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
3de8b72731 bcachefs: Simplify option logic in rebalance
Since bch2_move_get_io_opts() now synchronizes io_opts with options from
bch_extent_rebalance, delete the ad-hoc logic in rebalance.c that
previously did this.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
6aa0bd0fd5 bcachefs: get_update_rebalance_opts()
bch2_move_get_io_opts() now synchronizes options loaded from the
filesystem and inode (if present, i.e. not walking the reflink btree
directly) with options from the bch_extent_rebalance_entry, updating the
extent if necessary.

Since bch_extent_rebalance tracks where its option came from we can
preserve "inode options override filesystem options", even for indirect
extents where we don't have access to the inode the options came from.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
2d21d11253 bcachefs: New bch_extent_rebalance fields
- Add more io path options to bch_extent_rebalance
- For each option, track whether it came from the filesystem or the
  inode

This will be used for improved rebalance support for reflinked data.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
3000855cab bcachefs: io_opts_to_rebalance_opts()
New helper to simplify bch2_bkey_set_needs_rebalance()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
5fffe1a3c3 bcachefs: rename bch_extent_rebalance fields to match other opts structs
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
282faf9524 bcachefs: kill __bch2_bkey_sectors_need_rebalance()
Single caller, fold into bch2_bkey_sectors_need_rebalance()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
c8908959ae bcachefs: kill bch2_bkey_needs_rebalance()
Dead code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:15 -05:00
Kent Overstreet
015fafc49b bcachefs: small cleanup for extent ptr bitmasks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:15 -05:00
Kent Overstreet
eacb755568 bcachefs: bch2_io_opts_fixups()
Centralize some io path option fixups - they weren't always being
applied correctly:

- background_compression uses compression if unset
- background_target uses foreground_target if unset
- nocow disables most fancy io path options

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:15 -05:00
Kent Overstreet
e0c8369bc8 bcachefs: Add version check for bch_btree_ptr_v2.sectors_written validate
A user popped up with a very old (0.11) filesystem that needed repair
and wasn't recently backed up.

Reported-by: Manoa <manoa@mail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:15 -05:00
Kent Overstreet
840c2fbcc5 bcachefs: Fix assertion pop in bch2_ptr_swab()
This runs on extents that haven't yet been validated, so we don't want
to assert that we have a valid entry type.

Reported-by: syzbot+4f29c3f12f864d8a8d17@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-11-12 03:46:57 -05:00
Kent Overstreet
a34eef6dd1 bcachefs: Don't keep tons of cached pointers around
We had a bug report where the data update path was creating an extent
that failed to validate because it had too many pointers; almost all of
them were cached.

To fix this, we have:
- want_cached_ptr(), a new helper that checks if we even want a cached
  pointer (is on appropriate target, device is readable).

- bch2_extent_set_ptr_cached() now only sets a pointer cached if we want
  it.

- bch2_extent_normalize_by_opts() now ensures that we only have a single
  cached pointer that we want.

While working on this, it was noticed that this doesn't work well with
reflinked data and per-file options. Another patch series is coming that
plumbs through additional io path options through bch_extent_rebalance,
with improved option handling.

Reported-by: Reed Riley <reed@riley.engineer>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-29 06:34:10 -04:00
Kent Overstreet
d5c5b337f8 bcachefs: Don't drop devices with stripe pointers
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21 11:39:49 -04:00
Kent Overstreet
54a12984a9 bcachefs: EIO errcode cleanup
We want to be using private errcodes whenever possible, for better error
messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21 11:39:48 -04:00
Kent Overstreet
805ddc2042 bcachefs: bch2_dev_rcu_noerror()
bch2_dev_rcu() now properly errors if the device is invalid

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-21 11:39:48 -04:00
Kent Overstreet
52df04f039 bcachefs: More BCH_SB_MEMBER_INVALID support
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Kent Overstreet
df88febc20 bcachefs: Simplify bch2_bkey_drop_ptrs()
bch2_bkey_drop_ptrs() had a some complicated machinery for avoiding
O(n^2) when dropping multiple pointers - but when n is only going to be
~4, it's not worth it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:46 -04:00
Kent Overstreet
d26935690c bcachefs: Fix bch2_extents_match() false positive
This was caught as a very rare nonce inconsistency, on systems with
encryption and replication (and tiering, or some form of rebalance
operation running):

[Wed Jul 17 13:30:03 2024] about to insert invalid key in data update path
[Wed Jul 17 13:30:03 2024] old: u64s 10 type extent 671283510:6392:U32_MAX len 16 ver 106595503: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:104 gen 7 ptr: 4:513244:48 gen 6 rebalance: target hdd compression zstd
[Wed Jul 17 13:30:03 2024] k:   u64s 10 type extent 671283510:6400:U32_MAX len 16 ver 106595508: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:112 gen 7 ptr: 4:513244:56 gen 6 rebalance: target hdd compression zstd
[Wed Jul 17 13:30:03 2024] new: u64s 14 type extent 671283510:6392:U32_MAX len 8 ver 106595508: durability: 2 crc: c_size 8 size 16 offset 0 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 3:355968:112 gen 7 cached ptr: 4:513244:56 gen 6 cached rebalance: target hdd compression zstd crc: c_size 8 size 16 offset 8 nonce 0 csum chacha20_poly1305_80 compress zstd ptr: 1:10860085:32 gen 0 ptr: 0:17285918:408 gen 0
[Wed Jul 17 13:30:03 2024] bcachefs (cca5bc65-fe77-409d-a9fa-465a6e7f4eae): fatal error - emergency read only

bch2_extents_match() was reporting true for extents that did not
actually point to the same data.

bch2_extent_match() iterates over pairs of pointers, looking for
pointers that point to the same location on disk (with matching
generation numbers). However one or both extents may have been trimmed
(or merged) and they might not have the same disk offset: it corrects
for this by subtracting the key offset and the checksum entry offset.

However, this failed when an extent was immediately partially
overwritten, and the new overwrite was allocated the next adjacent disk
space.

Normally, with compression off, this would never cause a bug, since the
new extent would have to be immediately after the old extent for the
pointer offsets to match, and the rebalance index update path is not
looking for an extent outside the range of the extent it moved.

However with compression enabled, extents take up less space on disk
than they do in the btree index space - and spuriously matching after
partial overwrite is possible.

To fix this, add a secondary check, that strictly checks that the
regions pointed to on disk overlap.

https://github.com/koverstreet/bcachefs/issues/717

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-26 20:33:12 -04:00
Kent Overstreet
49aa783039 bcachefs: Fix rebalance_work accounting
rebalance_work was keying off of the presence of rebelance_opts in the
extent - but that was incorrect, we keep those around after rebalance
for indirect extents since the inode's options are not directly
available

Fixes: 20ac515a9c ("bcachefs: bch_acct_rebalance_work")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-24 10:16:21 -04:00
Kent Overstreet
2102bdac67 bcachefs: Extra debug for data move path
We don't have sufficient information to debug:

https://github.com/koverstreet/bcachefs/issues/726

- print out durability of extent ptrs, when non default
- print the number of replicas we need in data_update_to_text()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-19 21:42:08 -04:00
Kent Overstreet
d97de0d017 bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
bkey_fsck_err() was added as an interface that looks like fsck_err(),
but previously all it did was ensure that the appropriate error counter
was incremented in the superblock.

This is a cleanup and bugfix patch that converts it to a wrapper around
fsck_err(). This is needed to fix an issue with the upgrade path to
disk_accounting_v3, where the "silent fix" error list now includes
bkey_fsck errors; fsck_err() handles this in a unified way, and since we
need to change printing of bkey fsck errors from the caller to the inner
bkey_fsck_err() calls, this ends up being a pretty big change.

Als,, rename .invalid() methods to .validate(), for clarity, while we're
changing the function signature anyways (to drop the printbuf argument).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13 23:00:50 -04:00
Kent Overstreet
a2cb8a6236 bcachefs: Self healing on read IO error
This repurposes the promote path, which already knows how to call
data_update() after a read: we now automatically rewrite bad data when
we get a read error and then successfully retry from a different
replica.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:16 -04:00
Kent Overstreet
9d9d212e26 bcachefs: bch2_extent_crc_unpacked_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:16 -04:00
Kent Overstreet
9432e90df1 bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()
Turn more asserts into proper recoverable error paths.

Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10 13:17:16 -04:00
Kent Overstreet
65eaf4e24a bcachefs: s/bkey_invalid_flags/bch_validate_flags
We're about to start using bch_validate_flags for superblock section
validation - it's no longer bkey specific.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 16:23:36 -04:00
Kent Overstreet
99179fb898 bcachefs: Invalid devices are now checked for by fsck, not .invalid methods
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 16:23:36 -04:00
Kent Overstreet
a9422fd404 bcachefs: kill bch2_dev_bkey_exists() in bkey_pick_read_device()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
9cadb4ea56 bcachefs: bch2_extent_normalize() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
8e3cc2003f bcachefs: bch2_bkey_has_target() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
8feecbed24 bcachefs: extent_ptr_invalid() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
3858aa4268 bcachefs: ptr_stale() -> dev_ptr_stale()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
302c980a81 bcachefs: extent_ptr_durability() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
3793b3f91f bcachefs: bch2_extent_merge() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:23 -04:00
Kent Overstreet
23f308ae19 bcachefs: bch2_dev_safe() -> bch2_dev_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:22 -04:00
Kent Overstreet
9a768ab75b bcachefs: bch2_bkey_drop_ptrs() declares loop iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:22 -04:00
Kent Overstreet
9de40d77f0 bcachefs: Check for writing btree_ptr_v2.sectors_written == 0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:20 -04:00
Kent Overstreet
2f724563fc bcachefs: member helper cleanups
Some renaming for better consistency

bch2_member_exists	-> bch2_member_alive
bch2_dev_exists		-> bch2_member_exists
bch2_dev_exsits2	-> bch2_dev_exists
bch_dev_locked		-> bch2_dev_locked
bch_dev_bkey_exists	-> bch2_dev_bkey_exists

new helper - bch2_dev_safe

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:19 -04:00
Kent Overstreet
d155272b6e bcachefs: bucket_valid()
cut out a branch from doing it the obvious way

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:19 -04:00
Kent Overstreet
9abb6dd7ce bcachefs: Standardize helpers for printing enum strs with bounds checks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13 22:48:17 -04:00
Kent Overstreet
dc32c118ec bcachefs: fix unsafety in bch2_extent_ptr_to_text()
Need to check if we have a valid bucket before checking if ptr is stale

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13 22:48:16 -04:00
Kent Overstreet
4409b8081d bcachefs: Repair pass for scanning for btree nodes
If a btree root or interior btree node goes bad, we're going to lose a
lot of data, unless we can recover the nodes that it pointed to by
scanning.

Fortunately btree node headers are fully self describing, and
additionally the magic number is xored with the filesytem UUID, so we
can do so safely.

This implements the scanning - next patch will rework topology repair to
make use of the found nodes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-03 14:44:18 -04:00
Kent Overstreet
805b535a8a bcachefs: Check btree ptr min_key in .invalid
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31 20:36:11 -04:00
Kent Overstreet
d7e77f53e9 bcachefs: opts->compression can now also be applied in the background
The "apply this compression method in the background" paths now use the
compression option if background_compression is not set; this means that
setting or changing the compression option will cause existing data to
be compressed accordingly in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:10 -05:00