The Rust implementation is multithreaded, generally performs better, and
uses custom compression of btree nodes to achieve much better compression
ratios. unpack also checksums the expanded metadata to validate it.
The format version has jumped to 3 with no backwards compatibility, but I
think that's ok since we never made a release that contained the C++
version of these tools.
Benchmarks
==========
On an 8 core, 16 hyperthread machine.
metadata 1G, full:
        Pack size    pack time    unpack time
------------------------------------------------------
C++     193M         50.3s        6.9s (no verify)
Rust    70M          1.4s         1.8s (verify)
metadata 16G, sparse:
        Pack size    pack time    unpack time
------------------------------------------------------
C++     21M          68s          1s (no verify)
Rust    4M           8.6s         0.5s (verify)
Encapsulate the file descriptor in an object, to ensure that the fd is
closed properly when an exception is raised, e.g., when the block_cache
throws an exception during the block_manager's construction.
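The idea can be sketched as a small RAII wrapper. This is a minimal
illustration only: the names file_descriptor and failing_manager are
hypothetical and are not taken from the patch. Because the wrapper is a
fully constructed member by the time the enclosing constructor throws, its
destructor runs during unwinding and the fd is closed.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <stdexcept>

// Hypothetical RAII wrapper: owns one fd and closes it on destruction.
class file_descriptor {
public:
	explicit file_descriptor(int fd) : fd_(fd) {
		if (fd_ < 0)
			throw std::runtime_error("couldn't open file");
	}

	// Non-copyable, so the fd is never closed twice.
	file_descriptor(file_descriptor const &) = delete;
	file_descriptor &operator =(file_descriptor const &) = delete;

	// Moving transfers ownership and leaves the source empty.
	file_descriptor(file_descriptor &&other) noexcept : fd_(other.fd_) {
		other.fd_ = -1;
	}

	~file_descriptor() {
		if (fd_ >= 0)
			::close(fd_);
	}

	int get() const { return fd_; }

private:
	int fd_;
};

// Stand-in for a manager whose constructor fails after the fd was opened.
struct failing_manager {
	file_descriptor fd_;

	explicit failing_manager(char const *path)
		: fd_(::open(path, O_RDONLY)) {
		// fd_ is already constructed here, so it is destroyed
		// (and the fd closed) when this exception propagates.
		throw std::runtime_error("cache construction failed");
	}
};
```

With a raw int member the same throw would leak the descriptor, since the
enclosing object's destructor never runs for a partially constructed object.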
I hadn't realised that check_file_exists() also checked that it was
a regular file, which we don't want for the couple of uses I recently
added.
This patch adds an optional arg must_be_regular_file, and defaults
it to true, preserving the original behaviour. The recent additions
have this set to false.
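A rough sketch of the shape of this change, assuming a bool return value
and a stat()-based check. Only the function name and the
must_be_regular_file argument come from the patch; the return type and
error handling of the real function may differ.

```cpp
#include <sys/stat.h>
#include <string>

// Sketch only: returns false rather than reporting an error, which is an
// assumption made to keep the example small.
bool check_file_exists(std::string const &path,
		       bool must_be_regular_file = true)
{
	struct stat info;

	if (::stat(path.c_str(), &info) != 0)
		return false;	// missing, or can't be examined

	if (must_be_regular_file && !S_ISREG(info.st_mode))
		return false;	// exists, but is e.g. a device node

	return true;
}
```

Existing callers that pass only the path keep the regular file check, while
callers that need to accept block devices pass false explicitly.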
Repair was falling back to non-repair behaviour if it thought the roots
were ok. Now if --repair is specified the same dumping code is always
executed.
metadata_emitter is actually a visitor that passes on its data
to an encapsulated emitter object.
metadata_emitter -> metadata_emit_visitor
metadata_tree_emitter -> metadata_tree_emit_visitor
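The distinction behind the rename can be illustrated with a stripped-down
sketch. The interface below (emitter, single_map, recording_emitter,
mapping_emit_visitor) is hypothetical and much smaller than the real
classes; it only shows a visitor that holds an emitter and forwards what it
visits.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical emitter interface: knows how to output one mapping.
struct emitter {
	virtual ~emitter() = default;
	virtual void single_map(uint64_t origin_block, uint64_t data_block) = 0;
};

// Example emitter that records its output as text lines.
struct recording_emitter : emitter {
	std::vector<std::string> lines;

	void single_map(uint64_t origin_block, uint64_t data_block) override {
		lines.push_back(std::to_string(origin_block) + " -> " +
				std::to_string(data_block));
	}
};

// The visitor is called for each entry found while walking a tree and
// passes the data on to the encapsulated emitter.
class mapping_emit_visitor {
public:
	explicit mapping_emit_visitor(std::shared_ptr<emitter> e)
		: e_(std::move(e)) {
	}

	void visit(uint64_t key, uint64_t value) {
		e_->single_map(key, value);
	}

private:
	std::shared_ptr<emitter> e_;
};
```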
The first pass of the repair process scans all metadata working out the
largest orphan btrees. This scan doesn't use as much validation as
the btree_walk function which subsequently gets called.
This patch catches any exceptions thrown by the btree walk function
and removes that btree from consideration.
We've had a trickle of users who accidentally activate the same pool on a
VM and host at the same time. Typically the host doesn't do any IO, but
the kernel will still rewrite the superblock on shutdown. This leaves
the superblock pointing to very out of date btree roots and so we get
massive metadata loss.
This patch changes thin_repair, and thin_dump --repair. They now hunt
for the most recent, undamaged and consistent roots of the device and
mapping trees, and use that as the starting point of the repair.
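The selection step could be sketched as follows. Everything here is an
assumption made for illustration: the root_pair type, the use of a single
time field as the measure of recency, and the damaged and consistent flags,
which stand in for checks the real tools perform on the trees themselves.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical candidate: a device tree root paired with a mapping tree root.
struct root_pair {
	uint64_t dev_root;
	uint64_t mapping_root;
	uint32_t time;		// assumption: larger means more recent
	bool damaged;		// assumption: result of walking both trees
	bool consistent;	// assumption: the two trees agree with each other
};

// Returns the most recent undamaged, consistent candidate, or nullptr if
// there is none. The pointer refers into the vector passed in.
root_pair const *pick_best_roots(std::vector<root_pair> const &candidates)
{
	root_pair const *best = nullptr;

	for (auto const &c : candidates) {
		if (c.damaged || !c.consistent)
			continue;

		if (!best || c.time > best->time)
			best = &c;
	}

	return best;
}
```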
You need to apply doc/bm-journal.patch to create the journal.
thin_journal_check confirms that if the machine had crashed at any time
during the test run, no metadata corruption would have occurred.