More on data integrity: Enter Btrfs!

Those who read my previous post know that my number one concern these days is disk data integrity. As disks become bigger and cheaper, their uncorrectable error rates are not improving. That and the other failure points in the chain tempt Murphy to come out from behind the curtain of denial. It’s getting scary (read my last post for my experiences with this).

The current mainstream Linux filesystems do not address the problem: ext3 does no checksumming, and ext4 adds only journal checksums (but not data block checksums). Your data is still vulnerable to silent corruption.
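To make "data block checksums" concrete, here is a minimal Python sketch of the idea. It is not how Btrfs, ZFS or ext4 actually implement anything: the 4096-byte block size, the function names and the use of the standard library's `zlib.crc32` are all assumptions chosen for illustration. A checksum is stored for each block at write time, and a later read recomputes it to spot a block that changed without the disk reporting any error.

```python
import zlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


def checksum_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return one CRC-32 per fixed-size block of data."""
    return [
        zlib.crc32(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def find_corrupt_blocks(data: bytes, expected: list[int],
                        block_size: int = BLOCK_SIZE) -> list[int]:
    """Return indices of blocks whose current CRC-32 differs from the stored one."""
    actual = checksum_blocks(data, block_size)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


# Write time: compute and keep a checksum for every block.
original = bytes(range(256)) * 64          # 16 KiB of sample data, i.e. 4 blocks
stored_checksums = checksum_blocks(original)

# Later: a single bit flips on disk and no I/O error is reported.
damaged = bytearray(original)
damaged[5000] ^= 0x01                      # byte 5000 lies in block 1

print(find_corrupt_blocks(original, stored_checksums))        # []
print(find_corrupt_blocks(bytes(damaged), stored_checksums))  # [1]
```

Without the stored checksums, the damaged copy would simply be handed back to the application as if nothing had happened. A journal-only checksum does not help here, because it says nothing about data blocks once they have been written.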

Sun’s interesting ZFS filesystem had seemed like the sole bright spot on the horizon, but the licensing issues that prevent its inclusion in the Linux kernel are disheartening – we cannot just sit and wait for that to change; and we don’t have to!

There is a new game in town that has a lot of promise. It comes from Chris Mason at Oracle, and it is called “Btrfs”. I had noticed this project a few months ago, but I did not entertain the idea of trying it because it seemed to be so far off in the future – a great idea, but not anywhere near usable. Well, either I got the wrong impression, or things have changed. Quite a lot of Btrfs works, and in my limited testing, it works damn well. It seems quite fast too, compared to what I experienced when trying out ZFS-FUSE. Btrfs runs inside the Linux kernel rather than in userspace through FUSE, you see, which gives it a big performance advantage and allows it more access to the hardware as well.

Recent Googling revealed that not only are people giving this new fs a spin, but there are already assorted ebuilds out there in various overlays (not surprising, since building Btrfs and its utilities is a snap). I quickly cooked up my own ebuilds (for the kernel module and the utilities), and they are now in Gentoo:

sys-fs/btrfs
sys-fs/btrfs-progs

There are some actual kernel source patches available out there, whereas my ebuild builds Btrfs as a separate module; at this early stage it is probably handier to be able to insert Btrfs into any kernel and upgrade it separately. Now, please note that there is a big warning on the Btrfs site (which I also put in the ebuild):

WARNING: Btrfs is under heavy development, and is not suitable for any uses other than benchmarking and review. The Btrfs disk format is not yet finalized.

So yes, it’s experimental, and upgrades can change the disk format (meaning you’ll have to re-format and re-populate). So call me crazy: I created a big partition and copied my 104GB home directory to a Btrfs filesystem, and I’m playing with it now as I type this. I will be keeping very recent backups, of course, but this is really very cool. I figure I need to put myself out there and help by giving it some serious testing.

I won’t go into details about what Btrfs offers in terms of features (just visit the site), but it appears to be aimed directly at ZFS’s feature set while better fitting into the “Linux way” of doing disk management. The big thing for me is the data checksumming – not having this feels a little like flying blind with no safety net. I am really excited that something is being done to address my number one issue. Go Btrfs!

One thought on “More on data integrity: Enter Btrfs!”

  1. Indeed, I am quite excited about Btrfs as well. I just tried your ebuilds and they work quite well. This will be the solution for today's large hard disks and often shoddy hardware RAID implementations.

    I wish more kernel gurus would help speed this awesome FS into mainline.
