
BLU Discuss list archive



[Discuss] On Btrfs raid and odd-count disks



> From: discuss-bounces+blu=nedharvey.com at blu.org [mailto:discuss-
> bounces+blu=nedharvey.com at blu.org] On Behalf Of Derek Atkins
> 
> > ZFS prevents write holes by enforcing atomicity of all writes to
> > storage. It does this by controlling all of the I/O caching involved in
> > the write process from system RAM down to the write acceleration cache
> > on the disks themselves. ZFS updates the file system only after all
> > cache points have confirmed being flushed.
> >
> > If any of these points lie about their status then write holes can
> > appear under power fault conditions. 

True, but at least with ZFS and Btrfs, any subsequent read of the corrupt data will be detected, thanks to checksums.

Also, since we're talking about redundant storage, ZFS (and presumably Btrfs, since the same approach is obvious) will attempt to correct the error.  If a single disk (or any number of disks up to your redundancy level) wrote corrupt data, or no data at all, the checksum fails, and the filesystem will try every combination of excluding devices and re-reading, to identify which device(s) returned corrupt data.  If it finds a combination that produces a good checksum, it rewrites the data to whichever disk(s) failed.
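To make that concrete, here's a toy sketch of the "exclude a device and re-read" idea, modeling a single stripe of a RAID5/raidz1-style layout with XOR parity.  Every name and the layout itself are invented for illustration; real ZFS does this per block inside the pool with its own checksum and parity code:

import hashlib

def xor_blocks(blocks):
    """XOR equal-length byte strings together (toy parity math)."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def self_healing_read(stripe, expected_cksum):
    """stripe: list of N byte strings; stripe[0..N-2] are data chunks and
    stripe[-1] is the XOR parity.  Returns (data, index_of_bad_device)."""
    data = b"".join(stripe[:-1])
    if hashlib.sha256(data).hexdigest() == expected_cksum:
        return data, None                  # fast path: everything checks out
    # Checksum failed: assume each data device in turn returned garbage,
    # rebuild its chunk from the other members via parity, and re-check.
    for bad in range(len(stripe) - 1):
        rebuilt = xor_blocks([c for i, c in enumerate(stripe) if i != bad])
        trial = list(stripe)
        trial[bad] = rebuilt
        data = b"".join(trial[:-1])
        if hashlib.sha256(data).hexdigest() == expected_cksum:
            return data, bad               # 'bad' is the disk to rewrite
    raise IOError("more corruption than one parity device can cover")

With double-parity redundancy you'd extend the loop to pairs of devices, which is exactly the "all possible combinations" behavior described above.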


> Fair enough...  I don't know if standard (e.g. DM-level) RAID5 or RAID6
> provide for said "scrubbing"?  

Nope.
That kind of scrubbing is only possible thanks to checksumming at the RAID level.  Without it, your RAID depends on the underlying devices to correctly report errors, and if an error isn't noticed by the hardware and escalated to the OS, it passes through standard RAID undetected.

How often does that happen?  Well, in my experience, heavy usage on several TB of enterprise SATA hardware produces a bit error roughly once every one to two years, identified by the ZFS cksum counter incrementing while the hard drive's own error counter does not.  That means the error passed through the drive undetected, and was caught and corrected by ZFS.
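For what it's worth, a scrub is essentially that same check run proactively over every block.  A parity-only scrub on a conventional RAID can notice that data and parity disagree, but without a checksum it can't say which side is right.  Here's a rough sketch (invented names, not ZFS internals) of why the checksum, rather than the drive's own error reporting, is what catches these:

import hashlib

def scrub(stripes):
    """stripes: iterable of (copies, stored_cksum), where copies is a list of
    redundant reads of the same block.  Returns (cksum_errors, unrecoverable)."""
    cksum_errors = unrecoverable = 0
    for copies, stored in stripes:
        good = [c for c in copies if hashlib.sha256(c).hexdigest() == stored]
        if len(good) == len(copies):
            continue                  # block verified, nothing to do
        cksum_errors += 1             # the drive reported no error; the stored
                                      # checksum is what caught the mismatch
        if good:
            pass                      # a real fs would now rewrite good[0]
                                      # over the copies that failed
        else:
            unrecoverable += 1        # every copy is bad
    return cksum_errors, unrecoverable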






