Boston Linux & UNIX was originally founded in 1994 as part of The Boston Computer Society. We meet on the third Wednesday of each month at the Massachusetts Institute of Technology, in Building E51.

BLU Discuss list archive



Reminder -- RAID 5 is not your friend



Dan Ritter wrote:
> RAID 5 is not your friend.

It depends. Most current systems will do RAID 6 now, so it's probably
moot. Anyhow, read on...

> A server with a mirrored setup for system disks and a RAID 5 for
> storage reported a disk gone bad in the storage system. OK, the
> alert is received, and we plan to replace the disk in the
> morning.
> 
> Before we can get around to it, another disk in the storage
> system also dies. Poof.

Typically, this is caused by bad blocks on the spare disk that the
system rebuilds onto. The system starts to rebuild on the spare,
encounters a bad block, and the rebuild dies. This seems typical of
lower-end SATA RAIDs. Many enterprise-level hardware RAID controllers
with SATA will let you schedule 'bad-block scrubs': during the scheduled
window, the controller scans every disk in the system, including the
spare, for potentially bad blocks. This helps avoid the type of failure
described above. For obvious reasons it can't eliminate the risk
altogether, but it minimizes the likelihood of it happening and makes
RAID 5 that much more reliable on SATA. Even so, I recommend at least
RAID 6 on SATA.
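As a rough back-of-envelope sketch of why a bad block during a rebuild is a real risk on large SATA disks: a RAID 5 rebuild has to read every surviving disk end to end, so the chance of hitting at least one unreadable block grows with the amount of data read. The function name, the 1-in-10^14-bits error rate (a figure commonly quoted on consumer SATA spec sheets), and the assumption that errors are independent are all mine, not anything from the controller vendors, so treat the result as an illustration rather than a prediction.

```python
import math

def rebuild_ure_probability(surviving_disks, disk_tb, ure_per_bit=1e-14):
    """Estimate the chance of at least one unreadable block during a rebuild.

    Assumptions (illustrative only): every surviving disk is read in full,
    read errors are independent, and ure_per_bit is the per-bit error rate.
    """
    # Total bits that must be read from the surviving disks (1 TB = 1e12 bytes).
    bits_read = surviving_disks * disk_tb * 1e12 * 8
    # P(at least one error) = 1 - (1 - p)^bits, computed with log1p/expm1
    # so the tiny per-bit probability does not get lost to rounding.
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

if __name__ == "__main__":
    # Hypothetical 4-disk RAID 5 of 1 TB disks with one failed member.
    p = rebuild_ure_probability(surviving_disks=3, disk_tb=1.0)
    print(f"Estimated chance of a read error during rebuild: {p:.1%}")
```

Under the same assumptions, larger disks or more members push the figure up quickly, which is the argument for regular scrubs and for RAID 6, where a single unreadable block during a rebuild can still be reconstructed from the second parity.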

Rant: this sort of config likely isn't possible with the vendor-embedded
RAID controllers that come with typical HP/Dell/IBM server hardware. In
reality I would never recommend using those for anything more than
mirroring internal disks. Enterprise storage for critical data needs to
be purpose-built hardware that costs more than your server did (orders
of magnitude more). To the best of my knowledge, Buffalo, Netgear, and
PlaySkool don't make enterprise-level RAID hardware. If you're hacking
together some white-box homemade solution, or buying something with a
name you've only ever seen in consumer-level products, you're building a
tree fort, and you should expect tree-fort-level results. Assess how
much you're going to lose in productivity per hour that this device is
inaccessible, and then evaluate how much it's worth to NOT have that
happen.
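The assessment above is simple arithmetic, and a minimal sketch might look like the following. The function names and every dollar and hour figure are hypothetical placeholders; plug in your own numbers.

```python
def downtime_cost(hourly_loss, outage_hours):
    """Productivity lost while the storage device is inaccessible."""
    return hourly_loss * outage_hours

def upgrade_pays_off(extra_hardware_cost, hourly_loss, outage_hours_avoided):
    """True if the outage cost avoided is at least the extra hardware spend."""
    return downtime_cost(hourly_loss, outage_hours_avoided) >= extra_hardware_cost

if __name__ == "__main__":
    # Hypothetical figures: $2,000/hour of lost productivity, a 16-hour
    # outage avoided, and $20,000 extra for purpose-built storage.
    print(downtime_cost(2000, 16))
    print(upgrade_pays_off(20000, 2000, 16))
```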

Grant M.
-- 
Grant Mongardi
Senior Systems Engineer
NAPC

gmongardi-cGmSLFmkI3Y at public.gmane.org
http://www.napc.com/
blog.napc.com
781.894.3114 phone
781.894.3997 fax

NAPC | technology matters








BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org