
BLU Discuss list archive



How do hard drives handle bad blocks nowadays?



Thanks a lot for your very informative response.  I'll have to read 
through the man-pages for hdparm and smartctl.

    Mark

On 4/3/2011 5:57 PM, Chuck Anderson wrote:
> On Sun, Apr 03, 2011 at 05:00:27PM -0400, MBR wrote:
>> It's now two decades later, and I'm trying to understand what's changed
>> since then.  In particular I recently cloned a laptop drive (IDE) to a
>> new drive.  When I did so, I encountered 2 bad blocks on the new drive.
>> Based on my recollection from the late 1980s, I didn't think 2 bad
>> blocks was a big deal because I assumed I could manually enter their
>> addresses into the bad block list and they'd be replaced by spare
>> blocks.  But I haven't managed to find a tool to allow me to examine
>> and/or edit the bad block list.
> Modern ATA (IDE) drives do this remapping automatically, and
> transparently to the host system--the LBA block number stays the same,
> but the underlying physical sector is moved by the drive firmware to a
> spare sector that was reserved for this purpose.  Apparently, this
> feature can be turned on and off with hdparm -D.
>
> SCSI drives can also do this, and may be configured with this turned
> off by default since they are expected to be used in RAID arrays and
> servers that would handle this disk management on a higher level.
>
>> After doing some web searches and a bit of reading on this, I get the
>> impression that nowadays all modern drives implement S.M.A.R.T.
>> (Self-Monitoring, Analysis, and Reporting Technology) and that using
>> S.M.A.R.T. they all handle this behind the scenes.  If that's true, then
>> presumably the only time I should ever see a disk report a bad block is
>> when there are no more spare blocks left.  Am I right about that?
> A sector that cannot be read is remapped only on write, not on read.
> This is so that you can keep trying to read a bad block in the hope
> that you might eventually recover the data with a good or partially
> good read.  Once you write to the sector, the drive attempts the
> reallocation.  After it is reallocated, there is no easy way to get at
> the old sector's data--it is effectively orphaned on the disk.  (If
> that old sector happened to have sensitive data on it, there is now no
> way for you to erase it, hence the development of Anti-Forensic
> Splitting for use with encryption schemes such as LUKS to mitigate
> this issue.)
>
> I've had drives that were stubborn about reallocating automatically
> with "normal" overwrites.  I had to poke the sectors manually with
> hdparm:
>
> # check if it's really bad
> hdparm --read-sector <sector-number> /dev/foo
> # repair (reallocate) the bad sector; this overwrites it with zeros
> hdparm --yes-i-know-what-i-am-doing --write-sector <sector-number> /dev/foo
>
>> If so, then the fact that I encountered write errors on two blocks on
>> the drive suggests that the brand new drive was in pretty bad shape to
>> begin with.
> Check smartctl -a /dev/foo and look for "pending" and "reallocated"
> sectors.  I usually replace a disk once it starts getting any of
> those.  A new disk shouldn't have any IMO, and I'd RMA it if that were
> the case.  I do have some older drives that were given to me that have
> 1 or 2 reallocated sectors that I might use for scratch storage as
> long as the pending or reallocated counts don't keep increasing.
>
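A minimal sketch of pulling those two counters out of the attribute table. It assumes the usual ATA layout of `smartctl -A` output, where the attribute name is the second column and the raw value is the tenth. The function name `smart_bad_counts` is made up for illustration, and attribute names can differ between vendors.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): reads `smartctl -A`
# output on stdin and prints the raw values of the reallocated and
# pending sector attributes as NAME=VALUE lines.
# Assumption: attribute name is field 2 and the raw value is field 10.
smart_bad_counts() {
    awk '$2 == "Reallocated_Sector_Ct" || $2 == "Current_Pending_Sector" {
        print $2 "=" $10
    }'
}
```

It would be used as `smartctl -A /dev/foo | smart_bad_counts`. Anything other than zero on a new drive is the case described above.
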
>> Is there some tool that will allow me to examine the disk's bad block list?
> For ATA, I'm not aware of how to examine the defect list.  For SCSI,
> you can use sdparm or sg3_utils.  smartctl -a will at least tell you
> how many have been reallocated.
>
> I usually do the following to test suspect drives:
>
> smartctl -l selftest /dev/foo  # look for existing test results
> smartctl -t short /dev/foo     # do a quick test
> smartctl -l selftest /dev/foo  # look at the results
> smartctl -t long /dev/foo      # do a long test (could take an hour or more)
> smartctl -l selftest /dev/foo  # look at the results
>
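The "look at the results" step can be scripted. The sketch below assumes the ATA self-test log format, in which the most recent test is the line numbered `# 1` and a clean run is reported as "Completed without error". The function name `last_selftest_ok` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# exits 0 only if the most recent entry ("# 1") completed without error.
# Assumption: ATA log layout; other device types format this differently.
last_selftest_ok() {
    awk '/^# *1 / {
             found = 1
             ok = ($0 ~ /Completed without error/)
         }
         END { exit !(found && ok) }'
}
```

For example: `smartctl -l selftest /dev/foo | last_selftest_ok && echo "last test clean"`. A test that is still running, or an empty log, counts as not OK.
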
>> Also, should I use 'dd' to test all blocks before I put a drive into
>> service, or is there a better tool out there?
> Besides the above tests, I've often used dd for reading and writing
> the entire drive as an extra sanity test, and to force overwrites and
> possibly reallocate any bad sectors:
>
> dd if=/dev/zero of=/dev/foo bs=32M   # write pass: DESTROYS all data on /dev/foo
> dd if=/dev/foo of=/dev/null bs=32M   # read pass
>
> In another window:
>
> while true; do killall -USR1 dd; sleep 10; done
>
> Watch the first window for once-per-10-second status updates from dd
> :-)
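
The read pass above only shows that every block can be read. After a zero-fill, it is also possible to confirm that the data comes back as zeros. This is a sketch under stated assumptions, not something from the original message: it assumes GNU `cmp` (for `-n`) and util-linux `blockdev`, and the function name `reads_back_zero` is made up.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every byte of the given block
# device or regular file reads back as zero.
# Assumptions: GNU cmp (-n LIMIT) and, for block devices, blockdev.
reads_back_zero() {
    target=$1
    if [ -b "$target" ]; then
        size=$(blockdev --getsize64 "$target") || return 2
    else
        size=$(wc -c < "$target") || return 2
    fi
    # Compare exactly $size bytes against the endless stream of zeros;
    # without -n, cmp would hit EOF on the target and report a mismatch.
    cmp -s -n "$size" "$target" /dev/zero
}
```

Run as `reads_back_zero /dev/foo && echo "all zeros"` once the write pass has finished.
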






BLU is a member of BostonUserGroups
We also thank MIT for the use of their facilities.




Boston Linux & Unix / webmaster@blu.org