Forcing a hard disk to reallocate bad sectors

Sometimes a hard disk hints at an upcoming failure. Some disks start to make unexpected sounds; others stay silent and only cause some noise in your syslog. In most cases the disk will automatically reallocate one or two damaged sectors, and you should start planning to buy a new disk while your data is still safe. However, sometimes the disk won’t reallocate these sectors automatically and you’ll have to do it yourself. Luckily, this isn’t rocket science.

A few days ago, one of my disks reported some problems in my syslog while rebuilding a RAID5-array:

Jan 29 18:19:54 dragon kernel: [66774.973049] end_request: I/O error, dev sdb, sector 1261069669
Jan 29 18:19:54 dragon kernel: [66774.973054] raid5:md3: read error not correctable (sector 405431640 on sdb6).
Jan 29 18:19:54 dragon kernel: [66774.973059] raid5: Disk failure on sdb6, disabling device.

Jan 29 18:20:11 dragon kernel: [66792.180513] sd 3:0:0:0: [sdb] Unhandled sense code
Jan 29 18:20:11 dragon kernel: [66792.180516] sd 3:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 29 18:20:11 dragon kernel: [66792.180521] sd 3:0:0:0: [sdb] Sense Key : Medium Error [current] [descriptor]
Jan 29 18:20:11 dragon kernel: [66792.180547] sd 3:0:0:0: [sdb] Add. Sense: Unrecovered read error - auto reallocate failed
Jan 29 18:20:11 dragon kernel: [66792.180553] sd 3:0:0:0: [sdb] CDB: Read(10): 28 00 4b 2a 6c 4c 00 00 c0 00
Jan 29 18:20:11 dragon kernel: [66792.180564] end_request: I/O error, dev sdb, sector 1261071601
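
If the log is long, picking the failing sector numbers out by eye gets tedious. Here is a minimal sketch that pulls them out of kernel messages like the ones above. The function name failed_sectors is made up for this example, and it assumes the "I/O error, dev X, sector N" wording shown in this log; other kernel versions may word the message differently.

```shell
# Print the unique sector numbers from kernel "I/O error" lines for one device.
# Reads log text on stdin; $1 is the device name as it appears in the log (e.g. sdb).
failed_sectors() {
    sed -n "s/.*I\/O error, dev $1, sector \([0-9][0-9]*\).*/\1/p" | sort -un
}
```

Feed it the log on stdin, for example: dmesg | failed_sectors sdb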

Modern hard disk drives are equipped with a small number of spare sectors to which damaged sectors can be reallocated. However, a sector only gets reallocated when a write operation fails. A failing read operation will, in most cases, only throw an I/O error. In the unlikely event that a second read does succeed, some disks perform an auto-reallocation and the data is preserved. In my case, the second read failed miserably (“Unrecovered read error - auto reallocate failed”).
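
The "CDB: Read(10)" line in the log shows the actual request that hit the bad spot. As a rough sketch, the ten bytes can be decoded by hand: in a READ(10) command, bytes 2 to 5 are the starting LBA and bytes 7 and 8 are the number of sectors, both big-endian. The function name decode_read10 is invented for this example.

```shell
# Decode a READ(10) CDB given as ten hex bytes (as printed in the kernel log).
decode_read10() {
    # $3..$6 are CDB bytes 2-5: the big-endian starting LBA.
    start=$(( 0x$3$4$5$6 ))
    # $8 and $9 are CDB bytes 7-8: the big-endian transfer length in sectors.
    count=$(( 0x$8$9 ))
    echo "start=$start count=$count end=$(( start + count - 1 ))"
}

decode_read10 28 00 4b 2a 6c 4c 00 00 c0 00
# prints: start=1261071436 count=192 end=1261071627
```

So the kernel asked for 192 sectors in one go, and the failing sector 1261071601 from the log falls inside that range.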

The read errors were caused by a sync of a new RAID5 array, which was initially running in degraded mode (on /dev/sdb and /dev/sdc, with /dev/sdd missing). Obviously, mdadm kicked sdb out of the already degraded RAID5-array, leaving nothing but sdc. That’s not something to be very happy about…

The only solution to this problem was to force sdb to reallocate the damaged sectors. That way, mdadm wouldn’t encounter the read errors and the initial sync of the array would succeed. A tool like hdparm can force a disk to reallocate a sector by simply issuing a write command to the damaged sector. First, check the number of reallocated sectors on the disk:

$ smartctl -a /dev/sdb | grep -i reallocated

5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0

The zeroes at the end of the lines indicate that there are no reallocated sectors on /dev/sdb. Let’s check whether sector 1261069669 is really damaged:

$ hdparm --read-sector 1261069669 /dev/sdb

/dev/sdb: Input/Output error

Now, issue the write command to the damaged sector(s). hdparm completely bypasses the regular block layer read/write mechanisms, and the data in these sectors will be lost forever!

$ hdparm --write-sector 1261069669 /dev/sdb

/dev/sdb:
Use of --write-sector is VERY DANGEROUS.
You are trying to deliberately overwrite a low-level sector on the media
This is a BAD idea, and can easily result in total data loss.
Please supply the --yes-i-know-what-i-am-doing flag if you really want this.

Program aborted.

$ hdparm --write-sector 1261069669 --yes-i-know-what-i-am-doing /dev/sdb

/dev/sdb: re-writing sector 1261069669: succeeded

$ hdparm --write-sector 1261071601 --yes-i-know-what-i-am-doing /dev/sdb

/dev/sdb: re-writing sector 1261071601: succeeded

Now, use hdparm again to check the availability of the reallocated sectors:

$ hdparm --read-sector 1261069669 /dev/sdb

/dev/sdb:
reading sector 1261069669: succeeded
(a lot of zeroes should follow)
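
With more than a couple of sectors, repeating this re-read by hand gets old quickly. Below is a small read-only sketch that loops over a list of sectors. The function name check_sectors is invented here, and it decides success by looking for the word "succeeded" in hdparm's output, as in the transcript above; treat that match as an assumption about your hdparm version.

```shell
# Re-read each given sector and report whether the drive returns it.
# $1 is the device, the remaining arguments are sector numbers.
# This only reads; it never writes to the disk. Needs root like any hdparm call.
check_sectors() {
    dev=$1
    shift
    for sector in "$@"; do
        # Look for the success message that hdparm prints after a good read.
        if hdparm --read-sector "$sector" "$dev" 2>&1 | grep -q 'succeeded'; then
            echo "$sector ok"
        else
            echo "$sector FAILED"
        fi
    done
}
```

For the two sectors in this post that would be: check_sectors /dev/sdb 1261069669 1261071601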

And using SMART we can check whether the disk has registered two reallocated sectors:

$ smartctl -a /dev/sdb | grep -i reallocated

5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       2
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
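
To compare the count before and after in a script, only the raw value in the last column matters. A minimal sketch, assuming the ten-column attribute table that smartctl printed above; the function name smart_raw is made up for this example.

```shell
# Print the raw value of one SMART attribute.
# Reads smartctl attribute output on stdin; $1 is the attribute name.
smart_raw() {
    # Column 2 is the attribute name, column 10 the raw value.
    awk -v name="$1" '$2 == name { print $10 }'
}
```

Usage would look like: smartctl -A /dev/sdb | smart_raw Reallocated_Sector_Ct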

It’s actually quite simple to force mdadm to continue using sdb as if nothing ever happened:

$ mdadm --assemble --force /dev/md3 /dev/sdb6 /dev/sdc6

(mdadm will complain about being forced to increase the event counter of sdb6)

$ mdadm /dev/md3 --add /dev/sdd6

And a few minutes later, the array is as good as new!
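
While the rebuild runs, the kernel reports its progress in /proc/mdstat. As a small sketch, this filter prints just the progress figure. The function name md_progress is invented here, and the pattern assumes the usual "recovery = N%" or "resync = N%" wording in that file.

```shell
# Print the rebuild progress from /proc/mdstat-style text on stdin.
md_progress() {
    grep -Eo '(recovery|resync) *= *[0-9.]+%'
}
```

Run it as: md_progress < /proc/mdstat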

9 thoughts on “Forcing a hard disk to reallocate bad sectors”

  1. Mateusz Korniak

    Hi, great tutorial !

    First, I think it’s worth mentioning that it is possible to kick only affected partition from array by comparing LBA of sector with output of sfdisk -d /dev/sdd.

    Second, I wonder if full sync should be forced before adding partition back to array ?
    So I would execute:

    mdadm --zero-superblock /dev/sdd6

    before:

    mdadm /dev/md3 --add /dev/sdd6

    We do not want a partially zeroed /dev/sdd6 to become part of the working array without a full resync (due to the use of a bitmap).

    Regards,

  2. Carlos D. Garza

    Great article. It did a great job of demystifying bad sector relocation for me. My issue wasn’t raid related so I was a little more desperate to get the drive into a bootable state just long enough to move the important data off of it. It worked so thanks again.

  3. Reik Red

    Great trick to use hdparm to write the sector.

    I have a case where using dd to overwrite the sector produces an I/O error and no reallocation, but using hdparm works and forces the reallocation. Amazing.
