
Re: Smart or simple RAID recovery??

08 September 2010, 05:17 UTC

You're right that automatic recovery has some complex implications, but simply reporting the location of the error accurately is extremely useful diagnostic information. When I encounter an unexpected inconsistency during scrubbing, I would dearly like to know which drive is the most likely suspect! I have been fighting this problem unsuccessfully in a RAID-5 array (the RAID-10 on another partition of the same drives never shows any problems, sigh), and am hoping that a 3-way mirrored setup will allow better diagnostics.

I'd like to have voting implemented inside the RAID code for that reason alone.
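
To make the idea concrete, here is a minimal Python sketch of the kind of voting meant here. It is not md code: the function name `find_suspects` and the dict-of-bytes input are made up for illustration, and it assumes the copies of one block have already been read from every mirror.

```python
from collections import Counter

def find_suspects(copies):
    """Vote among mirror copies of one block.

    copies: dict mapping a device name to the bytes read from it
    (hypothetical input; the real RAID code does not work on dicts).
    Returns (winning_data, suspects). winning_data is None when no
    strict majority exists, in which case every device is suspect.
    """
    if not copies:
        raise ValueError("need at least one copy")
    counts = Counter(copies.values())
    data, votes = counts.most_common(1)[0]
    if votes * 2 <= len(copies):
        # Tie or no majority: the vote cannot single out a drive.
        return None, sorted(copies)
    # Every device that disagrees with the majority is a suspect.
    suspects = sorted(dev for dev, blk in copies.items() if blk != data)
    return data, suspects
```

With a 3-way mirror where only one device disagrees, `suspects` contains just that device. With a 2-way mirror that disagrees, the vote is a tie and both devices are reported, which is why a third copy helps diagnostics.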

It could also tell me which logical block(s) on the md device are suspect, so I can find any affected files and possibly apply a higher-level consistency check. (Comparison with backups, for example.)

You're right that always reading and checking a whole stripe would be very valuable. Actually, in RAID-1 and -10, you only need to read ceil(n/2) copies. If they agree, any additional ones would be superfluous.
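
A rough sketch of reading only as many copies as needed, again in Python and again hypothetical (the name `read_until_majority` and the list of reader callables are assumptions, not an existing interface). It stops as soon as a strict majority of the n copies agree. For odd n that threshold, n // 2 + 1, is the same as ceil(n/2); for even n it is one more, since ceil(n/2) agreeing copies could still be tied by the remaining ones.

```python
from collections import Counter

def read_until_majority(readers):
    """Read copies one at a time until a strict majority agree.

    readers: list of zero-argument callables, one per copy, each
    returning the block contents as bytes (hypothetical interface).
    Returns (data, reads_done); data is None if no majority exists.
    """
    n = len(readers)
    need = n // 2 + 1  # strict majority; equals ceil(n/2) when n is odd
    counts = Counter()
    for done, read in enumerate(readers, start=1):
        blk = read()
        counts[blk] += 1
        if counts[blk] >= need:
            # Enough agreeing copies: the remaining reads are superfluous.
            return blk, done
    return None, n
```

In a healthy 3-way mirror this returns after two reads; the third copy is only read when the first two disagree.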



