RAID10

13 May 2004, 11:23 UTC

While it is quite possible to provide a RAID10 array by making some RAID1 arrays and combining them with RAID0, this approach has some problems.

One problem is that it is not as straightforward as creating a single array.

Another problem is that there is less flexibility in the layout. Interesting tricks, such as a RAID10 array with an odd number of devices, can be achieved with a unified RAID10 implementation.

So, it might be good to write a RAID10 module. This would be very similar to the RAID1 module, but with different mappings from logical address to physical address.

resync

Reconstructing a failed drive is fairly straightforward. It simply requires progressing through the blocks of the failed drive, doing a reverse-map to find the logical address, then a forward-map to find that block on another device. This block is read and scheduled for writing.
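As a rough sketch - not real md code; reverse_map(), find_good_copy() and schedule_copy() are hypothetical stand-ins for the address mapping and IO scheduling just described - the reconstruction loop would look something like this:

    #include <stdint.h>

    /* Hypothetical types and helpers - none of these are real md
     * interfaces, they only show the shape of the loop. */
    struct phys_addr { int dev; uint64_t sector; };

    extern uint64_t reverse_map(int dev, uint64_t sector);      /* physical -> logical */
    extern int find_good_copy(uint64_t logical, int skip_dev,
                              struct phys_addr *out);           /* logical -> surviving copy */
    extern void schedule_copy(const struct phys_addr *from,
                              int to_dev, uint64_t to_sector);  /* read, then queue a write */

    void reconstruct(int failed_dev, uint64_t dev_sectors, uint64_t chunk_sectors)
    {
        uint64_t s;

        for (s = 0; s < dev_sectors; s += chunk_sectors) {
            /* Which logical address lives at this spot on the failed drive? */
            uint64_t logical = reverse_map(failed_dev, s);
            struct phys_addr good;

            /* Forward-map to find that block on another device ... */
            if (find_good_copy(logical, failed_dev, &good) < 0)
                continue;        /* no surviving copy: the block is lost */

            /* ... read it there and schedule the write to the replacement. */
            schedule_copy(&good, failed_dev, s);
        }
    }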

Resyncing an array that may have errors (or checking an array for errors) is not quite so straightforward.

One approach would be to simply walk the logical address space linearly and fix or check each block. However, this might produce poor seek patterns on some drives. It can be a particular problem on arrays with an odd number of drives, as read-ahead and write-behind may be intermixed on the same drive, causing a lot of seeking.

When resyncing a mostly-good array, it is probably simplest to do a check-and-repair pass rather than a repair-everything pass; most IO would then be read-ahead and would be mostly linear. The occasional writes would just cause small delays.

However, when creating a new array, check-and-repair can be very slow. It might be best to recommend a zero-everything pass for such an array. This would be much faster than check-and-repair or repair-all.

geometry

There are two sorts of geometry that could be useful with RAID10.

One lays out mirrored sets of chunks in a RAID0 striped pattern. If there are n devices and each chunk is to have m copies, then m must be at most n. The m copies of chunk zero are laid out at the start of the first m devices. Then come the m copies of chunk 1 on subsequent devices, wrapping around into the next chunk-space of the first device once every device has received its first chunk. This degenerates to RAID0 if m is 1, and to RAID1 if m is n.
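A minimal sketch of the address arithmetic for this layout, assuming the copies are simply dealt out across the devices in order as described:

    #include <stdio.h>

    /* Copy k of logical chunk c is the i-th chunk-sized piece written to
     * the array, with i = c * m + k; pieces are dealt out across the n
     * devices in order.  Purely illustrative arithmetic. */
    static void map_striped(unsigned n, unsigned m,
                            unsigned long long c, unsigned k,
                            unsigned *dev, unsigned long long *slot)
    {
        unsigned long long i = c * m + k;

        *dev  = i % n;      /* which device this copy lands on */
        *slot = i / n;      /* which chunk-sized slot on that device */
    }

    int main(void)
    {
        unsigned long long c, slot;
        unsigned dev, k;

        /* 3 devices, 2 copies per chunk: the odd-number-of-devices case. */
        for (c = 0; c < 4; c++)
            for (k = 0; k < 2; k++) {
                map_striped(3, 2, c, k, &dev, &slot);
                printf("chunk %llu copy %u -> dev %u slot %llu\n", c, k, dev, slot);
            }
        return 0;
    }

With 3 devices and 2 copies this puts chunk 0 on devices 0 and 1, chunk 1 on device 2 and (wrapping into the next chunk-space) device 0, and so on. Setting m to 1 gives plain RAID0, and m equal to n gives plain RAID1.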

The other divides the extent of each device by some small number - p - and treats the array as p large segments, each starting on a different device but following a RAID0 layout within that segment. Mirrored chunks are then laid out in separate segments, one in each. This means that sequential reads can stay in one segment and get the full RAID0 benefit of multiple drives. Writes could be substantially slower due to the extra seeking needed.

A common arrangement would be to have p equal to m. If these were both 2, then each block would be stored once in the first half of one drive, and once in the second half of another drive. It would be like one RAID0 array followed by another.
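A sketch of one plausible mapping for this segmented layout; the rotation of one device per segment is an assumption, and other rotations would also fit the description above:

    #include <stdio.h>

    /* Copy j (0 <= j < p) of logical chunk c goes into segment j.  Each
     * segment is striped RAID0-fashion, rotated by j devices so that the
     * copies of a chunk end up on different devices.  seg_chunks is the
     * number of chunk-slots per segment on each device. */
    static void map_segmented(unsigned n, unsigned long long seg_chunks,
                              unsigned long long c, unsigned j,
                              unsigned *dev, unsigned long long *slot)
    {
        *dev  = (c + j) % n;                 /* rotated stripe */
        *slot = j * seg_chunks + c / n;      /* slot within segment j */
    }

    int main(void)
    {
        unsigned long long c, slot;
        unsigned dev, j;

        /* 3 devices, p = m = 2, 100 chunk-slots per segment per device. */
        for (c = 0; c < 4; c++)
            for (j = 0; j < 2; j++) {
                map_segmented(3, 100, c, j, &dev, &slot);
                printf("chunk %llu copy %u -> dev %u slot %llu\n", c, j, dev, slot);
            }
        return 0;
    }

For p = m = 2 this stores copy 0 of a chunk in the first segment of one drive and copy 1 in the second segment of the next drive, matching the arrangement just described.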

These two numbers, p and m, describe the layout of the array. Together with the number of devices, they are sufficient for all geometry mappings. They can be stored together in the 32-bit "layout" field of the md superblock.
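For example, one possible packing - the exact bit assignment here is only an assumption - would put m in the low byte and p in the next, leaving most of the field spare:

    #include <stdio.h>
    #include <stdint.h>
    #include <assert.h>

    /* One possible encoding, chosen only for illustration: m in the low
     * byte, p in the next byte, the rest of the field left spare. */
    static uint32_t pack_layout(unsigned p, unsigned m)
    {
        assert(p < 256 && m < 256);
        return ((uint32_t)p << 8) | m;
    }

    static void unpack_layout(uint32_t layout, unsigned *p, unsigned *m)
    {
        *m = layout & 0xff;
        *p = (layout >> 8) & 0xff;
    }

    int main(void)
    {
        unsigned p, m;
        uint32_t layout = pack_layout(2, 2);    /* the p = m = 2 arrangement above */

        unpack_layout(layout, &p, &m);
        printf("layout=0x%08x  p=%u m=%u\n", (unsigned)layout, p, m);
        return 0;
    }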



