Hello Neil.
Can you please comment on the following idea? (Please add whatever general or specific feedback you like.)
"Asymmetric Performance Mirroring - Taking advantage of Gen 1 SSDs in home computers." I'd like to realize the benefits of solid state drives (SSDs) while not losing the advantage of HDDs and also not falling victim to the serious drawbacks of SSDs. My goal is to create a cheap but ultraresponsive PC for home or business use. (Summary of SSD performance: Very good random read, moderate sequential read, adequate to poor random write, "stuttering").
What if an SSD were paired with an HDD in software RAID 1, with smart multiple device optimisation designed specifically for this asymmetry? It seems like mdadm is there already, or at least close!
Ideally, a [typical] user would install an SSD of size X on one SATA channel, and an HDD of size Y>X on a different SATA channel. The SSD would hold a single primary partition (sda0). The HDD would hold two partitions, the first of which is sized X (sdb0). sda0+sdb0 would be used to create the RAID 1 redundant array (sdl0), and this logical drive would have the OS and "some" applications installed. It would be considered the "performance" partition. sdb1 would hold media files, and larger applications if necessary.
Read Races
-----------
A read operation to any logical block of sdl0 would initially be sent to the SSD, and then the same request would be sent to the HDD. The manager would supply data from the SSD read without waiting for the HDD to complete (as I think the mdadm driver already does). A subsequent read operation can look at prior completion times. If it notes that the SSD won the last "race", it can continue with the same order: issue the cmd to the SSD first, then to the HDD. It can also cancel the request to the HDD if possible.

If the user is performing some sustained operations, or if the SSD gets into one of its stuttering states where completions are made to wait for internal garbage collection or erase events, then the HDD might win the time-to-completion race. At this point, it would be nice for the manager to further optimize by inverting the order in which it issues the identical commands to the two devices. The HDD is now picking up the slack for the SSD, and the end user sees the total benefit. After a sustained, bandwidth-sensitive user has stopped requesting and the queue has drained, the manager should see that the SSD is winning the individual races again and should invert the cmd issue priority once more (favoring random reads).
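To make the race bookkeeping concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how the md driver behaves: the class name `ReadRaceScheduler`, the device labels, and the completion times are all made up, and cancelling the slower request is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReadRaceScheduler:
    """Hypothetical helper: remembers which mirror member won the last read race."""

    # Devices in the order the next read will be issued; the SSD is favored initially.
    order: List[str] = field(default_factory=lambda: ["ssd", "hdd"])

    def issue_order(self) -> List[str]:
        """Return the order in which the next identical read should be issued."""
        return list(self.order)

    def record_race(self, completion_ms: Dict[str, float]) -> str:
        """Store the result of one race and return the winning device."""
        # sorted() is stable, so a tie keeps the current issue order.
        self.order = sorted(self.order, key=lambda dev: completion_ms[dev])
        return self.order[0]


# Simulated completion times (ms) for four identical reads sent to both devices.
races = [
    {"ssd": 0.1, "hdd": 8.0},   # normal random read: SSD wins
    {"ssd": 25.0, "hdd": 6.0},  # SSD stutters: HDD wins, order inverts
    {"ssd": 30.0, "hdd": 0.5},  # sustained read: HDD keeps priority
    {"ssd": 0.1, "hdd": 9.0},   # queue drained: SSD wins again
]
sched = ReadRaceScheduler()
for times in races:
    issued = sched.issue_order()
    winner = sched.record_race(times)
    print(f"issued {issued}, winner {winner}, next {sched.issue_order()}")
```

Using only the most recent race keeps the sketch short; a real manager would probably want some smoothing (for example a short moving average) so that one slow completion does not flip the order back and forth.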
Filtering Writes
-----------------
I'm not sure how this works now, but perhaps the manager can be adapted to help the SSD a little bit. It's been reported on other forums that a utility for Windows which caches smaller writes can reduce the "stuttering" of SSDs. Unfortunately, the utility hurts larger write performance. Perhaps the same race analogy can be used: the manager times write completions, and if it sees that the HDD starts to win, it simply queues up the writes to the SSD without blocking the user. Data is safe on the HDD if power is removed. Maybe the end user protects the data with a UPS. Maybe the manager has to write to a log in memory to indicate when a repair is needed? The desire here is that the SSD does not slow the system down.

My general prediction is that sometimes the SSD will complete a write first (there are spare flash blocks ready to be written in a well-maintained flash system) and sometimes the HDD will complete a write first (the heads are ideally located, and/or the HDD cache is not full, and/or the SSD requires internal flash read-modify-write management).
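A rough sketch of the write filtering idea, again in Python and again under assumptions: the two devices are modeled as plain dictionaries, `DeferredMirrorWriter` and its methods are hypothetical names, and the pending log lives only in memory, so it would be lost on power failure and the SSD would then need a repair from the HDD.

```python
from collections import OrderedDict


class DeferredMirrorWriter:
    """Hypothetical model: the HDD is always written; SSD writes may be deferred."""

    def __init__(self):
        self.hdd = {}                 # block number -> data on the HDD
        self.ssd = {}                 # block number -> data on the SSD
        self.pending = OrderedDict()  # in-memory log of blocks the SSD still lacks
        self.ssd_slow = False         # True while the HDD is winning write races

    def record_race(self, ssd_ms, hdd_ms):
        """Note who won the last timed write; the SSD is 'slow' if the HDD won."""
        self.ssd_slow = hdd_ms < ssd_ms

    def write(self, block, data):
        self.hdd[block] = data            # data is always safe on the HDD
        if self.ssd_slow:
            self.pending[block] = data    # queue for the SSD, do not block the user
        else:
            self.ssd[block] = data
            self.pending.pop(block, None)  # a newer write supersedes a queued one

    def read_source(self, block):
        """Blocks with a queued write must be read from the HDD to avoid stale data."""
        return "hdd" if block in self.pending else "ssd"

    def drain(self, max_blocks=None):
        """Flush queued writes to the SSD, oldest first; return how many were flushed."""
        flushed = 0
        while self.pending and (max_blocks is None or flushed < max_blocks):
            block, data = self.pending.popitem(last=False)
            self.ssd[block] = data
            flushed += 1
        return flushed


w = DeferredMirrorWriter()
w.write(1, b"first")                    # SSD is keeping up: both copies written
w.record_race(ssd_ms=40.0, hdd_ms=5.0)  # SSD starts to stutter
w.write(2, b"second")                   # HDD only; the SSD copy is queued
print(w.read_source(2), len(w.pending))
w.drain()                               # SSD catches up once it is idle
print(w.ssd == w.hdd)
```

The sketch also shows the cost of the scheme: while writes are queued, reads of those blocks have to go to the HDD, and the mirror is not redundant for them until `drain()` has run.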
Regards, and thanks for reading,
Tom
