Converting RAID5 to RAID6 and other shape changing in md/raid

17 August 2009, 00:09 UTC

Back in early 2006 md/raid5 gained the ability to increase the number of devices in a RAID5, thus making more space available. As you can imagine, this is a slow process as every block of data (except possibly those in the first stripe) needs to be relocated, i.e. read from one place and written to another. md/raid5 allows this reshaping to happen while the array is live. It temporarily blocks access to a few stripes at a time while those stripes are rearranged. So instead of the whole array being unavailable for several hours, little bits are unavailable for a fraction of a second each.

Then in early 2007 we gained the same functionality for RAID6. This was no more complex than for RAID5; it just involved a little more code and testing.

Now, in mid 2009, we have most of the rest of the reshaping options that had been planned. These include changing the chunk size, changing the layout (i.e. where the parity blocks get stored) and reducing the number of devices.

Changing the layout provides valuable functionality as it is an important part of converting a RAID5 to a RAID6.

# How Level Changing Works
# Complexities of re-striping data
# Reducing the number of devices
# Currently supported reshapes
# Future Work
# Comments...

How Level Changing Works

If we think of "RAID5" as a little more generic than the standard definition, and allow it to be any layout which stripes data plus 1 parity block across a number of devices, then we can think of RAID4 as just a special case of RAID5. Then we can imagine a conversion from RAID0 to RAID5 as taking two steps. The first converts to RAID5 using the RAID4 layout with the parity disk as the last disk. This clearly doesn't require any data to be relocated so the change can be instant. It creates a degraded RAID5 in a RAID4 layout so it is not complete, but it is clearly a step in the right direction.

I'm sure you can see what comes next. After converting the RAID0 to a degraded RAID5 with an unusual layout we would use the new change-the-layout functionality to convert to a real RAID5.

A very similar process can now be used to convert a RAID5 to a RAID6. We first change the RAID5 to a RAID6 with a non-standard layout: the parity blocks are distributed as normal, but the Q blocks all live on the last device (a new device). So this is RAID6 using the RAID6 driver, but with a non-RAID6 layout. Then we "simply" change the layout and the job is done.
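
As a rough illustration (not a transcript: the array name, device name and backup path are placeholders), the whole conversion is driven by a single --grow command once the extra disk is present, something like

mdadm /dev/md0 --add /dev/sde
mdadm --grow /dev/md0 --level=6 --raid-devices=5 --backup-file=/root/md0-reshape.backup

Because this form keeps the usable capacity the same, mdadm will insist on a backup file for the layout change (see the discussion of backups below).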

A RAID6 can be converted to a RAID5 by the reverse process. First we change the layout to one that is almost RAID5 but with an extra Q disk. Then we convert to a real RAID5 by forgetting about the Q disk.
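
The reverse direction is again one command; a minimal sketch, with placeholder names, and with the backup file kept somewhere that is not on the array being reshaped:

mdadm --grow /dev/md0 --level=5 --backup-file=/root/md0-reshape.backup

Once it completes the array needs one device less, so a drive should end up unused (typically showing up as a spare).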

Complexities of re-striping data

In all of this the messiest part is ensuring that the data survives a crash or other system shutdown. With the first reshape, which just allowed increasing the number of devices, this was quite easy. For most of the time there is a gap on the devices between where data in the old layout is being read and where data in the new layout is being written. This gap allows us to have two copies of that data. If we disable writes to a small section while it is being reshaped, then after a crash we know that the old layout still has good data, and we simply re-layout the last few stripes from wherever we recorded that we were up to.

This doesn't work for the first few stripes as they require writing the new layout over the old layout. So after a crash the old layout is probably corrupted and the new layout may be incomplete. So mdadm takes care to make a backup of those first few stripes and when it assembles an array that was still in the early phase of a reshape it first restores from the backup.

For a reshape that does not change the number of devices, such as changing the chunk size or layout, every write overwrites the old layout of that same data, so after a crash there will be a range of blocks for which we cannot know whether they hold the old layout, the new layout, or a bit of both. So we always need a backup of the range of blocks that are currently being reshaped.
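
From the user's point of view this mostly shows up as the --backup-file option to mdadm --grow. A minimal sketch of an in-place chunk size change (the array name, chunk size and path are placeholders; the backup file must live somewhere other than the array being reshaped):

mdadm --grow /dev/md0 --chunk=256 --backup-file=/root/md0-reshape.backup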

Managing these backups is the most complex part of the new functionality in mdadm 3.1 (which is not released yet but can be found in the devel-3.1 branch of git://neil.brown.name/mdadm). mdadm monitors the reshape, setting an upper bound on how far it can progress at any time and making sure the area that it is allowed to rearrange has writes disabled and has been backed up.

This means that all the data is copied twice, once to the backup and once to the new layout on the array. This clearly means that such a reshape will go very slowly. But that is the price we have to pay for safety. It is like insurance. You might hate having to pay it, but you would hate it much more if you didn't and found that you needed it.
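
You can watch the progress, and nudge the speed if it seems to crawl; a small sketch (md0 is a placeholder, and the sysfs knob is the one a commenter below reports using):

cat /proc/mdstat
echo 10000 > /sys/block/md0/md/sync_speed_min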

One way to avoid the extra cost of doing the backup is to increase the number of devices at the same time. e.g. if you had a 4-drive RAID5 you could convert it to a 6-drive RAID6 with a command like

mdadm --grow /dev/md0 --level=6 --raid-disk=6
mdadm will then combine the increase in the number of devices with the change of layout, and will copy every block only once (except for the backup it needs to make of the first few stripes).

Reducing the number of devices

The other change worth discussing is the ability to reduce the number of devices. This can be useful to reverse an increase in number of devices that was not intentional, to shrink an array to gain back a spare device, or as part of converting an array from several smaller devices to fewer large devices (thus saving power for example).

While increasing the number of devices or reshaping while leaving the number of devices the same always proceeds from the start of the devices towards the end, reducing the number of devices proceeds from the end of the devices to the beginning. This means that the critical section when a backup is needed happens right at the end of the reshape process, so mdadm runs in the background watching the array and does the backup just when it is needed.

Naturally, decreasing the number of devices reduces the amount of available space in the array, and the very first write-out to the new arrangement will destroy data that was at the end of the original array: data that hopefully isn't wanted any more. However, people sometimes make mistakes, and as reducing the number of devices is immediately destructive of data, mdadm requires a little more care. In particular, before reducing the number of devices in a RAID5 or RAID6 you must first reduce the size of the array using the new --array-size= option to mdadm --grow. This truncates the array in a non-destructive way. You can check that the data you care about is still accessible, and then, when you are sure, use mdadm --grow --raid-disks= to reduce the number of devices.
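
Putting that together, a sketch of going from a 6-drive to a 5-drive RAID5 (the array name, new size and backup path are placeholders, and the filesystem must already have been shrunk to fit):

mdadm --grow /dev/md0 --array-size=<new smaller size>
mdadm --grow /dev/md0 --raid-devices=5 --backup-file=/root/md0-reshape.backup

with a check that the data is still intact between the two commands.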

If you have replaced all the devices with larger devices, you can avoid the need to reduce the size of the array by increasing the component size at the same time as reducing the number of devices. e.g. on a 4-disk RAID5,

mdadm --grow --size max --raid-disk 3
... or at least you should be able to. The current mdadm pre-releases don't get that right, but hopefully it will be fixed before mdadm-3.1 is actually released.

Currently supported reshapes

A bit of a summary of what is actually supported seems in order here.

RAID1

A RAID1 can change the number of devices or change the size of individual devices. A 2 drive RAID1 can be converted to a 2 drive RAID5.
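
The RAID1 to RAID5 conversion is a single command, and extra devices can then be added with a further --grow; a minimal sketch with placeholder names (the same commands appear in the comments below):

mdadm /dev/md0 --add /dev/sdc
mdadm --grow /dev/md0 --level=5 --raid-devices=3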

RAID4

A RAID4 can change the number of devices or the size of individual devices. It cannot be converted to RAID5 yet (though that should be trivial to implement).

RAID5

A RAID5 can change the number of devices, the size of the individual devices, the chunk size and the layout. A 2 drive RAID5 can be converted to RAID1, and a 3 or more drive RAID5 can be converted to RAID6.

RAID6

A RAID6 can change the number of devices, the size of the individual devices, the chunk size and the layout. A RAID6 can be converted to RAID5 by first changing the layout to be similar to RAID5, then changing the level.

LINEAR

A LINEAR array can have a device added to it, which will simply increase its size.

RAID0 and RAID10

These arrays cannot be reshaped at all at present.

Future Work

While the work described here greatly increases the options for reshaping an array, there is still more to do (apart from just fixing the bugs in the current code).

While conversion from RAID0 to RAID5 was used as an example above, it isn't actually implemented yet. md/RAID0 is broader than the regular definition of RAID0 in that the devices can be different sizes. If they are, then the array cannot be converted to RAID5 at all, so a general conversion is not possible.

It would still be good to support a simple conversion when the RAID0 does use the same amount of space from each device. It would also be nice to teach RAID5 to handle arrays with devices of different sizes. There are some complications there as you could have a hot spare that can replace some devices but not all. There would also be the need to store the active sizes of all devices in the metadata, something RAID0 doesn't need as it doesn't have to cope with missing devices.

If we did have support for variable drive sizes in a RAID5, then we could implement extending a RAID0 by converting it to a degraded RAID5, performing the conversion there using common code, then converting back to RAID0.

It would also be nice to add some reshape options to RAID10. Currently you cannot change a RAID10 at all. The first step here would be enumerating exactly what conversions make sense with the various possible layouts (near/far/offset) and then finding how to implement them most easily. But that is a job for another day.




Comments...

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (19 August 2009, 02:52 UTC)

Wow. Impressive work. Thank you for all of your hard and careful work on this.

Minor nit, you have a typo of sunheading rather than subheading in the Raid5 supported operations section.

-ben

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (19 August 2009, 03:31 UTC)
I've fixed the sunheading - thanks.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (19 August 2009, 15:02 UTC)

I have a question about this new feature. mdadm --grow --size max --raid-disk 3

Here is what I am doing. I currently have a 6 Drive Raid 5 consisting of 5x1TB, and a 1TB LVM (2x320GB, and 1x500GB (OS included in extra space)). I just recently bought 3 more 1TB drives and was hoping to build a new Raid 5 using 256k chunks instead of 64k chunks. I wanted to gradually remove and shrink the original array freeing up drives and then move the drives over to the new array.

I am getting the error "cannot reduce number of data disks yet", and I am using version 2.6.7.1 in Ubuntu 9.04

Is this possible at all in any version of mdadm, or am I going to have to get creative?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (19 August 2009, 20:56 UTC)

Yes it is possible with the new code. However it isn't quite released yet (Though you can pull it from my git tree). You will need mdadm-3.1 and linux-2.6.31.

If you just want to change the chunk size, the new mdadm will do that for you
mdadm --grow /dev/mdX --chunk-size=128

If you want to reduce the number of devices in an array you can. You will have to shrink the filesystem first, then shrink the array with
mdadm --grow --array-size

and then change the number of devices.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (20 August 2009, 04:09 UTC)

Thanks, I managed to find a way to backup all my data on my old array so I can kill it and just add the drives to the new array.

Hopefully next time I need to shrink it will be released and stable, awesome to see the quick feedback.

I trust mdadm software raid way more than any hardware raid. It allows me to build a relatively cheap system and gives me all the protection I need. And it's worked awesome switching between 4 different motherboards now, always detects everything no problem, awesome.

And it's nice to be able to do everything live.

Keep it up, Thanks.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (21 August 2009, 07:35 UTC)
Neil,

Following your instructions, I upgraded mdadm, but this command does not seem to work:

mdadm --grow /dev/md0 --level=6 --raid-disk=6
mdadm: /dev/md0: Cannot reshape array without increasing size (yet).

I wanted to convert a 5-disk RAID5 to 6-disk RAID6.

If I use the command below, there is another error:

mdadm --grow /dev/md0 --level=6 --raid-disk=6 --size=max
mdadm: can change at most one of size, raiddisks, bitmap, and layout.

Could you give some instructions? Thanks.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (22 September 2009, 12:08 UTC)

Hi,

I'm considering reshaping an almost full (99%) 7-disk RAID5 array to a 9-disk RAID6 array, but the fact that mdadm-3.1 (which I understand is the only one capable of doing that) hasn't been released (even in -rc) yet has me a little worried...

Do you have any ETA for a release? Do you consider the code stable enough already? I know I should backup before the reshape, but it's not technically (and financially ;-P) feasible; that's why I use el-cheapo software RAID on el-cheapo consumer grade hardware... :)

Thanks for your hard work

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (23 September 2009, 05:37 UTC)

No, I don't really have an ETA for a release. Maybe next week, almost certainly before the end of October. The only thing that I need to do before a release is to write a bunch of test cases so I can increase my confidence in the code.

I think you should be safe running the code from the 'master' branch of my git tree. However if anything does go wrong, I suggest that you don't try to fix it, but rather stop touching the array and email all the details to linux-raid@vger.kernel.org (feel free to Cc me explicitly). I will work with you to find out exactly what happened and to fix it. But I really don't expect that to be needed.

Whether it works or not, I would really appreciate an email to linux-raid@vger.kernel.org detailing your experience.

Thanks.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (06 October 2009, 15:52 UTC)

I was wondering how to reduce the number of devices on a RAID-5 without having to buy extra devices for the move... and you just implemented it!

On Linux 2.6.31.2 I performed a test with a RAID-5 of 4 LVM volumes, hosting an ext3 filesystem. After shrinking the filesystem, I reduced the array size with mdadm from the devel-3.1 branch, then reduced the number of devices to 3 and grew the filesystem. It works perfectly.

Thank you and congratulations for all your work.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (18 October 2009, 09:11 UTC)

Dear Neil,

I love this mdadm-stuff, as it adds a huge benefit for linux (and of course for myself). You are talking about future capabilities which I would love to see: "If we did have support for variable drive sizes in a RAID5"

My server actually has a raid6 of 6 active 500GB-drives and 1 spare. I would like to add another 1.5TB-drive, but I do not like to split this new drive into 3 * 500GB, as then failing this single 1.5TB drive would render my raid unusable. I have seen ideas of striping e.g. the existing 500GB-disks to a 1-TB-disk, so I could assemble my raid6 with a new 1TB native and e.g. 3 * striped 500GB (=3 * 1-TB-disk). However this would result in a degraded raid6 as well, unless I add 3 * 1TB to this raid.

So having the capability of mixing different sizes in a raid6 would be genius. But I guess there would be limitations, such as the size of the spare, and I am not sure what would happen if the largest drive were to fail.

However, even with todays flexibility, this mdadm is an outstanding useful tool! congrats!

christian

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (18 October 2009, 12:19 UTC)

christian: You can stack the MD levels... So make the 6 500G disks into two 3-disk RAID-0s. Then you have 3 1.5T sized devices (the two RAID-0s and the 1.5T partition) and can build your final RAID-5 on top of that.

Ugly but doable.

-ben

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (18 October 2009, 17:57 UTC)
Dear Ben,

that's what I meant with my (in hindsight suboptimal) statement: 3 * striped 500GB. So many thanks, as you suggested what I already thought. I feel safer now going down this road of stacking md-levels. Thanks a lot for your useful help

christian

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (21 October 2009, 03:02 UTC)
So I guess I'm the newbie here, because I can't seem to get mdadm v3.1 installed...


git clone git://neil.brown.name/mdadm

git branch --track devel-3.1 origin/devel-3.1

git checkout -f devel-3.1

make

sudo make install


All seems to work just fine, until mdadm --version:

mdadm - v3.0.2 - 25th September 2009


Even doing ./mdadm --version within the devel-3.1 branch shows 3.0.2


Am I just confused, or did I do something wrong?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (24 December 2009, 13:12 UTC)

Firstly -I am impressed with the new RAID6 and reshape features added recently - respect!


I am currently reshaping a RAID6 array from 6x1.5TB discs to 8x1.5TB discs. Simply put, I have added 2 identical discs to an existing (almost empty) 6-disc RAID6 array. The operation started 4 days ago, reporting a rate of 7000kb/s, and was consuming 5-6% CPU (load average 1.4-1.6 from memory). This was slower than I expected. (No raid controller, using 5x onboard SATA ports and 2x 2-port SIL SATA PCI-Express cards, Quad Core AMD 3GHz, 4GB RAM etc)

After 1-2 days I noted the rate of progress dropped very significantly, and /proc/mdstat now reports anything from 5000kb/s to 200kb/s, mostly around 500kb/s. Everything was running OK but just miserably slow. After 4 days it's now 71.2% complete.

I have just shut down and restarted the machine, and on reboot it has restarted the reshape and is currently running at a "proper" speed again of between 8000 and 10000 kb/s (8-10MB/s). However, it may be that after a few hours the rate drops back to 500kb/s again?

Is this normal? What have I misconfigured?


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 January 2010, 18:04 UTC)

Loving the software, have just kicked off a raid 5 to 6 conversion from 6 to 7 drives. It is running at 2384K/sec on average, dunno if that is normal? Used my mirrored raid 1 drive as the backup.

Good job!

(PS it'd be really, really handy if your posts/messages had a date on them?)

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 January 2010, 21:44 UTC)

Yes, when you do an in-place raid conversion (array size doesn't change) it will always be very slow as all data has to be written twice and there isn't much opportunity for streaming.

And thanks for the suggestion, there should be dates everywhere now.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (05 January 2010, 20:25 UTC)
Neil,

I was wondering if I could pick your brain and get opinions on my upgrade plan. I am currently running a 4x500GB RAID5 array that is about 80% full now. I've decided to upgrade to 5x2TB drives in a RAID6 configuration for extra redundancy. One thing that is a bit of a problem is that my machine can only support 5 drives, so I can't just attach the new drives and copy the data. I've come up with two options to do this upgrade:

1. Add one 2TB drive as a stand-alone, copy all data to the 2TB drive, replace the 4 old drives, create 4x2TB RAID6 array, copy data over from stand-alone drive, and finally expand the array to include the fifth drive.

2. Add 2TB drive, make current RAID5 array a RAID6 array including that drive, then swap out the 4 500GB drives one at a time as if they had each died.

I'm leaning towards option 1 as I think that would be quicker, the longest part being the expansion of the array from 4 drives to 5 at the end, but the copying to and from the 2TB drive will probably just take several hours each way. It also seems fairly safe because I'll have the 4 500GB drives unaltered if anything should go wrong.

Thanks for any input, and thanks for all your hard work on this project.

Keith

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (06 January 2010, 19:34 UTC)

Option 1 is definitely better if you are happy with the extended down-time that it requires. It will very likely be faster as the RAID5 -> RAID6 conversion is very slow.

As you say, you will have a complete, valid copy of the data on the smaller drives which is good insurance.

So go with option 1.

NeilBrown

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (07 January 2010, 17:33 UTC)
Neil, thanks for the input. I'll go that route and I don't think the downtime should be too bad as the array is primarily media storage. I won't be able to write to it for a while, but I'll probably be able to stream from it without much trouble.

Keith

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (09 January 2010, 06:43 UTC)

Neil, I was wondering if I could get your advice. I'm currently running a 4x1TB Raid5 and I would like to change that to a 6x1TB Raid6. I guess I'm confused about the process. So essentially I can just use mdadm --add /dev/mdX /dev/sdX to add the 2 drives, making it 6x1TB. Then mdadm --grow /dev/md0 --level=6 --raid-disk=6 will correctly reshape my raid5 to a raid6, and then I'd just have to xfs_growfs /dev/mdX afterwards? Just wondering if my thought process is correct here. Thanks in advance, this is some great progress.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (09 January 2010, 18:11 UTC)

Hi all, I'm converting a 7 disk RAID-5 array to an 8 disk RAID-6 array and wanted to share. Unlike the previous poster, I am not increasing the size of the RAID device, I am simply increasing redundancy.

First of all I made sure I was running mdadm 3.1 or above, which my distro (gentoo) did not have by default. So I installed mdadm 3.1.1, which I believe is current as of the time of this writing.

To add the 8th disk to my existing RAID-5 array, I ran:

mdadm --add /dev/md3 /dev/sdh3

This added /dev/sdh3 as a hot spare. Then, to convert this to a RAID-6 array, I ran:

mdadm --grow /dev/md3 --level=6 --raid-devices=8 --backup-file=/nfs/media/tmp/md3.backup

Notice the argument is "--raid-devices", not "--raid-disk" as in Neil's post.

I had tried to run the --grow command without the --backup-file argument, as Neil's post seems to say that a backup file is not necessary when a hot spare is present. But mdadm wasn't having it; it told me:

mdadm: level of /dev/md3 changed to raid6
mdadm: /dev/md3: Cannot grow - need backup-file
mdadm: aborting level change

With the --backup-file argument everything seems to be working fine. Here's the relevant part of my /proc/mdstat:

md3 : active raid6 sdh3[7] sdg3[6] sdf3[5] sde3[4] sda3[0] sdb3[2] sdc3[3] sdd3[1]
      120052224 blocks super 0.91 level 6, 256k chunk, algorithm 18 [8/7] [UUUUUUU_]
      [=>...................]  reshape =  6.3% (1269760/20008704) finish=151.2min speed=2064K/sec

My next step is to convert my 4 terabyte /dev/md5 to a RAID-6 array. Neil wasn't kidding when he said the reshape is a slow process... at the rate that /dev/md3 is converting, I estimate that it will take 4.5 days to convert my /dev/md5 to RAID-6.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 January 2010, 12:49 UTC)


sudo mdadm --grow /dev/md1 --chunk=128 --backup-file=/mnt/tmp/backup

locks up the computer and forces a resync.

mdadm 3.1.1 kernel 2.6.32.10

/dev/md1 is a 3 disk raid5 64k chunk size. /mnt/tmp is a spare hard drive.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (24 January 2010, 22:47 UTC)
#####

mdX : active raid5 sdd1[8](S) sdb1[7](S) sdf8[0] sdl8[4] sdk2[5] sdc1[6] sdj6[3] sdi8[1]
      Y blocks super 1.1 level 5, 128k chunk, algorithm 2 [6/6] [UUUUUU]

# mdadm --grow /dev/mdX --level=6 --raid-devices=8 --backup-file=/root/mdX.backupfile

mdX : active raid6 sdd1[8] sdb1[7] sdf8[0] sdl8[4] sdk2[5] sdc1[6] sdj6[3] sdi8[1]
      Y blocks super 1.1 level 6, 128k chunk, algorithm 18 [8/9] [UUUUUU_U]
      [>....................]  reshape =  0.0% (33920/484971520) finish=952.6min speed=8480K/sec

Nooo mdadm 3.1.1, I wanted an 8 device raid-6. Why do you show 9?

-- update

What is it showing me now???

      blocks super 1.1 level 6, 128k chunk, algorithm 2 [8/10] [UUUUUUUU]

mdadm --detail /dev/mdX
/dev/mdX:
        Version : 1.01
     Raid Level : raid6
     Array Size : 6z ()
  Used Dev Size : z ()
   Raid Devices : 8
  Total Devices : 8
    Persistence : Superblock is persistent

    Update Time : Sun Jan 24 14:40:32 2010
          State : clean
 Active Devices : 8
Working Devices : 8
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

    Number   Major   Minor   RaidDevice State
       0       8       .         0      active sync   /dev/sdf
       1       8       .         1      active sync   /dev/sdi
       3       8       .         2      active sync   /dev/sdj
       6       8       .         3      active sync   /dev/sdc
       5       8       .         4      active sync   /dev/sdk
       4       8       .         5      active sync   /dev/sdl
       8       8       .         6      active sync   /dev/sdd
       7       8       .         7      active sync   /dev/sdb

... Did it actually do what I want but just show me the wrong result, with kernel 2.6.32-gentoo-r2?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 February 2010, 11:16 UTC)

I have just started a conversion of a 6 disk 2TB RAID5 to a 8 disk 2TB RAID6 on my system.

The performance during conversion was terrible - 1MB/second. This got me suspicious, as I know the array could perform much better than that.

Issuing "echo 200000 > /sys/block/<mddevice>/md/sync_speed_min" fixed it, as it now reshapes at 20MB/second.

This of course blocks everything else, but issuing an "echo 10000" gives a fair tradeoff, as the "when idle" system obviously seems to have a problem.

So try this, instead of rebooting.

[permalink][hide]

Comment (22 February 2010, 01:03 UTC)
Hey there,

first of all I want to thank Neil and all other developers of mdadm for their great work. I have been using mdadm for several years now and my arrays have survived two disk-faults flawlessly - from the notification-email to rebuilding the array everything went as it should.

I have a question regarding my raid6-array, consisting of 7 1TB-disks providing a single lvm-physical volume. I plan to expand that array with a 1.5TB-disk and want to make sure that I got everything written here and on the mailing list right:

- If I grow that array after adding the new 1.5TB-disk to it, only 1TB of the new disk will be used and the array-size will increase by 1TB. (Do I have to issue some special commands when growing or will mdadm automatically just use 1TB of the 1.5TB-disk?)

- If I resize the lvm-physical volume the array provides, the 1TB will become usable by lvm.

Is that correct?

Then I got another question: To grow a raid6-array, I need mdadm past version 3.x. Currently, the array is on a debian-server running a year or so without reboot, therefore it runs on mdadm-2.6.x. I recently updated the mdadm-package to the current version 3.1.1 using aptitude. Is a reboot necessary for the grow-feature to become available or does it suffice to just stop and start the array (or is neither of these actions necessary)?

Thank you all for your help!

rman

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (22 February 2010, 00:03 UTC)

- Yes, only 1TB of the new device will be used. md doesn't need to be told anything special for this to happen.

- similarly, lvm will only be able to see 1TB of extra data.

Providing your kernel is v2.6.21 or more recent you should not need to reboot. If your kernel is older than that, you will need a new kernel.

[permalink][hide]

Comment (22 February 2010, 16:34 UTC)
Thank you very much for your answers, everything as I expected. I decided to add another 1TB disk instead of a 1.5TB disk as I hadn't considered that the 1.5TB disk would be much slower than all the other preexisting 1TB disks. So here is my situation again: I want to add another 1TB disk to an existing raid6 array consisting of seven 1TB disks. Unfortunately mdadm gives me some strange errors. At first I added the new disk (sdh) to the existing array (md11), which went fine. So now I have a 7-disk array with a single spare:

md11 : active raid6 sdh[9](S) sdc[0] sdi[8] sdf[7] sde[4] sdk[3] sdj[2] sdd[1]
      4883811840 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/7] [UUUUUUU]

If I now issue 'mdadm /dev/md11 --grow --raid-devices=8 --backup-file=/root/backup_md11.bak' I get

mdadm: this change will reduce the size of the array.
       use --grow --array-size first to truncate array.
       e.g. mdadm --grow /dev/md11 --array-size 1565606912

Why would adding a disk REDUCE the size? Am I missing something here? If I add the '--size=max' switch to the command line, i.e. 'mdadm --grow /dev/md11 --raid-devices=8 --level=6 --backup-file=/root/backup_md11.bak --size=max', I get

mdadm: cannot change component size at the same time as other changes. Change size first, then check data is intact before making other changes.

I really don't understand this as the new disk is exactly the same model as the other ones. 'hdparm -I' on the new disk gives me a device size which is identical to the other disks. Does anyone have a clue what is going on? Does it have something to do with the non-standard chunk size?

Thanks for helping,

rman

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (22 February 2010, 20:56 UTC)

You have found a bug! It has been fixed for a while, but I haven't released an updated version yet. The two patches to fix it are:

http://neil.brown.name/git?p=mdadm;a=commitdiff;h=2ed4f75388f99968be58097941a9704f6e42d701

and

http://neil.brown.name/git?p=mdadm;a=commitdiff;h=f98841b3852ceb7fce56a6f818236a4af9b5a00a

If you have 'git' installed, you might do best to:

git clone git://neil.brown.name/mdadm mdadm
cd mdadm
make install

and try that.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (05 March 2010, 21:41 UTC)

Has changing the chunk size been fixed? I have a 3 disk raid5 that has 64k chunks and I want to go to 256k chunks.

Trying to do this crashed the array and forced a resync.

[permalink][hide]

Re: Changing chunk size (05 March 2010, 22:41 UTC)

As far as I know, changing chunksize works OK.

There might be some issues when going from a smaller to a larger chunksize if the size of the components is not a multiple of the larger chunk size - mdadm doesn't seem to check that properly at the moment, but the failure mode is to do nothing, not to crash the array.

If you have specific details - kernel log error messages, "mdadm --detail" details of the array, version of kernel and mdadm, and anything else that might help me reproduce your problem, please post them to linux-raid@vger.kernel.org (you don't have to subscribe to post).

I just successfully reshaped a small scratch 3-drive-raid5 from 64K chunks to 256K chunks using mdadm-3.1.1 and linux-2.6.32.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 March 2010, 14:15 UTC)

Neil,

I'm trying to do a procedure you describe in the examples, converting a 4 disk raid5 to a 6 disk raid6 volume.

I have added the 2 new disks to the raid5 array (sd[ef]4):

md0 : active raid5 sdf4[4](S) sde4[5](S) sdd1[1] sdb1[3] sdc1[2] sda1[0]
      937705728 blocks level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 1/150 pages [4KB], 1024KB chunk

However, when I run the grow command it aborts:

# mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/root/raid-backup
mdadm: level of /dev/md0 changed to raid6
mdadm: Need to backup 1536K of critical section..
mdadm: Cannot set device shape for /dev/md0
mdadm: aborting level change

I'm using the following versions of mdadm and the Gentoo kernel:

mdadm - v3.1.1 - 19th November 2009
2.6.31-hardened-r11

Dave

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 March 2010, 20:38 UTC)

Hi Dave,

The reason it doesn't work for you is that you have a write-intent bitmap. It is not currently possible to reshape an array with a bitmap as resizing the bitmap is not yet implemented. You need to remove the bitmap (mdadm -G /dev/md0 --bitmap none), then reshape and wait for it to complete, then re-add the bitmap (mdadm -G /dev/md0 --bitmap internal).

I think mdadm 3.1.2 (only just released) gives a more useful error message in this circumstance.
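
In command form that sequence is roughly the following (a sketch only; /dev/md0, the device count and the backup path stand in for Dave's real values):

mdadm --grow /dev/md0 --bitmap none
mdadm --grow /dev/md0 --level=6 --raid-devices=6 --backup-file=/root/md0-reshape.backup
mdadm --grow /dev/md0 --bitmap internal

waiting for the reshape to finish (watch /proc/mdstat) before adding the bitmap back.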


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (14 March 2010, 01:54 UTC)

You rule Neil! Thanks for the quick response. I'm almost half done the reshape. Looks like it's going to end up taking about 10 hours total. Seriously, mdadm is magic. I'm glad I didn't waste money on a hardware controller. I feel a lot safer using linux raid.

Thanks again!

[permalink][hide]

Query about integrating non-hotswap 'spare' drives into a hot-swap array (17 March 2010, 22:46 UTC)

I'm planning out a 5-20 drive RAID array (to be grown as needed by adding more drives) and have been considering ways to integrate a 'spare' drive in one or more of the non-hotswap bays available in the chassis. Would it be reasonable to do the following?

  • Assemble each 'hot swap' bay as a 'device + missing' RAID-1.
  • Assemble the RAID-5/RAID-6 from the RAID-1 slots, leaving the 'cold swap' drive out.
  • On failure, add the 'cold swap' drive into the 'missing' slot of the affected RAID-1 with a script.

Theory behind this is that it allows the rebuild to begin immediately, without requiring to wait for the rebuild to finish before the failed drive can be hot-swapped. Once the hot-swap drive is replaced and the rebuild finished + mirrored back in, the 'cold swap' drive can be pulled back out of the appropriate RAID-1.

Or is there a more efficient way to implement this idea of allowing for cold-swap versus hot-swap status for drives in a RAID array? Or would the RAID-1 coming back up in this manner cause problems with the RAID-5/RAID-6 wrapped around it?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (28 March 2010, 00:31 UTC)

Hi,

I use mdadm 3.1.1 with the ubuntu 10.04 2.6.32-17 kernel. I want to shrink a 6 drive raid 5 to a 5 drive raid 5

via: $ sudo mdadm --grow /dev/md0 --raid-devices=5 --backup-file /mdadmbackup

cmdline shows me following output:

mdadm: Need to backup 1280K of critical section.. $

and nothing happens; even cat /proc/mdstat shows me just the normal array and nothing more, no drive activity.

What is happening?

Will there be an algorithm which creates a RAID system without the write hole, like ZFS? If yes, when?

Thanks TT

[permalink][hide]

Re: Thanks for all the amazing work (23 April 2010, 07:17 UTC)

Hi

As a contributor to several open source projects myself, I know how rarely you hear "thanks". So:

Thanks very, very much for all you've done on md/mdadm. It's a brilliantly reliable, solid and easy to use system that I strongly prefer to fakeraid/hostraid setups and almost always favor over true hardware RAID as well. It's a sysadmin's dream.

About the only time I ever use anything else is when I'm not using a Linux server or I need write-back caching, in which case hardware RAID is currently a necessity to get persistent write cache.

I was recently astonished at the ease with which mdadm reshaped and grew an existing RAID 6 array from 8 to 10 disks. One command and it's merrily chugging away without the rest of the system even noticing. Thanks to the status-watching cron job it even emailed me when it was done two days later!

I think people take the md subsystem for granted, and that's a shame as it's one of the best things around on Linux systems.

(soapyfrogs.blogspot.com has contact info).

[permalink][hide]

Comment (05 May 2010, 04:09 UTC)
We use mdadm every day, across hundreds of servers, on mission critical work...some of it life-or-death stuff...meaning, if it dies we don't live.

We strongly prefer mdadm to every other RAID system, software and hardware, and have chosen it hands-down, every time for years.

mdadm is of aspirational quality. Thank you for creating this, touching millions if not billions of lives, truly improving humanity, and making something deserving of utmost pride and admiration.

Best regards from outside of Washington, DC, 5 May 2010 Michael Brenden

[permalink][hide]

RAID10 reshaping (25 September 2010, 10:12 UTC)

Hi, here are some reshaping options involving RAID10 that I would really like to have, and my understanding of how they would work:

- 2-drive RAID1 to 2-drive near-RAID10 and back; should be trivial

- 2-drive RAID0 to 4-drive near-RAID10 degraded to 2 drives, and back; should be trivial too

- changing the number of devices, especially 2->4; that's more difficult, but it's already done for RAID5 and 6 and I think it's not very different

- changing the RAID10 layout, probably not harder than the above

- changing the size of devices for near layout; relatively easy I guess


Thanks for the great work!

aditsu

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 May 2010, 06:02 UTC)

Just wanted to say thanks for making such a great product! I just converted a 3-drive RAID-5 array to a 4-drive RAID-6 array, with exactly one simple command. The array is the root of my Linux system, and it didn't even hiccup as the conversion happened.

Really nice work.

Thanks, Mason

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (18 May 2010, 02:01 UTC)

I'm in a position where I have a 6 drive (1 failed, 5 active and working with the array in a degraded state) RAID6 array that I'd like to shrink/grow to a 5 drive RAID5 array.

The logic is that I've had a bad run with 1.5TB sized drives, so I'm more interested in just making do with the 5 drives / 6TB worth of usable storage I have now, and limping along until I can put together a new array from 2TB drives to take its place. It also doesn't help that my current array is reiserfs, and I'd really feel a lot more comfortable with ext or xfs.

So I admit I'm having problems with the directions so far, insofar as shrinking/growing the array from 6x RAID6 to 5x RAID5.

Suggestions or assistance?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (18 May 2010, 02:12 UTC)

It is easier to answer these sort of questions when posted to linux-raid@vger.kernel.org.

However you just need mdadm-3.1.1 or newer, linux kernel 2.6.32 or newer and then

mdadm --grow /dev/mdXX --level raid5 --backup-file /some/file/not/on/the/array

It will take a while rearranging all the data, but when it is done you should have a RAID5, and while it is going you should still have access to your data.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (31 May 2010, 16:55 UTC)

Fantastic utility. I recently had to re-shape (twice), re-size and even re-layout an existing and poorly thought out 5 x 1TB disk RAID4 array. As I couldn't use my running kernel or installed mdadm (too low a version for both), I booted to System Rescue CD 1.5.4, which had the requisite kernel and mdadm versions (http://www.sysresccd.org/), and was able to do everything. It took 48 hours for each operation (3 separate operations), but in the end I had a left-symmetric, 6 disk RAID6 array. The production OS was able to assemble the device and mount the volume without incident!!

Thank you for your hard work and attention to detail.

Christian

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (10 June 2010, 16:08 UTC)

Excellent utility. I am using it now for my home network share. Here is the problem I am having (or maybe not having). Last night, I added 4 drives to my 4 drive RAID5 array. At the same time, I changed to RAID6. The command I used to do this was (if I remember right):

mdadm --grow /dev/md0 --level=6 --raid-devices=8 --backup-file=/home/tbmorris/RAID.bak

When I did it, I do not remember seeing any errors. I then ran:

mdadm --detail /dev/md0

What I saw did not make me happy. Drives /dev/sd[bcdeg] were active sync, while /dev/sdf said spare rebuilding and /dev/sd[hi] said faulty spare rebuilding. Now while I was cursing that I had just dumped 1.2TB of movies and files, my wife was still sitting on the couch watching those very same movies that I thought I had destroyed. This morning when I checked up on the array, it said the same. My question is.... Is this normal? Is this merely what it looks like when you add 4 drives and change to RAID6 at the same time?

Thanks,

Terry

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (11 June 2010, 04:25 UTC)

Hi Terry,

It is hard to be sure without seeing concrete details (e.g. mdadm --detail output, or cat /proc/mdstat), but I would guess that two of the new drives suffered some sort of error while new data was being written to them. So your RAID6 is now doubly-degraded, i.e. the data is safe, but another disk failure will cause loss of data.

You should wait until the reshape completes (if it hasn't already), then add two known-good drives. If you think the two failed drives are actually good, you should test them well before adding them back to the array. Maybe look for error messages in the kernel logs to see what sort of error occurred on them.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (11 June 2010, 06:33 UTC)

Here are the stats:

/dev/md0:
        Version : 0.91
  Creation Time : Mon May 30 03:29:47 2005
     Raid Level : raid6
     Array Size : 1462862592 (1395.09 GiB 1497.97 GB)
  Used Dev Size : 487620864 (465.03 GiB 499.32 GB)
   Raid Devices : 8
  Total Devices : 8
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Jun 10 19:15:18 2010
          State : clean, FAILED
 Active Devices : 5
Working Devices : 6
 Failed Devices : 2
  Spare Devices : 1

         Layout : left-symmetric-6
     Chunk Size : 64K

  Delta Devices : 3, (5->8)
     New Layout : left-symmetric

           UUID : e2ebff64:c0d9f90e:4bd6122a:78a10948
         Events : 0.34221

    Number   Major   Minor   RaidDevice State
       0       8       83        0      active sync   /dev/sdf3
       1       8       99        1      active sync   /dev/sdg3
       2       8      115        2      active sync   /dev/sdh3
       3       8      131        3      active sync   /dev/sdi3
       4       8       64        4      spare rebuilding   /dev/sde
       5       8       48        5      active sync   /dev/sdd
       8       8       32        6      faulty spare rebuilding   /dev/sdc
       9       8       16        7      faulty spare rebuilding   /dev/sdb

and /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sde[4] sdd[5] sdc[8](F) sdb[9](F) sdg3[1] sdi3[3] sdf3[0] sdh3[2]
      1462862592 blocks super 0.91 level 6, 64k chunk, algorithm 18 [8/5] [UUUU_U__]

To me it means nothing. Seems like it says it's good, but it's bad. Kind of like a pint of ice cream.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (11 June 2010, 06:43 UTC)

Hmmm... that looks rather messed up - something is definitely wrong. The fact that the reshape has stopped in the middle looks bad, but maybe not too bad.

I need to see the output of "mdadm -E" on all 8 devices. Rather than try to put it in a comment (Which will mess up the formatting) can you post it to me at neilb at suse.de.

Thanks.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (26 July 2010, 02:24 UTC)
I would like to reshape a 6x1T raid5 array to a 4x2T raid5 array. From the instructions on this page, I tried:

# mdadm /dev/md2 --grow --size=max --raid-devices=4
mdadm: cannot change component size at the same time as other changes.
       Change size first, then check data is intact before making other changes.

Linux 2.6.34 x86_64
mdadm - v3.1.2 - 10th March 2010

Is it possible to do what I'm trying to do? If so, what do I need to do?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (26 July 2010, 03:00 UTC)

Yes it is possible, but you have to go through a slightly round-about route. I have made a note to look at fixing it for a future release.

You need to first

mdadm --grow /dev/md2 --size=max

and then try

mdadm --grow /dev/md2 --raid-devices=4

That won't work but will tell you to use "--grow --array-size=" to reduce the array size. It will tell you what size is needed. So

mdadm --grow /dev/md0 --array-size=whatever
mdadm --grow /dev/md2 --raid-devices=4

That should do what you want.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (26 July 2010, 03:10 UTC)
Thank you for the lightning fast response!

Unfortunately, that sequence did not work:

# mdadm --grow /dev/md2 --size=max
mdadm: component size of /dev/md2 has been set to 976759808K

IMO, going from 1T to 2T should result in a component size double that value. There are 4x 2T devices in the array now and 2x 1T devices. Isn't max the max common to all devices, and thus 1T?

The 2T devices are partitioned as follows:

# fdisk -l /dev/sdb

Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x2051f33f

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      243201  1953512001   fd  Linux raid autodetect

# mdadm --grow /dev/md2 --raid-devices=4
mdadm: this change will reduce the size of the array.
       use --grow --array-size first to truncate array.
       e.g. mdadm --grow /dev/md2 --array-size 2930279424

This is telling me to shrink the array to 3T which I cannot do since the array is nearly full (5T of data).

Is it necessary to fail the 1T devices first? If so, I can only reduce by 1 then. Yes?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (26 July 2010, 03:40 UTC)

976759808K == 976759808 * 1024 == 1000202043392 == 1TB

So it did what you expected. But that is a small matter.

I see your real problem though. If you have 4 2TB devices and 2 1TB devices, md will only acknowledge 1TB of each device, and you cannot grow to use more space on fewer devices.

If you had 1 more 2TB device that you could swap for a 1TB device, then you could degrade the array so that it all lives on 2TB devices, and then you could reshape. But I wouldn't recommend that - you should never discard your redundancy by choice - the risk is too high.

If you still have the 4 1TB devices that you replaced, then you can use RAID0 to combine them into 2TB devices, and so make the whole array consist of 2TB devices (4 real, 2 RAID0). Then you should be able to reshape nicely.
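
For what it is worth, building one of those 2TB devices out of two of the old 1TB drives is just an ordinary array creation; a sketch with placeholder device names (the resulting md device then stands in for a single 2TB drive):

mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/sdX /dev/sdY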

It would be good to make md/mdadm handle this better - it isn't entirely straight-forward though. I'll have to give it some thought.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (31 July 2010, 08:15 UTC)

I'm using ubuntu and don't have the latest mdadm. Ubuntu uses v2.6.7.1

I want to reshape a RAID5 (3x 1TiB) and change the chunk size. I have read that increasing chunk size can greatly improve performance on a large filesystem with large files. Most (90%+) of my files are over 1GiB. Would there be any perceivable improvement?

Assuming there would be an improvement, what are the risks involved with installing a newer version of mdadm while it's running? Could I make a livecd with the latest version, reshape from that, then continue using 2.6.7.1 once it's reshaped??

I plan to add another drive in the near future, so I'll be reshaping then and may change chunk size at that point and kill 2 birds... Again I need to fully understand the implications of upgrading mdadm vs putting it on a live cd.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 August 2010, 00:15 UTC)

Performance is very dependent on hardware and load characteristics. A larger chunk size does seem to improve performance for sequential reads, but I cannot promise it will in your circumstances. If you are going to add a device, then changing the chunksize at the same time is little extra cost so you may as well.

You don't need to install a new mdadm to make use of it. Just compile it somewhere and run ./mdadm --args to run it. You can leave the installed one unchanged. If you have to shut down, or crash, while the reshape is happening, then the installed mdadm won't be able to restart the array, so if the array holds '/', then you hit problems. But if the array is separate, and the mdadm you compiled isn't on the array, then it is easy to restart the array with the new mdadm.
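
For example (a sketch; these mirror the build steps another commenter describes above, minus the install):

git clone git://neil.brown.name/mdadm
cd mdadm
make
./mdadm --version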

You do however need a new kernel - 2.6.32 at least. If your install has an old mdadm it is unlikely to have a new kernel. So the need for a new kernel might be enough to justify a livecd approach.

- NeilBrown

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 August 2010, 13:51 UTC)
Hi Neil, thanks for this wonderful piece of software! Thanks to you I have survived many a hard drive death :)

I've searched https://raid.wiki.kernel.org/ but couldn't find an answer to this problem.

I have a question about converting RAID1 to RAID5. At the moment I have 2*500GB as RAID1 (so 500GB usable), and I want to change it to 3*500GB as RAID5 (so 1000GB usable). I see two ways of achieving this:


1. a- fail one of the RAID1 partitions

b- create a degraded 3 partition RAID5 (ie. 2 partitions present, and one missing) using the "failed" one and the new one

c- copy the data from the remaining RAID1 partition

d- delete the RAID1 array

e- add the remaining partition to the RAID5 and let it resync


2. a- convert the RAID1 to a 2 drive RAID5

b- grow the RAID5 by adding the new partition


Downtime is not a concern, and I'm not bothered about speed, but my concern is - what if a drive dies during the exhausting resync (with way 1) or the probably even more exhausting grow (with way 2)?

Way 1 would obviously lead to data loss if a drive dies. Does way 2 suffer the same problem? If not I'll obviously take that route. But if way2 also suffers the problem, which route would you recommend?

Thanks very much! Steffen

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 August 2010, 22:46 UTC)

If you just convert straight to RAID5 with

mdadm /dev/md0 -a /dev/newdev
mdadm --grow /dev/md0 --level=5 --raid-devices=3

then a problem with 'newdev' will mean that you lose redundancy but do not lose your data (until another device fails). A problem with one of the original devices would be harder to recover from... I'm not sure off hand exactly what would happen; it might survive, but as I haven't tested it I cannot be sure.

If you want to be extra safe, I would do the following, assuming the old drives are /dev/A and /dev/B and the new drive is /dev/C, and the array is /dev/md0.

mdadm -S /dev/md0
mdadm -C /dev/md0 -l5 -n2 /dev/B missing -x1 /dev/C
wait for resync
mdadm -G /dev/md0 -n 3
wait for reshape to complete.

Now you have all your data still safe on /dev/A, and assuming nothing has gone wrong, all your data is also safe on /dev/md0 - though the array is degraded. Further, you have run a resync and a reshape on /dev/B and /dev/C so you should have some confidence in them.

Now

mdadm /dev/md0 -a /dev/A

This will recover /dev/A which should be safe because we believe /dev/B and /dev/C are healthy having exercised them heavily. If /dev/A has a problem, it will fall out of the array and your data will still be safe.

Though on reflection, I suspect the first option would probably handle errors well.... maybe I should add that to my testing.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 August 2010, 03:46 UTC)

Neil,

Thanks so much for the advice. I decided to go ahead and reshape for chunk size for the experience, so when I've got a raid full of important data I know what to do about adding a drive. My computer locked up on me at around 6%, but I was able to recover using:

/[pathto]/mdadm -S /md0
/[pathto]/mdadm --assemble /md0 /dev/sdb /dev/sdc /dev/sdd --backup-file [path_to_backup_file]

Ubuntu tried to reassemble on boot, but failed. After manually assembling and mounting everything works as normal.

Another thing that hung me up... your guide said to use --chunk-size but the parameter is actually --chunk.

Other than those few hiccups it's going smoothly. I'm thoroughly impressed that I can still use md0 as it rebuilds. I'm able to play music and video from the raid while encoding a video to the raid with no perceptible slowdown. The only way I know it's slowing down is from the readout from /proc/mdstat.

Last thing (for now)... is there any reason to keep the packaged version from ubuntu if I've got the latest from source?

Thanks again! justinmteal

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 August 2010, 04:26 UTC)

Locking up is a bit of a worry, but if it was locked-up in a way the "mdadm -S" still worked it cannot have been locked too hard. I wonder what the cause was..

While I try to keep new version of mdadm mostly backwards-compatible with old versions that isn't always possible. It is possible - though maybe not likely - that Ubuntu init.d scripts depend on some behaviour of mdadm that has changed between the release that they ship and the current release. So it is always good to be cautious.

It should be fairly safe to simply install the mdadm that you have built and see what breaks. If nothing: you are happy. If something does break, you should still be able to "apt-get install --reinstall mdadm" (or whatever the command is) to get the ubuntu mdadm back in place.

NeilBrown

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 August 2010, 13:11 UTC)

I am trying to shrink a raid 5 and reduce the nr of disks.

I am using mdadm v3.1.2 and kernel 2.6.32.12.

I have successfully shrunk the raid using "mdadm --grow --array-size=2600000000 /dev/md0" and the file system seems ok after this.

I try to run "mdadm --grow --raid-disks=4 /dev/md0 /mdadm.backupfile" to reduce the number of disks, but I get "mdadm: --add cannot be used with other geometry changes in --grow mode" as a response. I have done this before a couple of times, but don't recall getting this message. I don't understand it; can someone tell me what is meant and what I can do about it?

This output might be useful:

[root@cerberos/]# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Sep 8 17:28:29 2009
     Raid Level : raid5
     Array Size : 2600000000 (2479.55 GiB 2662.40 GB)
  Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Aug 3 14:57:20 2010
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 128K

           UUID : 7983e71a:77a9a523:89cd91be:8d911c05
         Events : 0.793229

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       48        1      active sync   /dev/sdd
       2       8       16        2      active sync   /dev/sdb
       3       8       32        3      active sync   /dev/sdc
       4       8       80        4      active sync   /dev/sdf
       5       8       96        5      active sync   /dev/sdg

Thanks in advance (and a big THANKS for the great mdadm that lets me do everything I can think of to my arrays without ever causing me problems).


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 August 2010, 19:14 UTC)

Thanks for your quick reply, I'll try it now. After that I've got a 4th partition to add to the 3*1TB RAID5, and once that's all done a badblocks check of 2 big drives... so when all this is finished I'll post it to the wiki I indicated in my earlier post.

Once again, thank you and everyone else who worked on this so much for your work!

Steffen

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 August 2010, 23:31 UTC)

I think you have a simple error in your usage.

You used

mdadm --grow --raid-disks=4 /dev/md0 /mdadm.backupfile
but should use
mdadm --grow --raid-disks=4 /dev/md0 --backup-file /mdadm.backupfile

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (25 September 2010, 00:38 UTC)
Thank you Neil for all your effort and continued improvements on a wonderful tool! The "new" raid level reshaping abilities are truly impressive.

*edit* I resolved my problem (below) so I guess it can be ignored ;D (let me know if you'd rather I edit it down?)

*edit2* Spoke too soon: now I've got massive drive corruption... I slowed the resync/reshape to 0K while I back some files up, in case it is the reshape causing the corruption.

Sorry if this is the wrong place to post since it's technically not a raid5->raid6 problem (I've been reading this blog for a while now).

On a Fedora 12 system I recently grew the raid5 to raid6; then, while adding a 7th 2TB drive to the new raid6, a drive was kicked offline (the new drive, I think) 6-8 hrs into the reshape.

Unfortunately, while diagnosing the drive for reliability (booting knoppix, etc.) the OS drive became corrupted! I installed fedora 13 (so I wouldn't need to upgrade mdadm to 3.1.2) on a new drive and now I see that the raid6 array is... "degraded" but not started, with a "removed" drive. "State : active, degraded, Not Started"

Sorry if I'm being overly cautious (in asking for advice) but I wanted to confirm my next steps.

# The 1st thing I tried
1) mdadm /dev/md126 --re-add /dev/sdi1
   mdadm: --re-add for /dev/sdi1 to /dev/md126 is not possible

# Try without the "removed" disk
2) mdadm --stop /dev/md126
   mdadm --assemble /dev/md126 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdj1
   mdadm: /dev/md126 assembled from 6 drives - not enough to start the array while not clean - consider --force.

# Try with the "removed" disk
3) mdadm --stop /dev/md126
   mdadm --assemble /dev/md126 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1
   mdadm: /dev/md126 assembled from 6 drives and 1 spare - not enough to start the array while not clean - consider --force.

Q: Should I use --force?
-------------------------
mdadm --detail /dev/md126
/dev/md126:
        Version : 1.2
  Creation Time : Mon Jul 19 16:27:34 2010
     Raid Level : raid6
  Used Dev Size : 1927798784 (1838.49 GiB 1974.07 GB)
   Raid Devices : 7
  Total Devices : 6
    Persistence : Superblock is persistent
    Update Time : Wed Sep 22 01:24:28 2010
          State : active, degraded, Not Started
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 512K
  Delta Devices : 1, (6->7)
           Name : midori:128 (local to host midori)
           UUID : b2945795:978d8c97:9451e9f5:3191ab23
         Events : 49541

    Number   Major   Minor   RaidDevice   State
       0       8       97        0        active sync   /dev/sdg1
       1       8      113        1        active sync   /dev/sdh1
       2       8       81        2        active sync   /dev/sdf1
       4       8       65        3        active sync   /dev/sde1
       6       8       17        4        active sync   /dev/sdb1
       5       0        0        5        removed
       8       8      145        6        active sync   /dev/sdj1

*UPDATE*
mdadm --assemble /dev/md128 /dev/sdf1 /dev/sdg1 /dev/sdd1 /dev/sde1 /dev/sdb1 /dev/sdi1 /dev/sdh1
mdadm: /dev/md128 assembled from 6 drives and 1 spare - not enough to start the array while not clean - consider --force.
mdadm --run /dev/md128
mdadm: failed to run array /dev/md128: Input/output error

Still afraid to run '--force'

OK I finally ran '--force'
mdadm --assemble /dev/md128 /dev/sdf1 /dev/sdg1 /dev/sdd1 /dev/sde1 /dev/sdb1 /dev/sdi1 /dev/sdh1 --force
mdadm: cannot open device /dev/sdf1: Device or resource busy
mdadm: /dev/sdf1 has no superblock - assembly aborted

*edit* Oops, raid not "stopped"; let's try that again
mdadm --assemble /dev/md128 /dev/sdf1 /dev/sdg1 /dev/sdd1 /dev/sde1 /dev/sdb1 /dev/sdi1 /dev/sdh1 --force
mdadm: /dev/md128 has been started with 6 drives (out of 7) and 1 spare.

cat /proc/mdstat
Personalities : [raid0] [raid6] [raid5] [raid4]
md128 : active raid6 sdf1[0] sdi1[8] sdb1[6] sde1[4] sdd1[2] sdg1[1]
      7711195136 blocks super 1.2 level 6, 512k chunk, algorithm 2 [7/6] [uuuuu_u]
      [=>...................]  reshape =  5.1% (99625660/1927798784) finish=2456.1min speed=12404K/sec

Guess I resolved this on my own :D




[permalink][hide]

Comment (03 October 2010, 04:48 UTC)
I've been generally quite pleased with mdadm for managing Linux RAIDs, but there is one shape-changing process I have not been happy with, and I thought I would see if there is a better way to do it. I refer to growing a RAID (raid-5 in this case) not by adding disks but by replacing the disks with larger disks over time. As you know, disks keep getting cheaper, so I have been through this process 3 times now, jumping my 3 disks from 300GB to 1TB in various steps. I don't allocate the whole disk to the RAID, I use partitions, but that doesn't change things much.

The current method involves replacing all disks, one by one, doing a rebuild each time. But I typically replace disks one at a time, always buying a new, larger disk at the new pricing sweet spot. Doing 3 (or N) rebuilds at once is a risky proposition: any failure during all those rebuilds and you have hosed your RAID -- even though the data to rebuild is still there on the old drive you marked as faulty.

So when you have three 500GB disks and you buy a 750GB, you still only use 500GB of the 750GB drive of course. You could allocate the whole drive to the RAID, but usually 250GB of disk is worthwhile, so it is much nicer to use that 250GB for something else temporarily, like online backup (which does not need redundancy) or scratch space etc. Then you do it with a 2nd drive. Finally the day comes when you have 2x750 and a 500 and you replace the 500 -- not with a 750, but with a 1TB because prices have dropped. So now when you put in the 1TB you can either allocate 750GB or the whole 1TB to the RAID -- again, that extra disk is handy, why leave it unused for a year? But either way you allocate at least 750 and it rebuilds a 500GB raid component into it. Then you go to those two 750 drives which both have only 500 in use.

You have to rebuild them, too. You remove the partition and the spare one after it and create a new larger 750 partition, and re-add it to the raid. And lo, it rebuilds everything, even though it is in general writing back to the drive exactly what is already on it. About all that really happens is it moves and updates the metadata block, if I understand this right. This is the part that bothers me. I put my array at risk by having it run degraded and rebuild to do almost nothing.

Then I do it again for the other drive. And finally I have a raid on 3 750gb partitions using 500gb of each, and I can grow that to size=max, and I have my 3x750gb RAID. With yet another operation.

What could fix this?

a) A tool to allow me to grow a RAID partition into empty space beyond it on the disk. I.e. modify a program like gparted to be able to grow RAID component partitions up into empty space, updating their metadata. Or possibly even move them if the RAID is offline.

b) Since this is obviously easier to do if the metadata is at the front, encourage people to create RAID5s they plan to move with version 1.2 metadata. And/or offer a means to change the metadata format of an array from 0.9 to 1.2

These functions would help a lot, but another function in the kernel engine would improve robustness of the RAID, namely a way to hot replace a working drive. In a hot replace, the drive to be removed would be in the system along with the new drive as a spare. Instead of marking the drive "faulty" we would mark it to be replaced. The system would then start building on the new spare, either as normal or just by copying blocks from the original, whichever works best. However, in the event of any read error during this procedure, all the original drives would be available to get the parity info to survive the failure.

Without this, we have a situation during a replacement where the RAID has made our data less protected rather than more, which is not the philosophy of redundant drives.

Thoughts? To my mind this is the most common type of reshaping I have actually had to do, so while it's nice to convert a 5 to a 6, this is what I think many people would like.

- Brad Templeton

[permalink][hide]

Re: A better way to grow a RAID with larger drives (03 October 2010, 05:13 UTC)

This is actually why I usually build my RAID-4/5/6 setups with an extra MB or so 'wasted' by wrapping each individual drive in a RAID-1, and building the RAID-4/5/6 out of those single-drive RAID-1s.

Then growing to 'larger' drives involves no drive ever being marked faulty: I just add the new drive to the appropriate RAID-1, wait for it to sync with a bitmap (which is LIGHTNING fast since it's a pure linear read/write, no math involved) then remove the old, smaller drive from the RAID-1 and bump the size of the RAID-1.

It's a tiny bit of overhead and wasted space to cart around, but I disable bitmapping on the RAID-1 except when/if I'm swapping drives around, and leave bitmapping enabled on the RAID-4/5/6 regardless.
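
Roughly, the setup looks like this (device and md names are just examples, and the commands are from memory rather than cut-and-paste):

# wrap each physical drive in a single-member RAID-1
mdadm --create /dev/md10 --level=1 --raid-devices=1 --force /dev/sda1
mdadm --create /dev/md11 --level=1 --raid-devices=1 --force /dev/sdb1
mdadm --create /dev/md12 --level=1 --raid-devices=1 --force /dev/sdc1

# build the real array out of the wrappers
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/md10 /dev/md11 /dev/md12

# later, to swap one drive for a bigger one without ever degrading md0:
mdadm /dev/md10 --add /dev/sdd1                   # the new, larger drive
mdadm --grow /dev/md10 --raid-devices=2           # mirror onto it; wait for the sync
mdadm /dev/md10 --fail /dev/sda1 --remove /dev/sda1
mdadm --grow /dev/md10 --raid-devices=1 --force   # back to a single-member mirror
mdadm --grow /dev/md10 --size=max                 # let md10 use the whole new partition

Once every wrapper has been bumped like that, the RAID-4/5/6 on top can be expanded with its own --grow --size=max.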

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (05 October 2010, 04:14 UTC)

hi Brad,

The short answer is that I completely agree.

The longer answer includes:

1/ if you can make the underlying device larger while md is still using it, you can tell md to see more of it by writing to /sys/block/mdXXX/md/dev-YYY/size. Writing '0' means 'use all the space'. After doing this to all devices, you can "mdadm --grow --size max" to make use of it (see the sketch below this list). However it is not possible to resize regular partitions, only 'dm' partitions. Certainly it would also help if mdadm did more of this for you.

2/ The default metadata with most recent mdadm is 1.2 - at the start of the device. However Debian patched it back to 1.0 because grub/lilo cannot boot from a 1.2 array.

3/ Moving the metadata to the front requires shuffling all the data down a bit. That could be done, but would take a lot of time (and some new code). I'll keep it in mind.

4/ With 1.1 or 1.2 metadata this is very easy to do off-line. Simply stop the array, resize all the devices, then assemble the array with "--update=devicesize" (also sketched below).

5/ To do this off-line with 1.0 or 0.90 metadata you would need to backup the metadata, then resize the device, then restore the metadata. I hope to add that functionality to mdadm-3.2

6/ I could probably add an ioctl to the block layer so that partitions can be resized... I'll look into that.
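
To make (1) and (4) concrete, a rough sketch (the md name, member names and sizes are placeholders, not taken from any particular array):

# (1) on-line: let md see all of each already-enlarged member, then grow the array
echo 0 > /sys/block/md0/md/dev-sda1/size     # '0' means "use all available space"
echo 0 > /sys/block/md0/md/dev-sdb1/size
echo 0 > /sys/block/md0/md/dev-sdc1/size
mdadm --grow /dev/md0 --size=max

# (4) off-line, with 1.1/1.2 metadata: stop, resize the members, re-assemble
mdadm --stop /dev/md0
# ... enlarge each member device/partition here ...
mdadm --assemble /dev/md0 --update=devicesize /dev/sda1 /dev/sdb1 /dev/sdc1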

Thanks for your valuable feedback.

NeilBrown

[permalink][hide]

Raid 1 to 5 and back (25 October 2010, 14:57 UTC)

Hi,

I've just converted a two-drive raid1 to a raid5 using "mdadm --grow --level 5 /dev/md3", and apparently this worked well:

morris:/usr/local/src/mdadm-3.1.4# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid5 sda3[0] sdb3[1]
      961613184 blocks level 5, 64k chunk, algorithm 2 [2/2] [uu]

Now I found I'd rather not done that, and thought I could easily revert this, but:

morris:/usr/local/src/mdadm-3.1.4# ./mdadm --grow /dev/md3 --level=1
mdadm: /dev/md3: could not set level to raid1

This is with kernel 2.6.32-bpo.2-686 on Debian 5.0.4, and mdadm 3.1.4.

I assume I could stop the raid and simply re-assemble a new raid1 from the two partitions, but I'm a bit reluctant to try. Is there another way to get my raid1 back?

Andreas

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (26 October 2010, 00:10 UTC)

You need 2.6.33 to be able to convert a RAID5 to a RAID1.

NeilBrown

[permalink][hide]

Comment (28 October 2010, 20:45 UTC)
I'm having issues trying to convert a raid 5 to a raid 6 array. I've got a hot spare in the raid 5 config and the array is clean. Each drive is 1T. I haven't been able to find anything specific to this in my searching. Any help would be appreciated. Thanks.

#mdadm --grow /dev/md2 --level=6 --raid-disks=6 --backup-file=/home/md.backup
mdadm: /dev/md2: could not set level to raid6

#mdadm --version
mdadm - v3.1.4 - 31st August 2010

#cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sda1[3] sdb1[4] sdc1[5](S) sdd1[2] sde1[1] sdf1[0]
      3907039744 blocks level 5, 128k chunk, algorithm 2 [5/5] [uuuuu]

#mdadm --detail /dev/md2
/dev/md2:
        Version : 0.90
  Creation Time : Sat Aug 7 23:22:28 2010
     Raid Level : raid5
     Array Size : 3907039744 (3726.04 GiB 4000.81 GB)
  Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
   Raid Devices : 5
  Total Devices : 6
Preferred Minor : 2
    Persistence : Superblock is persistent

    Update Time : Thu Oct 28 16:00:16 2010
          State : clean
 Active Devices : 5
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 128K

           UUID : ff4ec8ee:95d2230f:84fc2b17:ec59dd3f
         Events : 0.641774

    Number   Major   Minor   RaidDevice   State
       0       8       81        0        active sync   /dev/sdf1
       1       8       65        1        active sync   /dev/sde1
       2       8       49        2        active sync   /dev/sdd1
       3       8        1        3        active sync   /dev/sda1
       4       8       17        4        active sync   /dev/sdb1
       5       8       33        -        spare         /dev/sdc1






[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 October 2010, 06:10 UTC)

What kernel are you running? You need at least 2.6.30, and preferably 2.6.32.

If you have 2.6.30 or later, what do you see in

dmesg | tail -50

after you try and fail to convert to RAID6?


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 October 2010, 06:53 UTC)

I don't know that it matters, but I use --raid-devices instead of --raid-disks:

mdadm --grow /dev/md2 --level=6 --raid-devices=6 --backup-file=/home/md.backup

Note: I'm in hour 33 with 1000 min remaining, at 65% of a 7->8 RAID6 reshape... CPU is 60-70% idle and I'm only getting 7-10MB/s (speed_min set to 15MB/s); wish I could speed it up (it started at 13MB/s).

[permalink][hide]

Comment (29 October 2010, 06:57 UTC)
Thanks for the reply. That would be the problem, I think. It's the stock CentOS 5.5 kernel. I have enough space externally to copy all of the data off. I'm thinking of doing that, deleting the raid5 setup, and creating a raid6 setup from scratch given the high number of events in the array. It will also allow me to use a more recent metadata format. What do you think, and if I do that do I still need to upgrade my kernel? Thanks.

# uname -a
Linux lexicon 2.6.18-194.el5 #1 SMP Fri Apr 2 14:58:14 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 October 2010, 07:49 UTC)

"--raid-disks" and "--raid-devices" and "-n" are all synonymous.

As the reshape progresses, the distance that data needs to be moved increases, so it is possible that seek times increase and throughput goes down.

You could try increasing /sys/block/mdXX/md/stripe_cache_size.

A larger cache would allow it to read more before seeking, then write more before seeking back.
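
For example (the md name here is a placeholder; pick a value that suits your RAM):

cat /sys/block/md2/md/stripe_cache_size        # the default is 256 on most kernels
echo 4096 > /sys/block/md2/md/stripe_cache_size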

If you see any performance change, do let us know.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 October 2010, 08:27 UTC)
It was 565; I changed it to 1024 and thought I saw a momentary increase (14M?), but then it settled back to similar values... I then set it to 2048 (a good value?) and now I'm noticing a cyclic pattern of 7-14MB/s every 1-2 min, which I guess is an improvement since it was staying around 7 before (possibly 7-10), although the completion estimate is more varied. Thanks for the advice! Is it safe to lower that value from 4096, or could it cause corruption?

P.S. I've been using the raid-disks option after reading the much earlier post. I only just recently got confident enough to play around with the 25Gx8 partition at the end of my 2TB drives. I was able to add drives, RAID 5->6, RAID 6->5, RAID 5->6! Very cool!

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 October 2010, 08:35 UTC)

Changing the stripe_cache_size will not risk causing corruption.

If you set it too low the reshape will stop progressing. You can then set it to a larger value and let it continue.

If you set it too high you risk tying up all of your system memory in the cache. In this case your system might enter a swap-storm and it might be rather hard to set it back to a lower value.

The amount of memory used per cache entry is about 4K times the number of devices in the array.
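
(So, as illustrative arithmetic only: an 8-device array with stripe_cache_size set to 4096 would tie up roughly 4096 x 8 x 4K = 128MB of memory.)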


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (31 October 2010, 05:39 UTC)

I am stuck on what would be the best procedure to follow in my current situation. I have an LVM on top of 2x RAID5 (6x 1TB each = 12 drives), and this gave me 9.5TB of usable space... which is now full. :(

I want to replace one RAID5 with 1.5TB drives, one at a time, because I have no more HDD slots available. My idea was to fail a drive and then create 2 partitions on the new drive (1TB + 500GB), making sure the 1TB partition was as large as the existing drives. Allow the repair to finish, and repeat for the remaining drives in the array. Once done, add the remaining 6x 500GB to grow the array. I know this will be time-consuming, but is this still do-able?

Otherwise, there's the option of failing a drive and adding the new 1.5TB drive to only use 1TB of it. Repeat until all 6 are swapped and repaired. But then how do I increase the array to use the 6x 500GB which is there but unseen? I was then thinking of replacing the other array with 2TB drives.

Copying all data off to another PC or collection of HDD would be difficult (home user) and a last resort. It also means that data from both arrays needs to be copied. If I did restart, would I be better off with 2x RAID6 array with 5 drives each (+1 each for raid6), or 1 RAID6 array with 11 drives (+1 for raid6)?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (31 October 2010, 05:45 UTC)

Simply replacing each drive-slot one by one with the 1.5TB drives will work, in my experience. The RAID5 will simply rebuild peacefully as long as the drive is equal to or larger than the existing drive-size for the array, and once all drives are replaced with larger ones you can initiate a one-time raid expansion with the --grow option and it will re-size itself to the now-larger drives.

So failing one drive at a time and replacing it with a new 1.5TB drive carrying a single 1.5TB partition for mdadm to gobble up is actually your best approach, in my experience (roughly as sketched below). mdadm won't try to grow the array until you tell it to, and won't let you tell it to until all drives have been swapped for the new, larger size.
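
Something like this, repeated per slot (the array and device names are just examples, not your actual ones):

mdadm /dev/md0 --fail /dev/sdX1 --remove /dev/sdX1
# physically swap in the 1.5TB drive, create one big partition on it, then:
mdadm /dev/md0 --add /dev/sdX1
# wait for the rebuild to finish before touching the next slot

# once every member is on a bigger drive:
mdadm --grow /dev/md0 --size=max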

Obviously you'll be running without redundancy during each rebuild, however. RAID6 is a lot safer once you get above about 5-6 drives in a single array, in my experience.

- WolfWings

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (31 October 2010, 23:38 UTC)
Speaking as a user with recent successes and failures (as seen in previous posts):

I "believe" I was able to convert a 1TBx6 into a 2TBx6or7 by repartitioning (delete/recreate) and growing the array. A few suggestions (0) Upgrade mdadm to the latest version (compile it) (1) backup (2) backup (3) SCRUB the new drive (badblocks -w, or dd /dev/zero), scrub again. (4) convert to RAID6 (add a new controller, hang drives outside the case, if you have to) (5) then add new drives.

I say "believe" because at least twice now I've gotten bit by by a failure midway through reshaping the raid (converting raid5-6 or adding a drive) and had trouble continuing the process after reboot. The first time I forced the array which probably worked, unfortunately, my new sata controller (added after the failure) was silently corrupting the array so I'm not sure...

Note to Neil: Just today I had a problem restarting an array of 6x25G (thankfully), and the "solution" was to upgrade from "mdadm - v3.1.2 - 10th March 2010" to "mdadm - v3.1.4 - 31st August 2010" (via git). Prior to the upgrade, assembling using a backup file gave a seg-fault (sorry, I seem to have lost the trace); after the upgrade the array assembled with no problems. So the moral of the story is: use the latest version of mdadm!

[permalink][hide]

Querying RAID10 layout and geometry (09 November 2010, 18:36 UTC)

I have 6 drives as RAID10, and want to send both /dev/sde and /dev/sdf back to be replaced, so I just want to know the real layout of the array to know if it'll still run without these until the new ones arrive.

I've used mdadm on fc12-14 for a year, hotswapped a couple of failed drives and got the hang of its quirks.

After much research and digging I found something in the man page under the -p option which seems to me rather obscure. The default creation layout is the undefined term 'left-symmetric' and default layout is 'n2', which says in the man: "Multiple copies of one data block are at similar offsets in different devices."

But which devices???

mdadm --detail does not help me know which devices are pairs, and which I can pull out while keeping the array running, nor does the man page explain it clearly.

mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Tue Jan 19 02:43:36 2010
     Raid Level : raid10
     Array Size : 2197714752 (2095.90 GiB 2250.46 GB)
  Used Dev Size : 732571584 (698.63 GiB 750.15 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Nov 9 18:30:57 2010
          State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2
     Chunk Size : 64K

           UUID : 8ae77b89:72492206:bfe78010:bc810f04
         Events : 0.10143148

    Number   Major   Minor   RaidDevice   State
       0       8        1        0        active sync   /dev/sda1
       1       8       17        1        active sync   /dev/sdb1
       2       8       33        2        active sync   /dev/sdc1
       3       8       49        3        active sync   /dev/sdd1
       4       8       65        4        active sync   /dev/sde1
       5       8       81        5        active sync   /dev/sdf1

So which devices are pairs? This 'detail' doesn't tell me anything without some missing assumptions!

{ab}{cd}{ef} or {ad}{be}{cf} or something else? I assume 3 stripes, but it could even be 3 mirrors of 2way stripes!

Could the --detail option not make a better picture of an array's organisation? For example:

                 a       b       c
stripeset 0   {sda1}  {sdb1}  {sdc1}
stripeset 1   {sdd1}  {sde1}  {sdf1}

or

                 a       b
stripeset 0   {sda1}  {sdb1}
stripeset 1   {sdc1}  {sdd1}
stripeset 2   {sde1}  {sdf1}

What about other exotic layouts, like mirrored raid5/6:

                 a       b       c       d       e
stripeset 0   {sda1}  {sdb1}  {sdc1}  {sdd1}  {sde1}
stripeset 1   {sdf1}  {sdg1}  {sdh1}  {sdi1}  {sdj1}

or 4 mirrors of raid5:

                 a       b       c
stripeset 0   {sda1}  {sdb1}  {sdc1}
stripeset 1   {sdd1}  {sde1}  {sdf1}
stripeset 2   {sdg1}  {sdh1}  {sdi1}
stripeset 3   {sdj1}  {sdk1}  {sdl1}

This kind of picture would help me a lot to visualise which devices are where.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (15 November 2010, 14:18 UTC)

I have a raid of 7x2TB disks running in raid6 with mdadm. Works great and has been doing so for some time now! Thanks for the great mdadm software and the work you put into it!

I am thinking about adding a couple of new 2TB disks, and see that there are now "Advanced Format" disks available that use 4K sectors instead of the old 512 bytes. I have read that this might be a bad thing for small files, and I guess that impacts the usage of raid such as mdadm? I read this: "When creating a RAID array from Advanced Format disks, you don't need to take any extra steps. Because the RAID alignment values are multiples of the 4096-byte alignment required by Advanced Format drives, both technologies' needs are met if you align partitions as for a RAID array of disks with 512-byte physical sectors" (source: https://www.ibm.com/developerworks/linux/library/l-4kb-sector-disks/)

Any input on how this would impact the raids would be great! Do I have to tune chunk-size with this in mind? Should I avoid mixing 512-byte drives with 4K?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (15 November 2010, 20:30 UTC)

Answering the question about raid10 layout:

near=2

means there are 2 copies of all data, and they are near each other. So on 6 disks that is

A A B B C C
D D E E F F

and so on. Reading that against the RaidDevice numbers in your --detail output, the mirrored pairs would be {sda1,sdb1}, {sdc1,sdd1} and {sde1,sdf1}.

Yes, I guess --detail could be more explicit. Patches always welcome.

[permalink][hide]

Re: Using Advanced Technology 4K drives (15 November 2010, 20:33 UTC)

You should have no trouble using drives with 4K sectors. md RAID5 and RAID6 already do almost all IO in multiples of 4K, so no change in behaviour is needed.

I think the most likely cause of problems is if you use an old version of fdisk to partition the drive, and get partitions that are not aligned to the 4K physical sectors. So when (if) you make partitions, just check they are aligned properly, and everything should be fine.
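
For example, one way to sanity-check alignment (device and partition numbers here are only illustrative):

parted /dev/sdb align-check optimal 1    # reports whether partition 1 is optimally aligned
fdisk -l -u /dev/sdb                     # partition start sectors should be multiples of 8 (i.e. 4KiB)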


[permalink][hide]

Converting RAID5 to RAID6 and back: code backported into current "stable" 2.6.27.x kernels? (16 November 2010, 17:33 UTC)

First of all, very impressive work here. Major kudos to Neil Brown for adding and maintaining such well-written and useful code to the Linux kernel.

Second, I would like to know whether the code needed for the RAID5->RAID6 and RAID6->RAID5 conversions has been backported into the current "stable" 2.6.27.x (i.e. 2.6.27.55 as of this moment). Also, will mdadm >= 3.1 (which I understand is also needed for these conversions) work with this "stable" kernel?

Thanks, -- Durval Menezes

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (30 November 2010, 11:33 UTC)

1/ No. We don't backport new functionality. The stable series only get serious bug fixes and occasionally hardware-enablement which just requires adding a device id to a list. Various distros base their enterprise kernels on the stable releases and they backport functionality as required for business reasons, but these backports do not get into the kernel.org kernels.

2/ Yes, mdadm-3.1 should work with old kernels (and newer kernels). Obviously functionality that is not present in older kernels cannot be used, but mdadm should fail gracefully when you try to use that functionality.

NeilBrown

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 November 2010, 03:17 UTC)

Is there any reason why a chunk-size change would be significantly slower than a resync or grow command? I am changing from a chunk of 128 to 256 on a 10TB array. It's going at about 3MB/s versus my usual speed of at least 30MB/s. It's using a backup file which is only 41MB, but it is saved on a 300GB 10k Raptor drive, so it should be able to do better than 3MB/s, I'd think.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (30 November 2010, 11:49 UTC)

Yes, chunksize changing is much slower than resync or a 'grow' that makes the array larger.

This is because the data needs to be backed-up continuously in case of a crash. So all the data is written twice, once to the backup file, and once back to the array.

Partly because of this, it also moves data in smaller chunks, so it is seeking back-and-forwards a lot more.

You might be able to increase the speed a bit by increasing stripe_cache_size in /sys/block/mdXX/md/, but don't make it too big or you will exhaust memory. A few thousand is probably OK. Don't expect too much gain from doing this though.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (29 November 2010, 17:12 UTC)
Thanks for a great summary of this functionality in mdadm (as well as your contributions to mdadm!). I would love to see new posts like this as the todo items mentioned here are tackled. I really can't stress enough how nice it is to have someone provide these concise technical summaries of how mdadm handles reshaping :)

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (01 December 2010, 20:20 UTC)

Just started converting my 7-disk raid5 to an 8-disk raid6 array, but it's going to take forever. Maybe I shouldn't have used an NFS backup file?




europa:/sys/block/md0/md# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd2[7] sdc2[5] sdb2[6] sde2[0] sda2[4] sdh2[3] sdg2[2] sdf2[1]
      8787875712 blocks super 0.91 level 6, 64k chunk, algorithm 18 [8/7] [uuuuuuu_]
      [>....................]  reshape =  2.1% (31239936/1464645952) finish=30918.0min speed=772K/sec
europa:/sys/block/md0/md# cat sync_speed_max
200000 (system)
europa:/sys/block/md0/md# cat sync_speed_min
200000 (local)
europa:/sys/block/md0/md# mdadm --version
mdadm - v3.1.4 - 31st August 2010
europa:/sys/block/md0/md# uname -a
Linux europa 2.6.32-5-amd64 #1 SMP Fri Sep 17 21:50:19 UTC 2010 x86_64 GNU/Linux
europa:/sys/block/md0/md# uptime
 08:57:48 up 51 days, 11:37, 7 users, load average: 1.30, 1.31, 1.27

This is not exactly a powerhouse machine (E1400 Celeron), but during monthly array checks it's usually in the 14-15 MB/s range. Guessing that since I was growing the array size at the same time as going from raid5 to raid6, I didn't need the backup-file? Any way for me to speed this up? The NFS box is connected @ 100Mbit.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (01 December 2010, 20:23 UTC)

No, NFS would not be the best idea for a backup file.

However you could stop the array, copy the backup file to a faster device, and assemble the array again referring to the new backup file (if you have a faster device).

Going from 7 disk RAID5 to 8 disk RAID6 isn't really growing the array. The size stays the same. So everything has to be relocated in-place, so a backup file is needed the whole time.
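
(To see why: a 7-drive RAID5 stores 7-1 = 6 drives' worth of data, and an 8-drive RAID6 stores 8-2 = 6 drives' worth as well, so the capacity is unchanged and every stripe has to be rewritten in place.)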

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (01 December 2010, 22:35 UTC)

All the sata ports are occupied, not sure how much faster an external USB2 drive would be?

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (01 December 2010, 22:41 UTC)

USB2 has a data rate of 480Mb/sec which is nearly 5 times your network speed. And I suspect there is less overhead for usb-storage than for nfs. So my guess is that a rotating drive (not a usb-flash stick) connected by USB2 would be faster than your NFS server.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 00:27 UTC)

So just kill the converting mdadm process, copy the backup file to the usb drive then start it up again with the same command line parameters? Guess I'll grab a faster usb drive from work tomorrow & repartition it.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 00:28 UTC)

No, that wouldn't work. You need to stop the array and re-assemble it.

Though I guess that will be a challenge as it is presumably your root (and only) filesystem.

That puts you in a spot of bother... but I cannot see how you could do that as you cannot boot from RAID5 so you cannot have it as your only filesystem.

Maybe we need to rewind a bit... what exactly is your configuration of devices and arrays and filesystems, and what do you boot off?

[permalink][hide]

Comment (02 December 2010, 00:39 UTC)

That puts you in a spot of bother... but I cannot see how you could do that as you cannot boot from RAID5 so you cannot have it as your only filesystem.

I've got a couple of file-servers that only have a RAID-1 partition for grub. The entire OS and main storage is on the RAID5/6 array. Only the 2-5MB raw kernel and initrd images need to be off the RAID-1. Which would more-or-less stick you in this situation with trying to re-direct the backup file for a RAID5->6 conversion.

If all else fails though, a Gentoo Minimal Install CD generally has the latest MDADM tools installed, then the conversion can occur 'offline' instead via the USB storage to hold the backup file.

- WolfWings

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 00:59 UTC)

Maybe the backup file could go in /boot ?? That won't help you with booting though.

I think that using a rescue CD is the only way to finish the reshape if you ever need to shutdown, which you would need to do to move the backup file.

The rescue CD would need to have access to the backup file, and you would assemble the array with

mdadm --assemble /dev/md0 --backup-file=/path/to/file /dev/disk1 /dev/disk2 .....
(the order of the disks isn't important).

Then just wait for it to complete. It should go faster than with the backup file on NFS, but it still won't go nearly as fast as a resync.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 01:12 UTC)

Rewinding:

europa:~# df
Filesystem               1K-blocks       Used  Available Use% Mounted on
/dev/mapper/array-root    19223252    2342136   15904632  13% /
/dev/sde1                   474440      43692     406251  10% /boot
/dev/mapper/array-store 8630753568 5064478176 3565005632  59% /storage

So basically each of the 1.5TB drives has 2 partitions: a small ~500MB one for either /boot or swap, and the rest for md0 (+ LVM on top). The 3 newest drives have the ~500MB partitions unused. So I guess I could cobble together a totally non-raid system, but it would be vulnerable as hell to disk failure. Now that I look at it, I should have put the backup file in /boot; I just wasn't thinking the process through properly.

Options?


1. Let it take 20 days to reshape with the nfs backup file.
2. Use a live boot, take the raid offline and reshape.
3. If possible, stop the reshape, copy the backup file to /boot and start it up again.

I'd much rather have an easy fix that's safe, but that's not always possible.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 01:39 UTC)

I'm afraid I cannot offer an easy fix. I can think of various ways I can make something like this safer in the long term, but they don't help you now.

If you can afford a couple of days down time, I would probably use a rescue CD and do it all off-line. I would probably still use a separate USB drive as that would stop the main array drives from seeking to update the backup file. If you cannot afford that, just wait.

I could possibly 'fix' mdadm so that it is possible to kill it and restart it, but I can't make any promises about when that will happen. I'll let you know if I do end up with something I trust though.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (02 December 2010, 06:31 UTC)

Well, if it's not in an overly vulnerable state, I don't necessarily need the reshape completed quickly. I think what I will do is let it run, but also grab an external USB DVD-ROM + burn a rescue CD + grab a faster 3.5" external drive just in case. The machine is headless and not exactly in a good spot to be worked on. Next time I'll know better! Thanks for the help and keep up the good work; devs deserve praise!

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 December 2010, 13:23 UTC)

Hi, I'm running Ubuntu 10.04 and have 4 SATA disks in raid5 with mdadm.

I need to convert them back to RAID1.... is this possible?

Thanks a lot for your help,

Marco

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 December 2010, 02:46 UTC)

If you have a 4-disk RAID5 and you want to convert it to RAID1, you first need to reduce the size of the filesystem on it so it is only using the space of a single device. Then you would convert from a 4-disk RAID5 to a 2-disk RAID5 which requires at least 2.6.30 but I think needs 2.6.32 to be safe. Finally you would convert the 2-disk RAID5 to a 2-disk RAID1 which requires Linux 2.6.33 or later.
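
As a very rough sketch only (the md name, sizes and backup-file path are placeholders, and you would want a full backup and a filesystem that supports shrinking before attempting any of it):

# 1. shrink the filesystem so it fits in one disk's worth of space (example for ext3/4)
resize2fs /dev/md0 <size-that-fits-on-one-disk>

# 2. shrink the array itself, then reshape down to 2 devices
mdadm --grow /dev/md0 --array-size=<matching-size-in-K>
mdadm --grow /dev/md0 --raid-devices=2 --backup-file=/root/md0-backup

# 3. once the reshape completes (on 2.6.33 or later), switch level
mdadm --grow /dev/md0 --level=1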

I hope that answers your question.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 February 2011, 04:52 UTC)

Hi,

I am trying to convert from raid 5 to raid 6 and I am getting this error:

mdadm: Need to backup 1280K of critical section..
mdadm: Cannot set device size/shape for /dev/md0: Invalid argument


Some info that may be useful:

[root@borg ~]# uname -a
Linux borg.loc 2.6.27.25-78.2.56.fc9.x86_64 #1 SMP Thu Jun 18 12:24:37 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@borg ~]# mdadm -V
mdadm - v2.6.7.1 - 15th October 2008


From what I've read, my kernel should support this. I also checked the .config file to ensure raid5 resizing is enabled.

I previously was able to go from 3 drives to 5 with no issues but now that I'm trying to go raid6, no go. Any help would be appreciated, thanks!

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 February 2011, 04:56 UTC)

Converting between raid levels was introduced in 2.6.30, so your kernel is several versions too old.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (03 February 2011, 13:48 UTC)

Thanks, I was afraid of that. Guess I'll stick to raid 5 with hot spare.

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 February 2011, 19:46 UTC)

I just started a switch from raid6 to raid5, and it says it's going to take over 2 weeks to finish:

md0 : active raid6 sdc[0] sdh[5](S) sdg[4] sdf[3] sde[2] sdd[1]
      5860543488 blocks super 0.91 level 6, 64k chunk, algorithm 2 [5/5] [uuuuu]
      [>....................]  reshape =  0.0% (245760/1953514496) finish=21189.7min speed=1536K/sec

Is there any way to abort it?

Thanks!

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (12 February 2011, 20:50 UTC)

No, it is not possible to revert a reshape that has already started - sorry.

Two weeks does seem rather long... What device is your 'backup file' on? If it is a particularly slow device and you have a faster one available, you could stop the array, move the backup file to a better device, and re-assemble the array...


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 01:58 UTC)
Hi,

I need help!

I did something, that in hindsight, is incredibly stupid!

For some reason, not relevant here (it may be appropriate in a separate comment), I killed mdadm while it was reshaping a RAID-6 array from a chunk size of 64KB to 512KB. It was at least 20% through when I did so.

System appeared to function okay.

A few minutes later I rebooted.

Note that I had 3 RAID-6 arrays:

md0 swap

md1 my main user data

md2 /

I have 5 500GB hard disks, /dev/sd[abcde]. /dev/sda1 has /boot, not in a RAID array.

Machine has a quad core AMD 64 bit processor with 8GB RAM.

There are 2 logical links from md2 to md1, including /home.

I am using Fedora 14, up-to-date with patches as of Friday morning.

mdadm - v3.1.2 - 10th March 2010


The RAID array being reshaped was md1.

The reboot started (completed?) checking md2, and then gave up.

The mdadm conf file says that md1 has an old and new chunk size, but no indication (to my eyes) that there is a problem.

Attempting to redo the reshape command brings up an error message about not being able to add something.

The backup file is in /boot, and is the same size as when I checked it during the process.

This is my main development machine and the gateway to the internet - I have temporarily connected another machine to access the Internet.

I would appreciate advice, and will provide more details if you consider it relevant.

Thanks, Gavin Flower


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 04:06 UTC)

Hi, it is a lot easier to handle this sort of thing by Email - please email linux-raid@vger.kernel.org (you don't need to subscribe first).

Include all the info you can about the current state:

  • cat /proc/mdstat
  • mdadm -Evs
  • kernel log messages
  • if assembling the array doesn't work, try with "--verbose" and post the output.


[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 08:29 UTC)
sent email as requested

The file system is read only on the affected machine, fortunately I remembered the 'tee' command and had a USB stick.

So in case of others having similar problems...

I noticed messages about 'sdf' on the console, when I inserted the USB stick, so I ran

mount /dev/sdf /media/

and saved the output and saw it on the screen by doing things like

cat /proc/mdstat | tee /media/gcf-mdstat

note that 'tee' does not capture error messages in the file

-Gavin Flower

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 22:40 UTC)

1/ The mail to linux-raid hasn't arrived yet... The list rejects mail with HTML; that might be the problem. Try sending plain text.

2/ you can capture error output as well with "|&". e.g. mdadm -Av ... |& tee /mount/usb

NeilBrown

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 22:57 UTC)

My first email in HTML got bounced, no error message from my second attempt.

Does it reject email with attachments?

Do I need to explicitly include the text of files directly?

I will send you a series of emails, the first just to ensure you get my email address, waiting 10 minutes, then one file per email.


P.S. And I had thought you had been caught up in unimportant things, like: sleeping, family, working, other projects... :-)

[permalink][hide]

Re: Converting RAID5 to RAID6 and other shape changing in md/raid (13 February 2011, 23:57 UTC)


You should now have 6 emails from me.

-Gavin Flower

[permalink][hide]



