<?xml version="1.0" encoding="utf-8"?>
<feed version="0.3" xmlns="http://purl.org/atom/ns#">
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/Projects"/>

<title>Projects</title>
<modified>2012-07-24T06:07:22Z</modified>
<author></author>
<entry>
<title>Linux 3.5 on the GTA04</title>
<issued>2012-07-24T06:07:22Z</issued>
<modified>2012-07-24T06:07:22Z</modified>
<id>http://neil.brown.name/blog/20120724060722</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20120724060722"/>
<content type="text/html" mode="escaped">


&lt;p&gt;Linus released Linux-3.5 a couple of days ago so it is long past time to have 3.5 available for the OpenPhoenux GTA04.  By the time I publish this blog entry it will be, both on github at &lt;a href=&quot;https://github.com/neilbrown/linux/tree/gta04/3.5.y&quot;&gt;https://github.com/neilbrown/linux/tree/gta04/3.5.y&lt;/a&gt; and on my own server at &lt;a href=&quot;http://neil.brown.name/git?p=gta04;a=shortlog;h=refs/heads/3.5-gta04&quot;&gt;http://neil.brown.name/git?p=gta04;a=shortlog;h=refs/heads/3.5-gta04&lt;/a&gt;.

&lt;p&gt;Quite a lot has gone into 3.5-gta04.  A lot of that was bug fixes.  3.5 seems to have more changes that break my phone than other recent kernels, though I haven't counted bugs or hours so that could be misleading.  Certainly I have quite a lot of patches in my 'bugs' branch which is mainly for fixing regressions, but then 8 of those are all related: the libertas driver has a new async firmware loader which seems to be broken.  I haven't examined it closely, but wifi didn't work until I reverted those patches.

&lt;p&gt;On a happier note I've found time to make various improvements.

&lt;p&gt;&lt;ul&gt;&lt;li&gt;Wifi reset is now handled directly by the mmc driver, rather than having a pseudo regulator do it for me.  This is mostly just a code clean up, but it does mean that the voltage is now set correctly on the wifi regulator.  That doesn't seem to change performance significantly but I haven't looked closely.
&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;&lt;ul&gt;&lt;li&gt;The OMAP serial ports can now have a 'virtual DTR' which is a GPIO line that transitions up or down as the device is opened or closed.  This allows the next two points...
&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;&lt;ul&gt;&lt;li&gt;The bluetooth is automatically powered up when /dev/ttyO0 is opened, and powered down when it is closed.  This is much nicer (more automatic) than having an 'rfkill' device.  So running &lt;tt&gt;hciattach /dev/ttyO0 any&lt;/tt&gt; will power on the device, and &lt;tt&gt;killall hciattach&lt;/tt&gt; will turn it off.  Simple.  Power is turned off when the device suspends too so you don't need to
kill hciattach before suspend.
&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;&lt;ul&gt;&lt;li&gt;Rather more interestingly, the GPS is now turned &amp;quot;on&amp;quot; or &amp;quot;off&amp;quot; when /dev/ttyO1 is opened or closed. This is rather tricky as both changes require the same down/up pulse on a gpio and it is not trivial to know if the device is currently on or off.  But I have a driver that manages all of that - it watches the RX line for traffic whenever the GPS should be off and pulses the line again if needed.  This is particularly important at boot - if  the GPS is found to be on at boot it is immediately turned off.  After that its state should stay in sync with expectations.

&lt;p&gt;When turned 'off' the GPS is not actually off - it does maintain some
state and seems to use extra current though I'm not completely sure of
that yet.  I don't know of any way to turn the GPS off more firmly.

&lt;p&gt;The virtual-DTR line doesn't turn off power to the antenna - that still requires an 'rfkill'.
If the GPS is partly on when it is officially 'off', it might be useful to leave the antenna on so it can monitor satellites occasionally.  Until I know exactly what turning it off means I don't want to exclude the possibility of GPS off but antenna on.
For now I have created /etc/gpsd/device-hook to call 'rfkill' as appropriate, and made 'rfkill' setuid - rather horrible but it works for now.
&lt;/li&gt;&lt;/ul&gt;
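
&lt;p&gt;The on/off tracking described for the GPS can be sketched in a few lines of Python.  This is only a model of the logic, not the kernel driver: the class name, the method names and the &lt;tt&gt;pulse&lt;/tt&gt; callback are all made up for illustration, and it assumes a single pulse always toggles the chip.

```python
class GpsPower:
    """Track a GPS whose only control is a pulse that toggles on/off."""

    def __init__(self, pulse):
        self.pulse = pulse    # hypothetical callable: sends one down/up pulse
        self.want_on = False  # state requested by open/close of the tty

    def opened(self):
        # Assumes the chip really is off whenever we believe it is off.
        if not self.want_on:
            self.want_on = True
            self.pulse()

    def closed(self):
        if self.want_on:
            self.want_on = False
            self.pulse()

    def rx_activity(self):
        # Traffic on RX while the GPS should be off means the real state
        # has drifted (for example it was left on at boot): toggle again.
        if not self.want_on:
            self.pulse()
```

&lt;p&gt;Calling &lt;tt&gt;rx_activity()&lt;/tt&gt; once at boot, if the chip is chattering, is what brings the real state back in line with the expected one.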

&lt;p&gt;&lt;ul&gt;&lt;li&gt; The GPIO that is used to sense the state of the GPS antenna is now reported using the new 'extcon' (external connector) framework.
&lt;tt&gt;/sys/class/extcon/gps_antenna/state&lt;/tt&gt; will contain either 'internal' or 'external' as appropriate.
When there is a change a UEVENT is created so e.g. udev could be configured to do something interesting.
&lt;/li&gt;&lt;/ul&gt;
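
&lt;p&gt;Reading the antenna state from user space could look something like the following Python sketch.  The function name is hypothetical; the path and the two possible values are the ones given above.

```python
def antenna_state(path="/sys/class/extcon/gps_antenna/state"):
    """Return 'internal' or 'external' as reported by the extcon driver."""
    with open(path) as f:
        state = f.read().strip()
    if state not in ("internal", "external"):
        raise ValueError("unexpected antenna state: " + repr(state))
    return state
```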

&lt;p&gt;&lt;ul&gt;&lt;li&gt;The wakeup signal from the GSM/3G chip is now an input key rather than a bare 'gpio' line.  This makes it easier to integrate with the auto-suspend code.

&lt;p&gt;To get notifications of wakeups you need to open the right &lt;tt&gt;/dev/input/event&lt;/tt&gt; device and listen for key presses of KEY_UNKNOWN.  The &lt;tt&gt;GTA04/udev-rules/input.rules&lt;/tt&gt; file has the necessary udev magic to make the right file appear as &lt;tt&gt;/dev/input/incoming&lt;/tt&gt;.
&lt;/li&gt;&lt;/ul&gt;
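
&lt;p&gt;A minimal Python sketch of listening for those wakeups follows.  It assumes the udev rule has created /dev/input/incoming, that the kernel uses the classic input_event layout (a timeval followed by type, code and value), and that EV_KEY is 1 and KEY_UNKNOWN is 240 as in linux/input.h.  The function names are made up for illustration.

```python
import struct

# Layout of struct input_event with native sizes: timeval (two longs),
# then a 16-bit type, a 16-bit code and a 32-bit signed value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

EV_KEY = 1         # event type for key events (linux/input.h)
KEY_UNKNOWN = 240  # key code reported for the modem wakeup


def is_wakeup(raw):
    """Return True if one raw event is a KEY_UNKNOWN press."""
    _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return ev_type == EV_KEY and code == KEY_UNKNOWN and value == 1


def wait_for_wakeup(path="/dev/input/incoming"):
    """Block until the modem signals a wakeup, then return."""
    with open(path, "rb", buffering=0) as dev:
        while True:
            raw = dev.read(EVENT_SIZE)
            if raw is None or len(raw) != EVENT_SIZE:
                raise IOError("short read from " + path)
            if is_wakeup(raw):
                return
```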

&lt;p&gt;I think that is all.  Power management seems to be largely unchanged -
25mA when in suspend, about 70mA when idle, awake, screen off, and
about 200mA when idle and awake with screen on.
I'll be using this on my phone on a regular basis and
will see how it goes.

&lt;p&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20120724060722&gt;(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>OpenPhoenux Serial Console Access</title>
<issued>2012-07-08T10:31:50Z</issued>
<modified>2012-07-08T10:31:50Z</modified>
<id>http://neil.brown.name/blog/20120708103150</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20120708103150"/>
<content type="text/html" mode="escaped">


&lt;p&gt;I find that serial console access is a must for debugging kernel problems on the GTA04 board in my OpenPhoenux phone.  Getting serial console access to the board itself is quite easy as it has a little connector on the board, and came with a ribbon cable which plugs in to this connector and has a DB-9 on the other end.

&lt;p&gt;However it isn't so easy when the board is in the phone case, which is where it is most useful.  The connector is on the side of the board closest to the front of the phone, just above the display.  Cutting a hole there might be possible, but having a ribbon cable hanging out isn't really an option.  So I needed to be a little more clever.

&lt;p&gt;Firstly I destroyed the ribbon cable, keeping the brown, yellow, and blue wires and cutting the rest off at the plug.  I left about 2cm of length on the three wires I kept.  You can probably see some solder burns from an earlier experiment where I tried to solder 3 wires directly onto the pins at the back of the plug.  That worked a little bit, but they kept breaking off.  Maybe I'm not good at soldering, or maybe it was just a bad idea.

&lt;p&gt;As you can see in this picture, the wires are fed around the edge of the board to the back.  They are aiming for the little cavity beside the battery.

&lt;p&gt;&lt;img src=&quot;http://neil.brown.name/blog-files/201207/08103150/wires.JPG&quot;&gt;

&lt;p&gt;Then I cut away some plastic to make a hole big enough for a small cable with tape wrapped around it, shown here.

&lt;p&gt;&lt;img src=&quot;http://neil.brown.name/blog-files/201207/08103150/slot.JPG&quot;&gt;

&lt;p&gt;Next I found a nice small 3.5mm stereo jack on an old busted pair of headphones.  Not all jacks are small enough.  iPod headsets seem to have particularly slim jacks, but others probably work as well.  Soldering the 3 wires from this to the 3 wires from the ribbon cable means that I have TX, RX, GND in a reasonably accessible location.

&lt;p&gt;&lt;img src=&quot;http://neil.brown.name/blog-files/201207/08103150/jack.JPG&quot;&gt;

&lt;p&gt;You need to be careful about getting the length of the cable right so it is long enough without any excess.  Here you see it tucked away so the back cover can go over.  The tip of the jack goes into the slot a little bit and as you might be able to see, I have a little bit too much cable so it buckles a bit and uses more space than I would like.  I can clip the back on though, which is all that matters.

&lt;p&gt;&lt;img src=&quot;http://neil.brown.name/blog-files/201207/08103150/tuck.JPG&quot;&gt;

&lt;p&gt;Finally I bought a short cable with a female 3.5mm on one end (I think there were 2 RCA on the other), cut that up, and soldered the 3 wires to my DB-9.  And now I get console access by just popping the cover off (being careful not to let the battery fall out), easing the jack out, and plugging it in.  Very useful.



&lt;p&gt;&lt;br&gt;&lt;br&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20120708103150&gt;(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>Tedious bugs</title>
<issued>2012-07-06T12:29:34Z</issued>
<modified>2012-07-06T12:29:34Z</modified>
<id>http://neil.brown.name/blog/20120706122934</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20120706122934"/>
<content type="text/html" mode="escaped">


&lt;p&gt;&lt;a href=&quot;https://lwn.net/Articles/482345/&quot;&gt;Some bugs&lt;/a&gt; can be educational, but others are just tedious.  I guess one still learns, but there must be a better way.

&lt;p&gt;One of the many issues with my &lt;a href=&quot;http://www.openphoenux.org&quot;&gt;Open Phoenux&lt;/a&gt; &lt;a href=&quot;http://www.gta04.org&quot;&gt;GTA04&lt;/a&gt; is that it hangs occasionally.  Note that I'm not complaining about there being issues. I deliberately got this phone because I wanted to experiment and explore and if it all worked perfectly, where would the fun be?  But there are issues and some are fun to fix.  Others..

&lt;p&gt;So anyway, it hangs.  The symptom is that sometimes I ask it to wake up from suspend and it doesn't.  The console is completely silent (no sysrq even).  This seemed to correlate with someone trying to call or TXT me, and the message not getting through.  So my initial assumption was that it would hang on resume.  But only occasionally.

&lt;p&gt;So to find out more I installed a kernel with debugging enabled (this
is a 3.2 based kernel, which currently seems the most stable), connected a
serial console (via a 3.5mm phono jack that I managed to wire up) and
programmed the phone to set the RTC alarm every minute (fortunately I
got the RTC alarm working!) so it would suspend and resume every 60 seconds.  Then I let it go.

&lt;p&gt;Within about 30 minutes it would be very likely to hang.... providing the USB cable wasn't plugged in.  Even after I discovered this requirement I kept leaving the USB cable in after copying a new kernel across, and so wasted lots of testing time - I would leave it for 30 minutes with the cable in, and then realise that it wasn't going to hang like that.  Like I said: tedious.

&lt;p&gt;The first hang showed that it was just after &amp;quot;PM: early resume of
devices complete&amp;quot; so I added some more tracing and found that it was
in &lt;tt&gt;resume_device_irqs()&lt;/tt&gt;.  Then more and it was just as interrupt
92 - musb-hdrc - was enabled.  There was a pending interrupt for this
device and something got confused in there.  A few more cycles (about
an hour each!) found the bug for me.

&lt;p&gt;&lt;tt&gt;omap2430_musb_set_vbus&lt;/tt&gt; in omap2430.c contains:

&lt;p&gt;&lt;span class=&quot;mono&quot;&gt;
 			while (musb_readb(musb-&amp;gt;mregs, MUSB_DEVCTL) &amp;amp; 0x80) {
 
 				cpu_relax();
 
 				if (time_after(jiffies, timeout)) {
 					dev_err(musb-&amp;gt;controller,
 					&amp;quot;configured as A device timeout&amp;quot;);
 					ret = -EINVAL;
 					break;
 				}
 			}
 &lt;/span&gt;

&lt;p&gt;where 'timeout' is set one second in the future.  Now looping for up to one second in code that can be called from an interrupt handler (as this is called from musb_stage0_irq), is pretty poor form to start with, but it is worse than that.

&lt;p&gt;When &lt;tt&gt;resume_device_irqs()&lt;/tt&gt; calls &lt;tt&gt;__enable_irq()&lt;/tt&gt;, which checks for pending interrupts and calls the handler, it does this with all interrupts disabled (and having interrupts disabled in an interrupt handler is pretty common, so this should not be a problem).  With interrupts disabled, the timer doesn't tick, so jiffies doesn't ever change.  You can see where this is going, can't you?

&lt;p&gt;If that interrupt is pending on resume, we get to this code and don't just spin for 1 second, we spin forever - with interrupts disabled.  This exactly matches what I saw.
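
&lt;p&gt;The effect of a frozen clock is easy to reproduce outside the kernel.  What follows is a small Python simulation, not the driver code: &lt;tt&gt;clock&lt;/tt&gt; stands in for jiffies, &lt;tt&gt;read_busy&lt;/tt&gt; for the register test, and every name in it is made up for illustration.

```python
def wait_until_clear(read_busy, clock, timeout_ticks, max_spins=None):
    """Spin until read_busy() is false; return the outcome and spin count.

    clock() stands in for jiffies.  max_spins, if given, bounds the loop
    independently of the clock.
    """
    deadline = clock() + timeout_ticks
    spins = 0
    while read_busy():
        spins += 1
        if clock() > deadline:
            return "timeout", spins
        if max_spins is not None and spins >= max_spins:
            return "gave up", spins
    return "ok", spins
```

&lt;p&gt;With a clock that always returns the same value and no &lt;tt&gt;max_spins&lt;/tt&gt;, the call never returns, which is the hang.  Passing &lt;tt&gt;max_spins=1000&lt;/tt&gt; makes the same call give up after 1000 spins even though the clock never moves.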

&lt;p&gt;Working around this should be easy - just put in a maximum loop count ... I'll get to that in a minute.  But why is there an interrupt at all?

&lt;p&gt;Well the interrupt handler - &lt;tt&gt;generic_interrupt()&lt;/tt&gt; in musb_core.c - reads the interrupt status registers and the only bit that is set is MUSB_INTR_SESSRQ in int_usb.  That is, there is a request to start a new session.

&lt;p&gt;I *think* this means that the ID pin in the USB port was grounded.  A USB-OTG cable will do this to request that the USB-OTG port switch to host mode and start talking to a gadget.  This is what the code seems to be doing.  In the first instance it turns on VBUS (the 5V power supply for the USB).  So it all seems consistent.

&lt;p&gt;However I never plugged anything into the USB port, and definitely didn't ground the ID pin.  Remember that when I did have the USB port plugged in the problem didn't happen.  Maybe that is related.

&lt;p&gt;My guess is that some electrical noise either during suspend or resume triggers the interrupt.  Maybe there really should be a capacitor on that 'ID' pin - it wouldn't be the first time that a missing capacitor had caused problems for Openmoko devices.  I don't know how to test this, or whether to care if I can just make it work.

&lt;p&gt;But it seems I cannot, at least not yet.

&lt;p&gt;I imposed a 1000-loop limit on that loop and it got further, but not
much.  It took about 5 seconds (so I can make the max a lot smaller),
the interrupt handler gave up, and we moved on to resume all devices
(having completed early-resume).  It got up to the MMC drivers
(hsmmc), stopped for 180 seconds, then the soft-lockup timer triggered.  So now it is not hanging with interrupts disabled, so it might be easier to debug, but it is still hanging.

&lt;p&gt;The stack trace shows

&lt;p&gt;&lt;span class=&quot;mono&quot;&gt;
 
 [ 4758.457092] mmcqd/0         D c0449efc  5644    59      2 0x00000000
 [ 4758.463806] [&amp;lt;c0449efc&amp;gt;] (__schedule+0x57c/0x608) from [&amp;lt;c044a3e0&amp;gt;] (schedule_timeout+0x1c/0x1d0)
 [ 4758.473052] [&amp;lt;c044a3e0&amp;gt;] (schedule_timeout+0x1c/0x1d0) from [&amp;lt;c044985c&amp;gt;] (wait_for_common+0xd8/0x150)
 [ 4758.482666] [&amp;lt;c044985c&amp;gt;] (wait_for_common+0xd8/0x150) from [&amp;lt;c02f4408&amp;gt;] (mmc_wait_for_req_done+0x24/0xa4)
 [ 4758.492645] [&amp;lt;c02f4408&amp;gt;] (mmc_wait_for_req_done+0x24/0xa4) from [&amp;lt;c02f4bc0&amp;gt;] (mmc_start_req+0x50/0x144)
 [ 4758.502410] [&amp;lt;c02f4bc0&amp;gt;] (mmc_start_req+0x50/0x144) from [&amp;lt;c02fe460&amp;gt;] (mmc_blk_issue_rw_rq+0x78/0x4dc)
 [ 4758.512115] [&amp;lt;c02fe460&amp;gt;] (mmc_blk_issue_rw_rq+0x78/0x4dc) from [&amp;lt;c02fecc8&amp;gt;] (mmc_blk_issue_rq+0x404/0x434)
 [ 4758.522155] [&amp;lt;c02fecc8&amp;gt;] (mmc_blk_issue_rq+0x404/0x434) from [&amp;lt;c02ff824&amp;gt;] (mmc_queue_thread+0x98/0x100)
 [ 4758.531951] [&amp;lt;c02ff824&amp;gt;] (mmc_queue_thread+0x98/0x100) from [&amp;lt;c0055060&amp;gt;] (kthread+0x80/0x88)
 [ 4758.540771] [&amp;lt;c0055060&amp;gt;] (kthread+0x80/0x88) from [&amp;lt;c000f30c&amp;gt;] (kernel_thread_exit+0x0/0x8)
 
 &lt;/span&gt;

&lt;p&gt;which suggests that a block I/O request to the mmc is being retried
immediately on resume, and not getting anywhere.  Now it could be that
this is completely unrelated to the USB and I should run more tests,
but not today.

&lt;p&gt;The suspend/resume thread is:

&lt;p&gt;&lt;span class=&quot;mono&quot;&gt;
 
 [ 4758.669219] susman          D c0449efc  5060  1423   1420 0x00000000
 [ 4758.675872] [&amp;lt;c0449efc&amp;gt;] (__schedule+0x57c/0x608) from [&amp;lt;c02f36c8&amp;gt;] (__mmc_claim_host+0xb8/0x154)
 [ 4758.685119] [&amp;lt;c02f36c8&amp;gt;] (__mmc_claim_host+0xb8/0x154) from [&amp;lt;c02f8edc&amp;gt;] (mmc_sd_resume+0x34/0x5c)
 [ 4758.694458] [&amp;lt;c02f8edc&amp;gt;] (mmc_sd_resume+0x34/0x5c) from [&amp;lt;c02f2b20&amp;gt;] (mmc_resume_host+0xc8/0x15c)
 [ 4758.703704] [&amp;lt;c02f2b20&amp;gt;] (mmc_resume_host+0xc8/0x15c) from [&amp;lt;c0307874&amp;gt;] (omap_hsmmc_resume+0xa0/0xe4)
 [ 4758.713317] [&amp;lt;c0307874&amp;gt;] (omap_hsmmc_resume+0xa0/0xe4) from [&amp;lt;c024e760&amp;gt;] (platform_pm_resume+0x44/0x54)
 [ 4758.723114] [&amp;lt;c024e760&amp;gt;] (platform_pm_resume+0x44/0x54) from [&amp;lt;c0252bcc&amp;gt;] (pm_op+0x6c/0xb8)
 [ 4758.731811] [&amp;lt;c0252bcc&amp;gt;] (pm_op+0x6c/0xb8) from [&amp;lt;c0253778&amp;gt;] (device_resume+0x190/0x228)
 [ 4758.740234] [&amp;lt;c0253778&amp;gt;] (device_resume+0x190/0x228) from [&amp;lt;c025391c&amp;gt;] (dpm_resume+0x10c/0x244)
 [ 4758.749298] [&amp;lt;c025391c&amp;gt;] (dpm_resume+0x10c/0x244) from [&amp;lt;c0253a60&amp;gt;] (dpm_resume_end+0xc/0x18)
 [ 4758.758178] [&amp;lt;c0253a60&amp;gt;] (dpm_resume_end+0xc/0x18) from [&amp;lt;c00768cc&amp;gt;] (suspend_devices_and_enter+0x1d0/0x22c)
 [ 4758.768432] [&amp;lt;c00768cc&amp;gt;] (suspend_devices_and_enter+0x1d0/0x22c) from [&amp;lt;c0076a54&amp;gt;] (enter_state+0x12c/0x18c)
 [ 4758.778656] [&amp;lt;c0076a54&amp;gt;] (enter_state+0x12c/0x18c) from [&amp;lt;c0075884&amp;gt;] (state_store+0x94/0x118)
 [ 4758.787536] [&amp;lt;c0075884&amp;gt;] (state_store+0x94/0x118) from [&amp;lt;c01d4d8c&amp;gt;] (kobj_attr_store+0x1c/0x24)
 [ 4758.796600] [&amp;lt;c01d4d8c&amp;gt;] (kobj_attr_store+0x1c/0x24) from [&amp;lt;c01172e4&amp;gt;] (sysfs_write_file+0x108/0x13c)
 [ 4758.806213] [&amp;lt;c01172e4&amp;gt;] (sysfs_write_file+0x108/0x13c) from [&amp;lt;c00bfc5c&amp;gt;] (vfs_write+0xac/0x180)
 [ 4758.815368] [&amp;lt;c00bfc5c&amp;gt;] (vfs_write+0xac/0x180) from [&amp;lt;c00bfde8&amp;gt;] (sys_write+0x40/0x6c)
 [ 4758.823699] [&amp;lt;c00bfde8&amp;gt;] (sys_write+0x40/0x6c) from [&amp;lt;c000e9c0&amp;gt;] (ret_fast_syscall+0x0/0x3c)
 
 &lt;/span&gt;

&lt;p&gt;So it looks like it is waiting for the mmc device, which itself is
hanging.  My feeling here is that this is a very different problem.
The block queue should block everything on suspend so no requests
should be pending at this point.  I guess I'm going to have to look
into that some more.

&lt;p&gt;So I've probably learned a bit, but really not much.  Mostly just some
very silly code in an interrupt handler, which took hours to find.
Hopefully examining the block-device issue will be more fun.


&lt;p&gt;&lt;br&gt;UPDATE:

&lt;p&gt;The MMC bug was fairly easy to find - the 'suspend' callback was
simply not being called.  This was easily fixed by applying
commit 32d317c60e56c2a34463b51fc0336cc96b3e1735 from Linux-3.4.

&lt;p&gt;So now my GTA04 doesn't crash on resume any more.  Hurray.


&lt;p&gt;&lt;br&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20120706122934&gt;(1 comment)&lt;/a&gt;</content>
</entry>
<entry>
<title>A typical week in RAID-land</title>
<issued>2012-06-29T06:53:02Z</issued>
<modified>2012-06-29T06:53:02Z</modified>
<id>http://neil.brown.name/blog/20120629065302</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20120629065302"/>
<content type="text/html" mode="escaped">

&lt;p&gt;I should probably write more.  I enjoy writing but don't do enough of it.  I probably have Elizabeth Bennet's condition.  She says to Darcy on the dance floor:

&lt;p&gt;&lt;div class=indent&gt;We are each of an unsocial,
taciturn disposition, unwilling to speak, unless we expect to say
something that will amaze the whole room, and be handed down
to posterity with all the eclat of a proverb.
&lt;/div&gt;

&lt;p&gt;I sometimes feel unwilling to write unless I'll say something to amaze the whole Internet.  But I think I should try to push through that.

&lt;p&gt;So what has happened this week?  I doubt it is really a typical week as no week is really typical - I imagine them all outside my window screaming in chorus &amp;quot;We are all individuals&amp;quot;. But I can't really know unless I record it, then compare it with future weeks.

&lt;p&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20120629065302&gt;read more...(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>A Nasty md/raid bug</title>
<issued>2012-06-15T07:32:45Z</issued>
<modified>2012-06-15T07:32:45Z</modified>
<id>http://neil.brown.name/blog/20120615073245</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20120615073245"/>
<content type="text/html" mode="escaped">
There is a rather nasty RAID bug in some released versions of the
Linux kernel.  It won't destroy your data, but it could make it hard
to access that data.

&lt;p&gt;If you are concerned that this might affect you, the first thing you
should do (after not panicking) is to gather the output of

&lt;p&gt;&lt;span class=&quot;mono&quot;&gt;
    mdadm -Evvvvs
 &lt;/span&gt;

&lt;p&gt;and save this somewhere that is not on a RAID array.  The second thing
to do is read to the end of this note and then proceed accordingly.
You most likely will never need to use the output of that command, but if you
do it could be extremely helpful.
&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20120615073245&gt;read more...(20 comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>Closing the RAID5 write hole</title>
<issued>2011-06-14T10:17:08Z</issued>
<modified>2011-06-14T10:17:08Z</modified>
<id>http://neil.brown.name/blog/20110614101708</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110614101708"/>
<content type="text/html" mode="escaped">
Over a year ago I wrote some thoughts about closing the RAID5 write hole
in an answer to a comment on a blog post:

&lt;p&gt;&lt;a href=&quot;http://neil.brown.name/blog/20090129234603&quot;&gt;http://neil.brown.name/blog/20090129234603&lt;/a&gt; and 
&lt;a href=&quot;http://neil.brown.name/blog/20090129234603-028&quot;&gt;http://neil.brown.name/blog/20090129234603-028&lt;/a&gt;.

&lt;p&gt;I recently had some interest shown in this so I thought it might be
useful to write up some thoughts more coherently and completely.
&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110614101708&gt;read more...(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>Another mdadm release: 3.2.1</title>
<issued>2011-03-28T02:54:07Z</issued>
<modified>2011-03-28T02:54:07Z</modified>
<id>http://neil.brown.name/blog/20110328025407</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110328025407"/>
<content type="text/html" mode="escaped">


&lt;p&gt;Hot on the heels of mdadm-3.1.5 I have just released 3.2.1.

&lt;p&gt;The 3.2 series contains two particular sets of new functionality.

&lt;p&gt;Firstly there is the &amp;quot;policy&amp;quot; framework.  This allows us to set policy for different devices based on where they are connected (e.g. which controller) so that, for example, when a device is hot-plugged it can immediately be made a hot-spare for an array without further operator intervention.  It also allows broader control of spare-migration between arrays.  It is likely that more functionality will be added to this framework over time.

&lt;p&gt;Secondly, the support for Intel Matrix Storage Manager (IMSM) arrays has been substantially enhanced.  Spare migration is now possible as is level migration and OLCE (OnLine Capacity Expansion).  This support is not quite complete yet and requires MDADM_EXPERIMENTAL=1 in the environment to ensure people only use it with care.  In particular if you start a reshape in Linux and then shut down and boot into Windows, the Windows driver may not correctly restart the reshape.  And vice-versa.

&lt;p&gt;If you don't want any of the new functionality then it is probably safest to stay with 3.1.5 as it has all recent bug fixes.  But if you are at all interested in the new functionality, then by all means give 3.2.1 a try.  It should work fine and is no more likely to eat your data than any other program out there.

&lt;p&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110328025407&gt;(14 comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>Release of mdadm-3.1.5</title>
<issued>2011-03-23T04:59:10Z</issued>
<modified>2011-03-23T04:59:10Z</modified>
<id>http://neil.brown.name/blog/20110323045910</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110323045910"/>
<content type="text/html" mode="escaped">


&lt;p&gt;The last release of mdadm that I mentioned in this blog was 2.6.1.  As I am now announcing 3.1.5 you can see that I missed a few.  That's OK though as I keep the release announcements in the source distribution so you can always go and read them there.

&lt;p&gt;3.1.5 is just bugfixes.  It is essentially 3.1.4 plus all the bug fixes found while working on 3.2 and 3.2.1.  The list from the release announcement is:

&lt;p&gt;&lt;ul&gt;&lt;li&gt;Fixes for v1.x metadata on big-endian machines.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;man page improvements&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Improve '--detail --export' when run on partitions of an md array.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Fix regression with removing 'failed' or 'detached' devices.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Fixes for &amp;quot;--assemble --force&amp;quot; in various unusual cases.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Allow '-Y' to mean --export.  This was documented but not implemented.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Various fixes for handling 'ddf' metadata.  This is now more reliable
    but could benefit from more interoperability testing.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Correctly list subarrays of a container in &amp;quot;--detail&amp;quot; output.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Improve checks on whether the requested number of devices is supported
    by the metadata - both for --create and --grow.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Don't remove partitions from a device that is being included in an
    array until we are fully committed to including it.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Allow &amp;quot;--assemble --update=no-bitmap&amp;quot; so an array with a corrupt
    bitmap can still be assembled.&lt;/li&gt;&lt;/ul&gt;
&lt;ul&gt;&lt;li&gt;Don't allow --add to succeed if it looks like a &amp;quot;--re-add&amp;quot; is probably
    wanted, but cannot succeed.  This avoids inadvertently turning
    devices into spares when an array is failed.&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;As you can see - lots of little bits and pieces.

&lt;p&gt;I hope to release 3.2.1 soon.  For people who want to use the Intel metadata format (Intel Matrix Storage Manager - IMSM) on Intel motherboards which have BIOS support and MS-Windows support, you should probably wait for 3.2.1.  For anyone else, 3.1.5 is what you want.

&lt;p&gt;3.2.1 should be released soonish.  I probably won't even start on 3.2.2 for a couple of months, though I already have a number of thoughts about what I want to include.  A lot of it will be cleaning up and re-organising the code:  stuff I wanted to do for 3.2 but ran out of time.

&lt;p&gt;As always, mdadm can be found via git at &lt;a href=&quot;git://neil.brown.name/mdadm/&quot;&gt;git://neil.brown.name/mdadm/&lt;/a&gt; or from
&lt;a href=&quot;http://www.kernel.org/pub/linux/utils/raid/mdadm/&quot;&gt;http://www.kernel.org/pub/linux/utils/raid/mdadm/&lt;/a&gt;.

&lt;p&gt;&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110323045910&gt;(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>log segments and RAID6 reshaping</title>
<issued>2011-03-08T07:47:33Z</issued>
<modified>2011-03-08T07:47:33Z</modified>
<id>http://neil.brown.name/blog/20110308074733</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110308074733"/>
<content type="text/html" mode="escaped">

&lt;p&gt;Part of the design approach of LaFS - and any other log structured filesystem - is to divide the device space into relatively large segments.  Each segment is many megabytes in size so the time to write a whole segment is much more than the time to seek to a new segment.   Writes happen sequentially through a segment, so write throughput should be as high as the device can manage.

&lt;p&gt;(Obviously there needs to be a way to find or create segments with no live data so they can be written to.  This is called cleaning and will not be discussed further here.)

&lt;p&gt;One of the innovations of LaFS is to allow segments to be aligned with the stripes in a RAID5 or RAID6 array so that each segment is a whole number of stripes and so that LaFS knows the details of the layout including chunk size and width (number of data devices).

&lt;p&gt;This allows LaFS to always write in whole 'strips' - where a 'strip' is one block from each device chosen such that they all contribute to the one parity block.  Blocks in a strip may not be contiguous (they only are if the chunksize matches the block size), so one would not normally write a single strip.  However doing so is the most efficient way to write to RAID6 as no pre-reading is needed.  So as LaFS knows the precise geometry and is free with how it chooses where to write, it can easily write just a strip if needed.  It can also pad out the write with blocks of NULs to make sure a whole strip is written each time.
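
&lt;p&gt;To make the geometry concrete, here is a small Python sketch of which logical blocks make up one strip.  It is an illustration only, not LaFS code: the function and parameter names are invented, and it assumes data is laid out chunk by chunk across the data devices, so it ignores where the parity blocks physically live.

```python
def strip_blocks(stripe, strip, chunk_blocks, data_devices):
    """Logical block addresses making up one strip of one stripe.

    chunk_blocks is the chunk size in filesystem blocks and data_devices
    is the width (number of data devices).
    """
    stripe_start = stripe * chunk_blocks * data_devices
    return [stripe_start + dev * chunk_blocks + strip
            for dev in range(data_devices)]
```

&lt;p&gt;With 16-block chunks and 4 data devices, &lt;tt&gt;strip_blocks(0, 0, 16, 4)&lt;/tt&gt; gives blocks 0, 16, 32 and 48, which are far from contiguous.  Only with a chunk size of one block does it give 0, 1, 2 and 3.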

&lt;p&gt;Normally one would hope that several strips would be written at once, hopefully a whole stripe or more, but it is very valuable to be able to write whole strips at a time.

&lt;p&gt;This is lovely in theory but in practice there is a problem.  People like to make their RAID6 arrays bigger, often by adding one or two devices to the array and &amp;quot;restriping&amp;quot; or &amp;quot;reshaping&amp;quot; the array.
When you do this the geometry changes significantly and the alignment of strips and stripes and segments will be quite different.  Suddenly the efficient IO practice of LaFS becomes very inefficient.

&lt;p&gt;There are two ways to address this, one which I have had in mind since the beginning, one which only occurred to me recently.
&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110308074733&gt;read more...(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>Off-the-road-map: Data checksums</title>
<issued>2011-02-27T11:42:01Z</issued>
<modified>2011-02-27T11:42:01Z</modified>
<id>http://neil.brown.name/blog/20110227114201</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110227114201"/>
<content type="text/html" mode="escaped">

&lt;p&gt;Among the responses I received to my recent post of a development road-map for md/raid were some suggestions for features that I believe are wrong and should not be implemented.  So rather than being simple omissions, they are deliberate exclusions.  One of these suggestions is the idea of calculating, storing, and checking a checksum of each data block.

&lt;p&gt;Checksums are in general a good idea.  Whether it is a simple parity bit, an ECC, a CRC or a full cryptographic hash, a checksum can help detect single bit and some multi-bit errors and stop those errors propagating further into a system.  It is generally better to know that you have lost some data rather than believe that some wrong data is actually good, and checksums allow you to do that.

&lt;p&gt;So I am in favour of checksums in general, but I don't think it is appropriate to sprinkle them around everywhere and in particular I don't think that it is the role of md to manage checksums for all data blocks.

&lt;p&gt;To make this belief more concrete, I see that there are two classes of places where checksums are important.  I call these &amp;quot;link checksums&amp;quot; and &amp;quot;end-to-end checksums&amp;quot;.
&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110227114201&gt;read more...(No comments)&lt;/a&gt;</content>
</entry>
<entry>
<title>MD/RAID road-map 2011</title>
<issued>2011-02-16T04:40:02Z</issued>
<modified>2011-02-16T04:40:02Z</modified>
<id>http://neil.brown.name/blog/20110216044002</id>
<link rel="alternate" type="text/html" href="http://neil.brown.name/blog/20110216044002"/>
<content type="text/html" mode="escaped">

&lt;p&gt;It is about 2 years since I last published a
&lt;a href=&quot;http://neil.brown.name/blog/20090129234603&quot;&gt;road-map&lt;/a&gt;
for md/raid
so I thought it was time for another one.  Unfortunately quite a few
things on the previous list remain undone, but there has been some
progress.

&lt;p&gt;I think one of the problems with some to-do lists is that they aren't
detailed enough.  High-level design, low level design, implementation,
and testing are all very different sorts of tasks that seem to require
different styles of thinking and so are best done separately.  As
writing up a road-map is a high-level design task it makes sense to do
the full high-level design at that point so that the tasks are
detailed enough to be addressed individually with little reference to
the other tasks in the list (except what is explicit in the road map).

&lt;p&gt;A particular need I am finding for this road map is to make explicit
the required ordering and interdependence of certain tasks.  Hopefully
that will make it easier to address them in an appropriate order, and
mean that I waste less time saying &amp;quot;this is too hard, I might go read
some email instead&amp;quot;.

&lt;p&gt;So the following is a detailed road-map for md raid for the coming
months.
&lt;p&gt;&lt;a href=http://neil.brown.name/blog/20110216044002&gt;read more...(10 comments)&lt;/a&gt;</content>
</entry>

</feed>
