
24 July 2012, 06:07 UTC Linux 3.5 on the GTA04

Linus released Linux-3.5 a couple of days ago so it is long past time to have 3.5 available for the OpenPhoenux GTA04. By the time I publish this blog entry it will be, both on github at and on my own server at;a=shortlog;h=refs/heads/3.5-gta04.

Quite a lot has gone into 3.5-gta04. A lot of that was bug fixes. 3.5 seems to have more changes that break my phone than other recent kernels, though I haven't counted bugs or hours so that could be misleading. Certainly I have quite a lot of patches in my 'bugs' branch, which is mainly for fixing regressions, but then 8 of those are all related: the libertas driver has a new async firmware loader which seems to be broken. I haven't examined it closely, but wifi didn't work until I reverted those patches.

On a happier note I've found time to make various improvements.

I think that is all. Power management seems to be largely unchanged - 25mA when in suspend, about 70mA when idle, awake, screen off, and about 200mA when idle and awake with screen on. I'll be using this on my phone on a regular basis and see how it goes.
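To put those currents in perspective, here is a rough runtime estimate. The 1200 mAh capacity is purely an assumption for illustration (no capacity is stated here), and the calculation ignores battery ageing and conversion losses.

```python
# Rough battery-life estimate from the average current draw.
# ASSUMPTION: the capacity passed in is illustrative, not a measured figure.

# Currents quoted in the text, in mA.
DRAW_MA = {"suspend": 25, "idle, screen off": 70, "idle, screen on": 200}


def hours_of_runtime(capacity_mah, current_ma):
    """Ideal runtime in hours for a constant current draw."""
    if current_ma <= 0:
        raise ValueError("current must be positive")
    return capacity_mah / current_ma


def runtime_table(capacity_mah):
    """Runtime in hours for each state listed in DRAW_MA."""
    return {state: hours_of_runtime(capacity_mah, ma)
            for state, ma in DRAW_MA.items()}


if __name__ == "__main__":
    for state, hours in runtime_table(1200).items():  # 1200 mAh is assumed
        print(f"{state:>16}: {hours:5.1f} h")
```

With the assumed capacity, suspend works out to about two days, which is why the suspend current matters most for everyday use.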


08 July 2012, 10:31 UTC OpenPhoenux Serial Console Access

I find that serial console access is a must for debugging kernel problems on the GTA04 board in my OpenPhoenux phone. Getting serial console access to the board itself is quite easy as it has a little connector on the board, and came with a ribbon cable which plugs in to this connector and has a DB-9 on the other end.

However it isn't so easy when the board is in the phone case, which is where it is most useful. The connector is on the side of the board closest to the front of the phone, just above the display. Cutting a hole there might be possible, but having a ribbon cable hanging out isn't really an option. So I needed to be a little more clever.

Firstly I destroyed the ribbon cable, keeping the brown, yellow, and blue wires and cutting the rest off at the plug. I kept about 2cm of length on the three remaining wires. You can probably see some solder burns from an earlier experiment where I tried to solder 3 wires directly onto the pins at the back of the plug. That worked a little bit, but they kept breaking off. Maybe I'm not good at soldering, or maybe it was just a bad idea.

As you can see in this picture, the wires are fed around the edge of the board to the back. They are aiming for the little cavity beside the battery.

Then I cut away some plastic to make a hole big enough for a small cable with tape wrapped around it, shown here.

Next I found a nice small 3.5mm stereo jack off an old busted pair of headphones. Not all jacks are small enough. iPod headsets seem to have particularly slim jacks, but others probably work as well. Soldering the 3 wires from this to the 3 wires from the ribbon cable means that I have TX, RX, GND in a reasonably accessible location.

You need to be careful about getting the length of the cable right so it is long enough without any excess. Here you see it tucked away so the back cover can go over. The tip of the jack goes into the slot a little bit and, as you might be able to see, I have a little bit too much cable so it buckles a bit and uses more space than I would like. I can clip the back on though, which is all that matters.

Finally I bought a short cable with a female 3.5mm on one end (I think there were 2 RCA on the other), cut that up, and soldered the 3 wires to my DB-9. And now I get console access by just popping the cover off (being careful not to let the battery fall out), easing the jack out, and plugging it in. Very useful.


06 July 2012, 12:29 UTC Tedious bugs

Some bugs can be educational, but others are just tedious. I guess one still learns, but there must be a better way.

One of the many issues with my OpenPhoenux GTA04 is that it hangs occasionally. Note that I'm not complaining about there being issues. I deliberately got this phone because I wanted to experiment and explore, and if it all worked perfectly, where would the fun be? But there are issues and some are fun to fix. Others...

So anyway, it hangs. The symptom is that sometimes I ask it to wake up from suspend and it doesn't. The console is completely silent (no sysrq even). This seemed to correlate with someone trying to call or TXT me, and the message not getting through. So my initial assumption was that it would hang on resume. But only occasionally.

So to find out more I installed a kernel with debugging enabled (this is a 3.2-based kernel, which currently seems most stable), connected a serial console (via a 3.5mm phono jack that I managed to wire up) and programmed the phone to set the RTC alarm every minute (fortunately I got the RTC alarm working!) so it would suspend and resume every 60 seconds. Then I let it go.
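A suspend/resume cycle of this kind can be sketched with the standard Linux sysfs interfaces. This is only an illustration of the idea, not the script actually used: the rtc0 device name, the settle delay and the function names are assumptions, and it has to run as root on the device.

```python
import time
from pathlib import Path

# Standard sysfs locations; "rtc0" is an assumption about the RTC device name.
RTC_DIR = Path("/sys/class/rtc/rtc0")
POWER_STATE = Path("/sys/power/state")


def arm_and_suspend(seconds, rtc_dir=RTC_DIR, power_state=POWER_STATE):
    """Arm the RTC alarm `seconds` ahead of RTC time, then suspend to RAM.

    Returns the alarm time (seconds since the epoch, RTC clock).
    """
    now = int((rtc_dir / "since_epoch").read_text())
    wake_at = now + seconds
    wakealarm = rtc_dir / "wakealarm"
    wakealarm.write_text("0\n")            # clear any alarm still pending
    wakealarm.write_text(f"{wake_at}\n")   # absolute wake-up time
    power_state.write_text("mem\n")        # on real hardware: returns after resume
    return wake_at


def cycle_forever(seconds=60, settle=5):
    """Suspend and resume repeatedly until the machine hangs or is stopped."""
    while True:
        arm_and_suspend(seconds)
        time.sleep(settle)  # stay awake briefly so console output can drain
```

Calling cycle_forever() with the serial console attached and the USB cable unplugged would reproduce the test loop described above.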

Within about 30 minutes it would be very likely to hang.... providing the USB cable wasn't plugged in. Even after I discovered this requirement I kept leaving the USB cable in after copying a new kernel across, and so wasted lots of testing time - I would leave it for 30 minutes with the cable in, and then realise that it wasn't going to hang like that. Like I said: tedious.

The first hang showed that it was just after "PM: early resume of devices complete" so I added some more tracing and found that it was in resume_device_irqs(). Then more and it was just as interrupt 92 - musb-hdrc - was enabled. There was a pending interrupt for this device and something got confused in there. A few more cycles (about an hour each!) found the bug for me.

omap2430_musb_set_vbus in omap2430.c contains:

    while (musb_readb(musb->mregs, MUSB_DEVCTL) & 0x80) {
        cpu_relax();
        if (time_after(jiffies, timeout)) {
            dev_err(musb->controller,
                "configured as A device timeout");
            ret = -EINVAL;
            break;
        }
    }

where 'timeout' is set one second in the future. Now looping for up to one second in code that can be called from an interrupt handler (as this is called from musb_stage0_irq), is pretty poor form to start with, but it is worse than that.

When resume_device_irqs() calls __enable_irq(), which checks for pending interrupts and calls the handler, it does this with all interrupts disabled (and having interrupts disabled in an interrupt handler is pretty common, so this should not be a problem). With interrupts disabled, the timer doesn't tick, so jiffies doesn't ever change. You can see where this is going, can't you?

If that interrupt is pending on resume, we get to this code and don't just spin for 1 second, we spin forever - with interrupts disabled. This exactly matches what I saw.
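The failure mode can be reproduced in a small user-space simulation. Everything here is a hypothetical stand-in for the kernel code (the Clock class, the function name and the give_up_after guard are inventions for the sketch); it only models a time-based timeout whose clock stops advancing.

```python
# Simulation only: a jiffies-style timeout when the timer tick is stopped.

class Clock:
    """A jiffies counter that advances only while interrupts are enabled."""

    def __init__(self, irqs_enabled=True):
        self.jiffies = 0
        self.irqs_enabled = irqs_enabled

    def tick(self):
        if self.irqs_enabled:
            self.jiffies += 1


def wait_for_bit_clear(clock, bit_is_set, timeout_jiffies, give_up_after=100_000):
    """Spin until the bit clears or the jiffies deadline passes.

    Returns "cleared", "timeout", or "hung" when the loop would never exit.
    give_up_after exists only so that the simulation itself terminates.
    """
    deadline = clock.jiffies + timeout_jiffies
    for _ in range(give_up_after):
        if not bit_is_set():
            return "cleared"
        clock.tick()                    # stands in for time passing
        if clock.jiffies > deadline:    # like time_after(jiffies, timeout)
            return "timeout"
    return "hung"


def stuck():
    """A device that never clears the bit."""
    return True


print(wait_for_bit_clear(Clock(irqs_enabled=True), stuck, 100))   # timeout
print(wait_for_bit_clear(Clock(irqs_enabled=False), stuck, 100))  # hung
```

With the tick running the loop gives up at the deadline; with the tick stopped the deadline is never reached, which is the hang described above.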

Working around this should be easy - just put in a maximum loop count ... I'll get to that in a minute. But why is there an interrupt at all?

Well, the interrupt handler - generic_interrupt() in musb_core.c - reads the interrupt status registers and the only bit that is set is MUSB_INTR_SESSRQ in int_usb, i.e. there is a request to start a new session.

I *think* this means that the ID pin in the USB port was grounded. A USB-OTG cable will do this to request that the USB-OTG port switch to host mode and start talking to a gadget. This is what the code seems to be doing. In the first instance it turns on VBUS (the 5V power supply for the USB). So it all seems consistent.

However I never plugged anything into the USB port, and definitely didn't ground the ID pin. Remember that when I did have the USB port plugged in the problem didn't happen. Maybe that is related.

My guess is that some electrical noise either during suspend or resume triggers the interrupt. Maybe there really should be a capacitor on that 'ID' pin - it wouldn't be the first time that a missing capacitor had caused problems for Openmoko devices. I don't know how to test this, or whether to care if I can just make it work.

But it seems I cannot, at least not yet.

I imposed a 1000-loop limit on that loop and it got further, but not much. It took about 5 seconds (so I can make the max a lot smaller), the interrupt handler gave up, and we moved on to resume all devices (having completed early-resume). It got up to the MMC drivers (hsmmc), stopped for 180 seconds, then the soft-lockup timer triggered. So now it is not hanging with interrupts disabled, which might make it easier to debug, but it is still hanging.
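The arithmetic behind "a lot smaller" is simple: 1000 iterations in about 5 seconds is roughly 5 ms per iteration. The sketch below works out a loop cap for a chosen time budget and shows a count-bounded poll that does not depend on a clock. The 100 ms budget and all function names are assumptions for illustration.

```python
import math


def per_iteration_ms(total_seconds, loops):
    """Average cost of one loop iteration in milliseconds."""
    return total_seconds * 1000.0 / loops


def loop_cap_for(budget_ms, cost_ms):
    """Largest iteration count whose total time stays within budget_ms."""
    return max(1, math.floor(budget_ms / cost_ms))


def bounded_wait(bit_is_set, max_loops):
    """Poll at most max_loops times; True if the bit cleared in time."""
    for _ in range(max_loops):
        if not bit_is_set():
            return True
    return False


cost = per_iteration_ms(5, 1000)   # figures from the text: 5 s for 1000 loops
cap = loop_cap_for(100, cost)      # 100 ms budget is an assumed target
print(cost, cap)
```

Because the bound is a count rather than a deadline, it still terminates when jiffies is frozen.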

The stack trace shows

[ 4758.457092] mmcqd/0 D c0449efc 5644 59 2 0x00000000
[ 4758.463806] [<c0449efc>] (__schedule+0x57c/0x608) from [<c044a3e0>] (schedule_timeout+0x1c/0x1d0)
[ 4758.473052] [<c044a3e0>] (schedule_timeout+0x1c/0x1d0) from [<c044985c>] (wait_for_common+0xd8/0x150)
[ 4758.482666] [<c044985c>] (wait_for_common+0xd8/0x150) from [<c02f4408>] (mmc_wait_for_req_done+0x24/0xa4)
[ 4758.492645] [<c02f4408>] (mmc_wait_for_req_done+0x24/0xa4) from [<c02f4bc0>] (mmc_start_req+0x50/0x144)
[ 4758.502410] [<c02f4bc0>] (mmc_start_req+0x50/0x144) from [<c02fe460>] (mmc_blk_issue_rw_rq+0x78/0x4dc)
[ 4758.512115] [<c02fe460>] (mmc_blk_issue_rw_rq+0x78/0x4dc) from [<c02fecc8>] (mmc_blk_issue_rq+0x404/0x434)
[ 4758.522155] [<c02fecc8>] (mmc_blk_issue_rq+0x404/0x434) from [<c02ff824>] (mmc_queue_thread+0x98/0x100)
[ 4758.531951] [<c02ff824>] (mmc_queue_thread+0x98/0x100) from [<c0055060>] (kthread+0x80/0x88)
[ 4758.540771] [<c0055060>] (kthread+0x80/0x88) from [<c000f30c>] (kernel_thread_exit+0x0/0x8)

which suggests that a block io request to the mmc is being retried immediately on resume, and not getting anywhere. Now it could be that this is completely unrelated to the USB and I should run more tests, but not today.

The suspend/resume thread is:

[ 4758.669219] susman D c0449efc 5060 1423 1420 0x00000000
[ 4758.675872] [<c0449efc>] (__schedule+0x57c/0x608) from [<c02f36c8>] (__mmc_claim_host+0xb8/0x154)
[ 4758.685119] [<c02f36c8>] (__mmc_claim_host+0xb8/0x154) from [<c02f8edc>] (mmc_sd_resume+0x34/0x5c)
[ 4758.694458] [<c02f8edc>] (mmc_sd_resume+0x34/0x5c) from [<c02f2b20>] (mmc_resume_host+0xc8/0x15c)
[ 4758.703704] [<c02f2b20>] (mmc_resume_host+0xc8/0x15c) from [<c0307874>] (omap_hsmmc_resume+0xa0/0xe4)
[ 4758.713317] [<c0307874>] (omap_hsmmc_resume+0xa0/0xe4) from [<c024e760>] (platform_pm_resume+0x44/0x54)
[ 4758.723114] [<c024e760>] (platform_pm_resume+0x44/0x54) from [<c0252bcc>] (pm_op+0x6c/0xb8)
[ 4758.731811] [<c0252bcc>] (pm_op+0x6c/0xb8) from [<c0253778>] (device_resume+0x190/0x228)
[ 4758.740234] [<c0253778>] (device_resume+0x190/0x228) from [<c025391c>] (dpm_resume+0x10c/0x244)
[ 4758.749298] [<c025391c>] (dpm_resume+0x10c/0x244) from [<c0253a60>] (dpm_resume_end+0xc/0x18)
[ 4758.758178] [<c0253a60>] (dpm_resume_end+0xc/0x18) from [<c00768cc>] (suspend_devices_and_enter+0x1d0/0x22c)
[ 4758.768432] [<c00768cc>] (suspend_devices_and_enter+0x1d0/0x22c) from [<c0076a54>] (enter_state+0x12c/0x18c)
[ 4758.778656] [<c0076a54>] (enter_state+0x12c/0x18c) from [<c0075884>] (state_store+0x94/0x118)
[ 4758.787536] [<c0075884>] (state_store+0x94/0x118) from [<c01d4d8c>] (kobj_attr_store+0x1c/0x24)
[ 4758.796600] [<c01d4d8c>] (kobj_attr_store+0x1c/0x24) from [<c01172e4>] (sysfs_write_file+0x108/0x13c)
[ 4758.806213] [<c01172e4>] (sysfs_write_file+0x108/0x13c) from [<c00bfc5c>] (vfs_write+0xac/0x180)
[ 4758.815368] [<c00bfc5c>] (vfs_write+0xac/0x180) from [<c00bfde8>] (sys_write+0x40/0x6c)
[ 4758.823699] [<c00bfde8>] (sys_write+0x40/0x6c) from [<c000e9c0>] (ret_fast_syscall+0x0/0x3c)

So it looks like it is waiting for the mmc device, which itself is hanging. My feeling here is that this is a very different problem. The block queue should block everything on suspend so no requests should be pending at this point. I guess I'm going to have to look into that some more.

So I've probably learned a bit, but really not much. Mostly just some very silly code in an interrupt handler, which took hours to find. Hopefully examining the block-device issue will be more fun.


The MMC bug was fairly easy to find - the 'suspend' callback was simply not being called. This was easily fixed by applying commit 32d317c60e56c2a34463b51fc0336cc96b3e1735 from Linux-3.4.

So now my GTA04 doesn't crash on resume any more. Hurray.


29 June 2012, 06:53 UTC A typical week in RAID-land

I should probably write more. I enjoy writing but don't do enough of it. I probably have Elizabeth Bennet's condition. She says to Darcy on the dance floor:

We are each of an unsocial, taciturn disposition, unwilling to speak, unless we expect to say something that will amaze the whole room, and be handed down to posterity with all the eclat of a proverb.

I sometimes feel unwilling to write unless I'll say something to amaze the whole Internet. But I think I should try to push through that.

So what has happened this week? I doubt it is really a typical week as no week is really typical - I imagine them all outside my window screaming in chorus "We are all individuals". But I can't really know unless I record it, then compare it with future weeks.

read more... (No comments)

15 June 2012, 07:32 UTC A Nasty md/raid bug
There is a rather nasty RAID bug in some released versions of the Linux kernel. It won't destroy your data, but it could make it hard to access that data.

If you are concerned that this might affect you, the first thing you should do (after not panicking) is to gather the output of

mdadm -Evvvvs

and save this somewhere that is not on a RAID array. The second thing to do is read to the end of this note and then proceed accordingly. You most likely will never need to use the output of that command, but if you do it could be extremely helpful.
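If you would rather script that step, a minimal sketch follows. The function name and the injectable `run` parameter are inventions for this example; the command itself is the one given above, it normally needs root, and the destination should be a path that is not on a RAID array.

```python
import subprocess
from pathlib import Path

MDADM_CMD = ["mdadm", "-Evvvvs"]  # the command given in the text


def save_examine_output(dest, run=subprocess.run):
    """Run the command and write its stdout to dest.

    Returns the number of characters written. `run` is injectable so the
    function can be exercised without mdadm being installed.
    """
    result = run(MDADM_CMD, capture_output=True, text=True, check=True)
    Path(dest).write_text(result.stdout)
    return len(result.stdout)
```

For example, save_examine_output("/media/usb/mdadm-examine.txt") would store the report on removable media; that path is hypothetical.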

read more... (20 comments)

14 June 2011, 10:17 UTC Closing the RAID5 write hole
Over a year ago I wrote some thoughts about closing the RAID5 write hole in an answer to a comment on a blog post: and

I recently had some interest shown in this so I thought it might be useful to write up some thoughts more coherently and completely.

read more... (No comments)

28 March 2011, 02:54 UTC Another mdadm release: 3.2.1

Hot on the heels of mdadm-3.1.5 I have just released 3.2.1.

The 3.2 series contains two particular sets of new functionality.

Firstly there is the "policy" framework. This allows us to set policy for different devices based on where they are connected (e.g. which controller) so that e.g. when a device is hot-plugged it can immediately be made a hot-spare for an array without further operator intervention. It also allows broader control of spare-migration between arrays. It is likely that more functionality will be added to this framework over time.

Secondly, the support for Intel Matrix Storage Manager (IMSM) arrays has been substantially enhanced. Spare migration is now possible, as is level migration and OLCE (OnLine Capacity Expansion). This support is not quite complete yet and requires MDADM_EXPERIMENTAL=1 in the environment to ensure people only use it with care. In particular, if you start a reshape in Linux and then shut down and boot into Windows, the Windows driver may not correctly restart the reshape. And vice-versa.

If you don't want any of the new functionality then it is probably safest to stay with 3.1.5 as it has all recent bug fixes. But if you are at all interested in the new functionality, then by all means give 3.2.1 a try. It should work fine and is no more likely to eat your data than any other program out there.


23 March 2011, 04:59 UTC Release of mdadm-3.1.5

The last release of mdadm that I mentioned in this blog was 2.6.1. As I am now announcing 3.1.5 you can see that I missed a few. That's OK though as I keep the release announcements in the source distribution so you can always go and read them there.

3.1.5 is just bugfixes. It is essentially 3.1.4 plus all the bug fixes found while working on 3.2 and 3.2.1. The list from the release announcement is:

As you can see - lots of little bits and pieces.

I hope to release 3.2.1 soon. For people who want to use the Intel metadata format (Intel Matrix Storage Manager - IMSM) on Intel motherboards which have BIOS support and MS-Windows support, you should probably wait for 3.2.1. For anyone else, 3.1.5 is what you want.

3.2.1 should be released soonish. I probably won't even start on 3.2.2 for a couple of months, though I already have a number of thoughts about what I want to include. A lot of it will be cleaning up and re-organising the code: stuff I wanted to do for 3.2 but ran out of time.

As always, mdadm can be found via git at git:// or from


08 March 2011, 07:47 UTC log segments and RAID6 reshaping

Part of the design approach of LaFS - and any other log structured filesystem - is to divide the device space into relatively large segments. Each segment is many megabytes in size so the time to write a whole segment is much more than the time to seek to a new segment. Writes happen sequentially through a segment, so write throughput should be as high as the device can manage.
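The trade-off can be made concrete with a little arithmetic. The figures used below (16 MB segment, 100 MB/s sequential throughput, 10 ms seek) are assumptions chosen only to show the shape of the calculation.

```python
def write_efficiency(segment_mb, throughput_mb_s, seek_ms):
    """Fraction of wall-clock time spent writing rather than seeking,
    when each segment costs one seek followed by one sequential write."""
    write_s = segment_mb / throughput_mb_s
    seek_s = seek_ms / 1000.0
    return write_s / (write_s + seek_s)


if __name__ == "__main__":
    # Assumed device: 100 MB/s sequential, 10 ms average seek.
    for size_mb in (1, 16, 64):
        print(size_mb, "MB:", round(write_efficiency(size_mb, 100, 10), 3))
```

The larger the segment, the closer the device gets to its full sequential throughput, which is the reason for making segments many megabytes in size.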

(obviously there needs to be a way to find or create segments with no live data so they can be written to. This is called cleaning and will not be discussed further here).

One of the innovations of LaFS is to allow segments to be aligned with the stripes in a RAID5 or RAID6 array so that each segment is a whole number of stripes and so that LaFS knows the details of the layout including chunk size and width (number of data devices).

This allows LaFS to always write in whole 'strips' - where a 'strip' is one block from each device chosen such that they all contribute to the one parity block. Blocks in a strip may not be contiguous (they only are if the chunksize matches the block size), so one would not normally write a single strip. However doing so is the most efficient way to write to RAID6 as no pre-reading is needed. So as LaFS knows the precise geometry and is free with how it chooses where to write, it can easily write just a strip if needed. It can also pad out the write with blocks of NULs to make sure a whole strip is written each time.
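As an illustration of the geometry, the following sketch lists the logical block numbers that make up one strip. It assumes the usual layout in which consecutive chunks are spread across the data devices in order, and it ignores where the parity blocks sit, since that does not change logical data addresses. The function names are made up for the example and are not LaFS or md code.

```python
def strip_blocks(stripe, offset, chunk_blocks, data_devices):
    """Logical block numbers that make up one strip.

    stripe: index of the stripe.
    offset: block offset within each chunk (0 <= offset < chunk_blocks).
    chunk_blocks: chunk size measured in filesystem blocks.
    data_devices: the array width (number of data devices).
    """
    if not 0 <= offset < chunk_blocks:
        raise ValueError("offset must lie within a chunk")
    base = stripe * chunk_blocks * data_devices
    return [base + dev * chunk_blocks + offset for dev in range(data_devices)]


def is_contiguous(blocks):
    """True if the block numbers form one unbroken run."""
    return all(b - a == 1 for a, b in zip(blocks, blocks[1:]))


print(strip_blocks(0, 0, 16, 4))                 # [0, 16, 32, 48]
print(is_contiguous(strip_blocks(0, 0, 16, 4)))  # False
print(is_contiguous(strip_blocks(2, 0, 1, 4)))   # True: chunk size == block size
```

With a 16-block chunk the blocks of a strip are 16 apart, so writing a strip means issuing several scattered writes, which is only sensible for a filesystem that knows the geometry.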

Normally one would hope that several strips would be written at once, hopefully a whole stripe or more, but it is very valuable to be able to write whole strips at a time.

This is lovely in theory but in practice there is a problem. People like to make their RAID6 arrays bigger, often by adding one or two devices to the array and "restriping" or "reshaping" the array. When you do this the geometry changes significantly and the alignment of strips and stripes and segments will be quite different. Suddenly the efficient IO practice of LaFS becomes very inefficient.
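A small sketch shows how a reshape breaks the alignment. It assumes the usual layout in which consecutive chunks are spread across the data devices in order; the function names and the example geometry (16-block chunks, growing from 4 to 5 data devices) are assumptions for illustration.

```python
def locate(block, chunk_blocks, data_devices):
    """Map a logical block to (stripe, data-device index, offset in chunk)."""
    chunk, offset = divmod(block, chunk_blocks)
    stripe, dev = divmod(chunk, data_devices)
    return stripe, dev, offset


def strips_touched(blocks, chunk_blocks, data_devices):
    """Set of (stripe, offset) strips that the given blocks fall into."""
    located = (locate(b, chunk_blocks, data_devices) for b in blocks)
    return {(stripe, offset) for stripe, _dev, offset in located}


# One whole strip under the old geometry: 16-block chunks, 4 data devices.
old_strip = [64, 80, 96, 112]
print(strips_touched(old_strip, 16, 4))  # a single strip: {(1, 0)}
print(strips_touched(old_strip, 16, 5))  # two strips, neither of them complete
```

After the reshape the same four blocks cover parts of two strips, so a write that used to need no pre-reading now needs the rest of both strips read before parity can be computed.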

There are two ways to address this, one which I have had in mind since the beginning, one which only occurred to me recently.

read more... (No comments)

27 February 2011, 11:42 UTC Off-the-road-map: Data checksums

Among the responses I received to my recent post of a development road-map for md/raid were some suggestions for features that I believe are wrong and should not be implemented. So rather than being simple omissions, they are deliberate exclusions. One of these suggestions is the idea of calculating, storing, and checking a checksum of each data block.

Checksums are in general a good idea. Whether it is a simple parity bit, an ECC, a CRC or a full cryptographic hash, a checksum can help detect single bit and some multi-bit errors and stop those errors propagating further into a system. It is generally better to know that you have lost some data rather than believe that some wrong data is actually good, and checksums allow you to do that.

So I am in favour of checksums in general, but I don't think it is appropriate to sprinkle them around everywhere and in particular I don't think that it is the role of md to manage checksums for all data blocks.
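As a minimal illustration of detection (not of anything md does), the sketch below stores a CRC32 alongside a block and shows that a single flipped bit is noticed. CRC32 via Python's zlib is just one convenient choice of checksum, and the helper names are made up for the example.

```python
import zlib


def store(data):
    """Return the block together with its CRC32 checksum."""
    return data, zlib.crc32(data)


def verify(data, checksum):
    """True if the block still matches the checksum recorded for it."""
    return zlib.crc32(data) == checksum


def flip_bit(data, bit_index):
    """Return a copy of data with one bit inverted, simulating corruption."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


block, crc = store(b"x" * 4096)
print(verify(block, crc))                 # True
print(verify(flip_bit(block, 1234), crc)) # False: the corruption is detected
```

The checksum tells you that the data is bad, not what it should have been; recovery still needs redundancy from somewhere else.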

To make this belief more concrete, I see that there are two classes of places where checksums are important. I call these "link checksums" and "end-to-end checksums".

read more... (No comments)

16 February 2011, 04:40 UTC MD/RAID road-map 2011
08 September 2010, 07:20 UTC A talk on dm/md convergence
19 May 2010, 04:37 UTC Design notes for a bad-block list in md/raid
24 March 2010, 06:46 UTC A new release of wiggle
11 February 2010, 05:03 UTC Smart or simple RAID recovery??
17 August 2009, 00:09 UTC Converting RAID5 to RAID6 and other shape changing in md/raid
28 February 2009, 12:37 UTC The LaFS directory structure
24 February 2009, 19:53 UTC Measuring Freerunner battery life [UPDATED]
15 February 2009, 22:42 UTC tapinput: Yet another soft keyboard for the freerunner.
12 February 2009, 20:54 UTC Moving to Debian on my Neo Freerunner
08 February 2009, 05:22 UTC Why I wrote my own 'gsmd'
31 January 2009, 20:51 UTC gsm0710muxd without DBUS or ptys
30 January 2009, 21:18 UTC Next Freerunner toys - battery applet and runit
29 January 2009, 23:46 UTC Road map for md/raid driver - sort of
28 January 2009, 02:56 UTC Screen Lock on the Freerunner
22 February 2007, 04:22 UTC mdadm 2.6.1 released
17 June 2006, 08:24 UTC Another TODO list : nfsd
11 June 2006, 10:13 UTC Metad - a daemon for controlling daemons
26 May 2006, 10:14 UTC mdadm 2.5 released
21 May 2006, 09:26 UTC Auto-assembly mode for mdadm
