19 May 2010, 04:37 UTC Design notes for a bad-block list in md/raid
I'm in the middle of (finally) implementing a bad block list for Linux md/raid, and I find that the motivation and the desired behaviour aren't (or weren't) quite as obvious as I expected. So now that I think I have sorted it out, it seems sensible to write it up so that you, my faithful reader, can point out any glaring problems.
The bad block list is simply a list of blocks - one list for each device - which are to be treated as 'bad'. This does not include any relocation of bad blocks to some good location. That might be done by the underlying device, but md doesn't do it. md just tracks which blocks are bad and which, by implication, are good.
The difficulty comes in understanding exactly what "bad" means, why we need to record badness, and what to do when we find that we might want to perform IO against a recorded bad block.
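As a rough illustration of the data structure described above, here is a minimal sketch of a per-device bad block list. The class and method names are hypothetical, and a plain set per device is an assumption made for clarity; it says nothing about how md actually stores the list. Clearing an entry with `mark_good` is also an assumption, since the text does not say when a block stops being bad.

```python
from collections import defaultdict


class BadBlockList:
    """Per-device record of blocks that must be treated as bad (illustrative only)."""

    def __init__(self):
        # device name -> set of bad block numbers (hypothetical representation)
        self._bad = defaultdict(set)

    def mark_bad(self, device, block):
        self._bad[device].add(block)

    def mark_good(self, device, block):
        # discard() silently ignores blocks that were never recorded as bad
        self._bad[device].discard(block)

    def is_bad(self, device, block):
        # Anything not recorded as bad is, by implication, good.
        return block in self._bad.get(device, ())


bbl = BadBlockList()
bbl.mark_bad("sda", 1024)
```

Note that nothing is relocated: the list only answers the question "is this block on this device bad?".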
24 March 2010, 06:46 UTC A new release of wiggle
A long time ago, while in a job far far away....
Back in 2003 I wrote a program called "wiggle". Like many interesting projects it was written to scratch an itch.
While developing code for the Linux kernel I would often need to apply patches made for earlier versions against later versions. Sometimes there would be trivial conflicts and the "patch" program would just give up and create a reject file. After the 50th time that I applied a patch like this by hand I decided that enough was enough, so I wrote "wiggle". It takes patches that don't quite apply properly and wiggles them into place. If there is a change in part of the code that the patch doesn't actually change, wiggle doesn't let that get in the way. If there is a change in part of the code that the patch also changes, wiggle reports that inline as a conflict in a way that makes it easy to resolve by hand.
Since 2003 I have made a few improvements and fixed a few bugs. Just recently the Debian package of wiggle got a new maintainer who was very proactive in trying to get some patches upstream to me, and get some languishing bugs fixed.
Always keen to reward such friendly behaviour, I applied the patches, fixed the bugs and finally made a new release of wiggle, the first in nearly 7 years.
Version 0.7 can be found in my git tree at git://neil.brown.name/wiggle or browsed at http://neil.brown.name/git?p=wiggle;a=summary or downloaded as a 'tar' archive from http://neil.brown.name/wiggle.
Feedback always welcome.
What I really want to know is how to get git to always use wiggle for merging conflicts. I can do it on a per-repository basis by setting the 'merge' attribute (I think) but I cannot make it automatically apply to all of my git trees...
11 February 2010, 05:03 UTC Smart or simple RAID recovery??
I frequently see comments, particularly on the linux-raid mailing list, to the effect that md should be more clever when recovering from an inconsistent stripe in an array.
In particular, it is suggested that for a RAID1 with more than 2 devices, a vote should be held and if one content occurs more often than the others (e.g. 2 devices have the same content, the third is different) then the majority vote should rule and the most common content be copied over the less common content.
Similarly with RAID6 if the P and Q blocks don't match the data blocks, it may be possible to find exactly one data block which can be corrected so as to make both P and Q match - so we could change just one data block instead of two "parity" blocks to achieve consistency.
I will call this approach the "Smart recovery" approach.
The assertion is that smart recovery will not only make the stripe consistent, but will also make it "correct".
I do not agree with these comments. It is my position that if there is an inconsistency that needs to be corrected then it should be corrected in a simple, predictable way and that any extra complexity is unjustified. For RAID1, that means copying the first block over all the others. For RAID6, that means calculating new P and Q blocks based on the data. This is the "simple recovery" approach.
This note is an attempt to justify this position, both to myself and to you, my loyal reader.
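To make the two RAID1 policies concrete, here is a minimal sketch that treats each device's copy of a block as a bytes value. The function names are hypothetical, and falling back to the first copy when the vote is tied is an assumption; the proposals described above do not say what should happen in that case.

```python
from collections import Counter


def simple_recovery(copies):
    """Copy the first device's block over all the others."""
    return [copies[0]] * len(copies)


def smart_recovery(copies):
    """Let the most common content win; on a tie, fall back to the first copy."""
    counts = Counter(copies)
    top = max(counts.values())
    winners = [content for content, n in counts.items() if n == top]
    # Assumption: with no unique winner there is nothing to vote on.
    chosen = winners[0] if len(winners) == 1 else copies[0]
    return [chosen] * len(copies)


mirror = [b"A", b"B", b"B"]      # device 0 disagrees with devices 1 and 2
after_simple = simple_recovery(mirror)
after_smart = smart_recovery(mirror)
```

Both policies leave the stripe consistent; they differ only in which content survives. Here the simple policy keeps `b"A"` and the vote keeps `b"B"`.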
17 August 2009, 00:09 UTC Converting RAID5 to RAID6 and other shape changing in md/raid
Back in early 2006 md/raid5 gained the ability to increase the number of devices in a RAID5,
thus making more space available. As you can imagine, this is a slow process as every
block of data (except possibly those in the first stripe) needs to be relocated, i.e.
they need to be read from one place and written to another. md/raid5 allows this reshaping to
happen while the array is live. It temporarily blocks access to a few stripes at a time while
those stripes are rearranged. So instead of the whole array being unavailable for several hours,
little bits are unavailable for a fraction of a second each.
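A small sketch shows why almost every block has to move when a device is added. It uses a deliberately simplified layout, which is an assumption: data chunks are numbered sequentially and placed at a (stripe, slot) position, ignoring parity rotation and the mapping of slots to physical devices. The function names are hypothetical.

```python
def chunk_location(chunk, data_disks):
    """Return (stripe, slot) of a data chunk in a simplified striped layout."""
    return divmod(chunk, data_disks)


def chunks_that_move(total_chunks, old_data_disks, new_data_disks):
    """List the chunks whose (stripe, slot) changes when the array is reshaped."""
    return [c for c in range(total_chunks)
            if chunk_location(c, old_data_disks) != chunk_location(c, new_data_disks)]


# Growing a 4-device RAID5 (3 data chunks per stripe) to 5 devices (4 per stripe).
moved = chunks_that_move(12, 3, 4)
```

In this model chunks 0 to 2, the first stripe, stay where they are, and every later chunk lands in a different position.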
Then in early 2007 we gained the same functionality for RAID6. This was no more complex than RAID5, it just involved a little more code and testing.
Now, in mid 2009, we have most of the rest of the reshaping options that had been planned. These include changing the stripe size, changing the layout (i.e. where the parity blocks get stored) and reducing the number of devices.
Changing the layout provides valuable functionality as it is an important part of converting a RAID5 to a RAID6.
24 February 2009, 19:53 UTC Measuring Freerunner battery life [UPDATED]
I'm trying to avoid getting too distracted by my Openmoko Freerunner this week as I have a very different project that I want to concentrate on. But it is hard not to play with it sometimes, as it is just sitting there waiting. Fortunately I managed to find something useful to do with it that required me to leave it alone.
I thought it was time to find out how long the battery really lasts. So I fully charged it a couple of days ago and then left it sleeping with the intention of waking it up the next morning and seeing how much battery was left after a given time. Unfortunately I left it too long and the battery went completely flat, so I learned very little.
Now obviously I don't want to be checking it every so often as that would distract me from my other project, and turning the display on all the time would use more power than I was wanting to measure. So I came up with the incredibly creative idea of getting the device to observe itself. That is of course the beauty of computers. They can do the work for you.
So here is the script that I wrote and ran, saving the output to a file:
    while :
    do
        echo ===========================================
        date
        cat /sys/class/power_supply/battery/capacity
        cat /sys/class/i2c-adapter/i2c-0/0-0073/resume_reason
        cat /sys/class/i2c-adapter/i2c-0/0-0073/neo1973-resume.0/resume_reason
        /root/wkalrm +30m
        sleep 20
        apm -s
    done
It very simply wakes up every 30 minutes, checks the battery capacity and some other random bits of information that I wanted to check, stays awake for 20 seconds, then goes back to sleep.
There are two reasons for the 20 second sleep. One is that I wanted to be able to ssh in and kill the script if I needed to get control of the device again before the power ran out completely. The other is that the Xglamo X server seemed to get confused if I suspend too soon after waking up. I guess it needs a little while to sort itself out after a resume. I haven't experimented with this much, so I may be attributing a single failure to the wrong general cause.
And the results? The device just sat there for about 15 hours. The screen stays off the whole time. I might sometimes hear a little click from the speaker when it resumes, but that is the only external indication that anything is happening.
The sequence of battery capacity readings was
97 94 91 88 85 82 79 77 74 71 68 65 62 59 56 53 50 47 44 41 38 35 32 29 25 22 19 16 13 10
A very consistent difference of 3% every 30 minutes, except 79-77 where the difference is 2, and 29-25 where it is 4. So after 15 hours, 90% is gone. I was probably hoping for a bit more than that, but it should be workable.
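The arithmetic behind that estimate can be sketched as follows, assuming the drain is linear and that the readings are exactly 30 minutes apart. The function names are made up for illustration.

```python
def drain_per_hour(readings, interval_hours=0.5):
    """Average percentage points lost per hour across evenly spaced readings."""
    elapsed = (len(readings) - 1) * interval_hours
    return (readings[0] - readings[-1]) / elapsed


def hours_from_full(readings, interval_hours=0.5):
    """Projected time to go from 100% to 0% at the measured rate."""
    return 100 / drain_per_hour(readings, interval_hours)


readings = [97, 94, 91, 88, 85, 82, 79, 77, 74, 71,
            68, 65, 62, 59, 56, 53, 50, 47, 44, 41,
            38, 35, 32, 29, 25, 22, 19, 16, 13, 10]

rate = drain_per_hour(readings)        # percentage points per hour
lifetime = hours_from_full(readings)   # hours from full to empty
```

The 30 readings span 29 intervals, so 87 points are lost in 14.5 hours. That is 6 points per hour, or a little under 17 hours from full to empty.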
This was with both the GSM and the GPS devices powered the whole time. And of course the CPU powered for 20 seconds every 30 minutes.
Later today after the device is fully charged, I'll try again with the GPS turned off. It might also be interesting to try with GSM off and GPS on. I assume Wifi and Bluetooth are turned off by suspend... I guess I should check that.
Update
So I tried with GPS and Bluetooth turned off. It didn't quite go as planned though. The samples I got are

    ===========================================
    Wed Feb 25 12:27:36 EST 2009
    94
    ===========================================
    Wed Feb 25 12:57:38 EST 2009
    93
    ===========================================
    Thu Feb 26 05:46:05 EST 2009
    62

i.e. the first wakeup worked, but then it didn't wake again until I woke it to check the results. My guess is that my "go to sleep when idle" program put the device to sleep during the "sleep 20". Then when the alarm woke it, the script sent the device back to sleep never to awake. I'll try again with the auto-sleep disabled.
But the results are still good. Nearly 17 hours and only 32 percent gone. That is 1/3 of the power usage for when GPS and possibly BT were powered. So while keeping the GPS on for regular sampling is possible, it does eat battery life. The chip apparently has a power-save mode where it remembers all the state data but doesn't listen to the voices from the sky. I wonder if it is possible to enter that mode during suspend...
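As a quick check of that comparison, the sketch below works out the drain rate from the first and last timestamps of this run and compares it with the earlier run, taken here as 90% in 15 hours. The variable names are illustrative, and the time zone is ignored because both samples are in EST.

```python
from datetime import datetime

# First and last samples of the run with GPS and Bluetooth off.
start = datetime(2009, 2, 25, 12, 27, 36)
end = datetime(2009, 2, 26, 5, 46, 5)

hours = (end - start).total_seconds() / 3600
rate_gps_off = (94 - 62) / hours      # percentage points per hour

# Earlier run with GSM and GPS powered: roughly 90% gone in 15 hours.
rate_gps_on = 90 / 15

ratio = rate_gps_off / rate_gps_on
```

The elapsed time comes out at a little over 17.3 hours, giving about 1.85 points per hour against 6, which is close to the one third quoted above.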
later ... A proper run with bluetooth and gps turned off gives these samples, one per half hour:
95 94 93 92 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 73 71 70 69 68 67
Which suggests 50 hours from full to empty. So charging once a day with light use should be fine.
The GPS does have a mode where it can sleep for a set period, wake up to get a new fix, then go back to sleep. I'll give that a try when I find time. I wonder what a good 'set period' is. 30 minutes? Maybe it depends on how much you are traveling.
But more immediately, I'll test with GPS off but Bluetooth on. I'm hoping it gets turned off at suspend, but there is a simple way to test...
Final Update... The setting for bluetooth power made no difference. It seems it is turned off when we go into suspend, at least by default.
15 February 2009, 22:42 UTC tapinput: Yet another soft keyboard for the freerunner.
In theory at least, my preferred method for entering text into my Freerunner involves using hand writing recognition. i.e. just draw the letters and other symbols on the front of the display and it will Do-The-Right-Thing. In practice, I need an alternate method, at least some of the time.
I do have some code in http://neil.brown.name/git/scribble which does some recognition. However it is only about 80% accurate (if that) and is a little bit slow (possibly because it is written in python). The low accuracy means that I really need to wait for each character to be recognised before starting the next one. And combining this with the slow matching speed means that I actually enter text quite slowly.
I have hopes for making this better and faster. But even then, I suspect that entering punctuation will be a bit of a challenge, and sometimes I just want something more reliable, if a little more awkward in other ways.
So I have written a little toy that I call 'tapinput'. Its key characteristics are:
- Few buttons (12) so they can be big enough to easily hit accurately
- popup window hovers over other windows and can be moved around so you can still see what you are typing. The target window doesn't need to be resized so you can fit the tap-board onto the screen at the same time (as is required by some Freerunner keyboards).
- Normally 2 taps on adjacent keys will enter a symbol, though in number mode, 1 tap is used for most symbols.
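To get a feel for how many symbols a two-tap scheme on 12 keys can address, here is a small counting sketch. The 4-row by 3-column grid and both definitions of "adjacent" are assumptions made for illustration; the list above does not specify the layout or the adjacency rule that tapinput uses.

```python
def adjacent_pairs(rows=4, cols=3, diagonals=False):
    """Count ordered (first tap, second tap) pairs of distinct adjacent keys."""
    steps = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    if diagonals:
        steps += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in steps:
                # Only count neighbours that are actually on the grid.
                if 0 <= r + dr < rows and 0 <= c + dc < cols:
                    count += 1
    return count


orthogonal_only = adjacent_pairs()
with_diagonals = adjacent_pairs(diagonals=True)
```

Under these assumptions there are 34 ordered pairs with orthogonal neighbours only and 58 when diagonal neighbours also count, so either rule leaves room for a full lowercase alphabet.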
For lowercase input, the window looks like this (I've scaled it down by a factor of two to partially compensate for the fact that the Freerunner has very small pixels).
12 February 2009, 20:54 UTC Moving to Debian on my Neo Freerunner
My Freerunner now runs Debian, which was an educational experience - and education is always a good thing.
But there is a bit of an irony here.
I changed to Debian because of the wealth of packages available. While Open Embedded (which is the base for the FSO distro that I was using) packages lots of stuff, it doesn't package everything. In particular it didn't seem to package python-xlib, which I wanted so that I could play with fakekey-like keystroke generation using python code based on pykey by the author of crikey.
So now I have Debian and python-xlib and I am happy. But a problem is that a lot of the toys that people are writing for the Freerunner (like the python EFL Sudoku or the Neon image viewer) are only being packaged as ipk files for the Open Embedded distros. So if I want them I probably have to install by hand or package them myself.
So did I increase the range of available packages by going to Debian, or decrease it?
08 February 2009, 05:22 UTC Why I wrote my own 'gsmd'
You would think that writing a program to talk dirty to the GSM controller in the Openmoko Freerunner wouldn't be at the top of many people's TODO lists. After all, it has been done. Multiple times. And having yet another implementation (doubtless with a different set of bugs) is just going to hurt interoperability of applications. But I did anyway.
31 January 2009, 20:51 UTC gsm0710muxd without DBUS or ptys
Since time immemorial, the Hayes "AT" command set has been used for controlling modems and so it is only sensible
that controlling modern phones such as GSM devices should also use the AT command set. One problem is that
AT is single threaded - while you are on a data call you cannot check signal strength.
So the clever folks who designed the latest version of the "GSM over AT" spec included multiplexing, so you can have several virtual connections to your phone, one for data, one for control, one for async notifications etc.
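To give a flavour of how several virtual connections can share one serial line, here is a minimal sketch of building a basic-mode multiplexer frame. It reflects my reading of the 3GPP TS 27.010 (formerly GSM 07.10) framing and is not taken from gsm0710muxd. The function names are hypothetical, and only payloads that fit a one-byte length field are handled. Each frame carries a channel number (DLCI) in its address byte, which is what lets the data, control and notification streams be told apart.

```python
FLAG = 0xF9   # basic-mode frame delimiter
SABM = 0x3F   # control value used to open a channel (P/F bit set)
UIH = 0xEF    # control value for frames that carry payload


def fcs(header):
    """Frame check sequence: reflected CRC-8 (x^8 + x^2 + x + 1), complemented."""
    crc = 0xFF
    for byte in header:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xE0 if crc & 1 else crc >> 1
    return 0xFF - crc


def build_frame(dlci, control, payload=b""):
    """Wrap a payload for one virtual channel (DLCI 0..63) in a basic-mode frame."""
    if not 0 <= dlci <= 63:
        raise ValueError("DLCI must fit in 6 bits")
    if len(payload) > 127:
        raise ValueError("sketch only handles the one-byte length field")
    address = (dlci << 2) | 0x02 | 0x01   # DLCI, C/R bit, EA bit
    length = (len(payload) << 1) | 0x01   # 7-bit length, EA bit
    header = bytes([address, control, length])
    # The check sequence covers the header only, not the payload.
    return bytes([FLAG]) + header + payload + bytes([fcs(header), FLAG])


open_control_channel = build_frame(0, SABM)
at_on_channel_1 = build_frame(1, UIH, b"AT\r")
```

Opening the control channel (DLCI 0) this way yields the bytes `F9 03 3F 01 1C F9`. A demultiplexer on the other end would read the DLCI back out of the address byte and route the payload to the matching virtual connection.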
This is all implemented quite nicely in gsm0710muxd which is used by Openmoko (and probably elsewhere). But of course it isn't quite as nice as I wanted it. So I've hacked it a bit. My current version can be found at git://neil.brown.name/gsm0710muxd or http://neil.brown.name/git/gsm0710muxd.
There are two things I didn't like. The first is the dependence on DBUS. The second is the insistence on using PTYs to access each channel.
30 January 2009, 21:18 UTC Next Freerunner toys - battery applet and runit
Two more toys that I play with on my Freerunner have just been pushed into git://neil.brown.name/freerunner aka http://neil.brown.name/git/freerunner.
The first is a simple battery monitoring applet. There are plenty of these around and mine adds nothing of real value, except that it is stand-alone and looks the way I want it to look.
It is partly based on the applet in Openmoko Panel Plugin and uses the images for battery status from there. The code (and bugs) are mine though.
The other is a little toy with the creative name of "runit". It will run whatever program it is given and display the output in a window with a button to re-run and another to close the window.
I currently use this for configuring the network (until I get a proper tool for that) and running informational commands like "hcitool scan". It allows me to make important functionality available quickly, while waiting for a more comprehensive tool to be written.
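The core of such a tool, leaving out the window and buttons, might look something like the sketch below. This is not the actual runit code; the function name is hypothetical and the GUI is omitted entirely. A "re-run" button would simply call the function again and replace the displayed text.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command once and return its stdout and stderr as one string."""
    result = subprocess.run(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return result.stdout


# Stand-in command so the sketch runs anywhere; on the phone this could be
# ["hcitool", "scan"] or whatever program the tool was given.
output = run_and_capture([sys.executable, "-c", "print('hello')"])
```

Merging stderr into stdout is a design choice for a display-only tool: error messages end up in the same window as normal output.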
These tools show a significant part of my philosophy, which is to create simple stand-alone tools that do what I want, rather than complex frameworks.
One possible problem with this is memory usage. Each python program seems to use up about 5Meg of memory that is resident and not shared. When you only have 128M, that limits you to around 20 such programs at a time. And that doesn't even allow for the kernel.
While I may not want to have 20 running at a time, I am still concerned about the memory wastage. I may end up arranging that python programs are imported rather than executed in a separate process. Python has quite nice namespace control which should make this quite manageable, and the gtk.main loop makes it easy to run multiple gtk applications in the one process ... just as long as none of them do any slow processing or call gtk.main_quit.
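The namespace idea can be sketched without any GUI code. The example below loads two stand-in "programs" into separate module namespaces inside one process, so their global names do not collide. The helper name and the sample sources are hypothetical. A real version would presumably import the programs from files and have them share a single gtk main loop, which is not shown here.

```python
import types


def load_app(name, source):
    """Execute a program's source inside its own module namespace."""
    module = types.ModuleType(name)
    exec(compile(source, name, "exec"), module.__dict__)
    return module


# Two stand-in programs that both define a global called `title`.
battery_src = "title = 'battery'\ndef main():\n    return 'applet: ' + title\n"
runit_src = "title = 'runit'\ndef main():\n    return 'runner: ' + title\n"

apps = [load_app("battery", battery_src), load_app("runit", runit_src)]
results = [app.main() for app in apps]
```

Each `main()` sees only its own `title`, so the programs behave as they would in separate processes while sharing one interpreter and its resident memory.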
I'll have to see how that goes....
29 January 2009, 23:46 UTC Road map for md/raid driver - sort of
28 January 2009, 02:56 UTC Screen Lock on the Freerunner
22 February 2007, 04:22 UTC mdadm 2.6.1 released
17 June 2006, 08:24 UTC Another TODO list : nfsd
11 June 2006, 10:13 UTC Metad - a daemon for controlling daemons
26 May 2006, 10:14 UTC mdadm 2.5 released
21 May 2006, 09:26 UTC Auto-assembly mode for mdadm
27 July 2005, 14:31 UTC TODO list for mdadm
27 July 2005, 14:15 UTC TODO list of Linux md/raid
14 July 2005, 16:16 UTC suidrun - for providing setuid to customers when you want to mount with nosuid
06 July 2005, 16:34 UTC User-space touchpad driver for ALPS in Latitude D800
15 June 2005, 09:55 UTC mdadm 1.12.0 released
21 May 2005, 22:25 UTC Blog thoughts
17 December 2004, 16:11 UTC Linux md/raid update - UPDATED-1
15 December 2004, 10:03 UTC Linux md/raid throughput measurements
24 November 2004, 16:33 UTC A final word on AVL trees
24 November 2004, 14:18 UTC Non-recursive algorithm for AVL tree deletion
24 November 2004, 10:18 UTC Non-recursive algorithm for AVL tree insertion
27 August 2004, 22:54 UTC RAID10 in Linux MD driver
17 June 2004, 16:35 UTC Entry