28 February 2009, 12:37 UTCThe LaFS directory structure
I've spent most of this week working on my filesystem - LaFS, having not touched it since December. As usual, working on it re-awakens my enthusiasm for it and as I seem to be in a blogging mood, I'm going to write about it. Currently. my favourite part of my filesystem is the directory structure. So I'm going to write about that.
The difficulty with designing directories for a Posix filesystem is that Posix provides two ways to find entries in a directory.
The most obvious way to find an entry is to look it up by name. So to be able to implement large directories at all efficiently, you need an index that can lead you quite quickly from a name to an entry.
However Posix also requires you to be able to look up an entry given a small fixed-length identified. This is needed to implement seekdir. The filesystem gets to choose the identifier and it returns it via the readdir or telldir function. However whatever identifier is returned must continue to work for that entry indefinitely until that entry is removed. There is no mechanism for the identifer to time out or be refreshed. It must be really stable. Even if Posix didn't require this, NFS does. To be able to export a filesystem via NFS, you really need to be able to find entries given small fixed-length keys.
As names in directories are not fixed length, and the maximum length is not small, this seems to imply that you need two separate indexes for a directory, and then need to keep them in-sync with each other, and a number of filesystems do just this.
My clever idea, which I only realised after failing to make a couple of other approaches work, and after arguing with Ted T'so about the ext3 directory structure, is that you can get by with only one index. Here is how.