README update

author NeilBrown <neilb@suse.de>

Fri, 4 Mar 2011 01:47:29 +0000 (12:47 +1100)

committer NeilBrown <neilb@suse.de>

Fri, 4 Mar 2011 01:47:29 +0000 (12:47 +1100)
diff --git a/README b/README

index 44150dfb4f67ab778b3af78ea76fe13916cd9367..a5e15f48c4d5dfaf62c05644f147ecdac558a477 100644
--- a/README
+++ b/README
@@ -5292,23 +5292,22 @@ DONE 15cf/ lafs_iget_fs need to sometimes to in-kernel mounts for subset filesys
  
  36/ review roll-forward
  
-36a/  make sure files with nlink == 0 are handled well
+DONE 36a/  make sure files with nlink == 0 are handled well
  DONE 36b/  sanity check before trusting clusters
-36c/ handle miniblocks which create new inodes.
-36d/ Handle DescHole in roll_block
-36e/ When dirtying a block in roll_block, maybe use writeback rather
+DONE 36c/ handle miniblocks which create new inodes.
+DONE 36d/ Handle DescHole in roll_block
+DONE 36e/ When dirtying a block in roll_block, maybe use writeback rather
       than just iolock, for consistency...
-37f/ What to do if table becomes full when add_block_address in
+DONE 36f/ What to do if table becomes full when add_block_address in
       roll_block ??
-37g/ Write roll_mini for directories.
-37h/ In roll_one, use the cluster counting code to find block number and
+36g/ Write roll_mini for directories.
+DONE 36h/ In roll_one, use the cluster counting code to find block number and
       make sure we don't exceed the segment.
-37i/ add more general error checking to lafs_mount - 
+DONE 36i/ add more general error checking to lafs_mount - 
              lafs_iget orphans and segsum.  Check type is correct.
           errors from lafs_count_orphans or lafs_add_orphans.
           alloc_page failure for chead - maybe allocate something bigger??
  
-
  37/ Configure index block hash_table at run time base on mem size??
  
  38/ striped layout
@@ -5392,6 +5391,12 @@ DONE 52/ NFS export
     realloc or dirty rather than lafs_allocated_block doing it.?
     See also 15ad below.
  
+66/ Delay writeout of directory updates until an fsync.  If a checkpoint happens
+   first, discard the updates (and fsync waits for checkpoint to complete).
+   If a cross-directory rename happens care is needed:  either flush updates
+   first or ensure that a flush does happen before the cross-directory
+   update is flushed.
+
  26June2010
   Investigating 5a
  
@@ -6914,3 +6919,57 @@ WritePhase - what is that all about?
  
      So that is all done now, except I don't hold refs on snapshots in the cleaner
      yet.
+
+11oct2010
+ DescHole
+   - When is this used? directory etc don't need it.
+   - a regular file might, but there is no API to punch
+     a hole.... yet I guess.
+   - So we just want to allocate these blocks to 0.
+
+15oct2010 - happy birthday Daniel...
+ Looking at 36:
+  a/ files with nlink==0;
+        If we happen to find them, we hold a reference until all roll-forward
+        is done, incase a name is found - it is important not to start deletion
+        early.
+
+18oct2010
+  36g - write roll_mini for directories.
+   We get a name, an inode number, and one of:
+      LINK UNLINK REN_SOURCE REN_NEW_TARGET REN_OLD_TARGET
+
+   The REN_SOURCE is linked with a REN_*_TARGET which could be in a
+   different directory, so we need to stash the SOURCE until the TARGET
+   arrives.
+   We simply impose the implied change on the directory and update the
+   link count in the target inode.
+   So:
+     load the inode
+     possibly record REN_SOURCE for later
+
+     calls prepare/pin/commit as appropriate.
+     Put the inode on orphan list if appropriate - needs care
+        as we retarget orphan list.
+     update inode link count.
+
+   (28Feb2011)
+   Just a refresh on the purpose of these updates.
+   1/ They allow us to fsync a directory without performing a full checkpoint.
+     As directory blocks are not processed in roll-forward we need the update
+     for data to be safe.  As fsync of directories are rare in some common
+     situations we could avoid actually writing these.  Simply queue them
+     internally and discard them on a checkpoint.  If an fsync comes before the
+     checkpoint, only then do we write them out.  If there are any cross-directory
+     renames then the preceeding updates in both directories need to be flushed
+     before the cross-directory rename.  It might be easier to always flush on
+     a cross-directory rename.
+   2/ They ensure consistency of inode link-count wrt to names in the filesystem,
+     but as link count is only updated by these (or a checkpoint) there is no
+     problem with delaying.
+
+   So: when replaying these we must update the directory content and the inode
+   link count.
+   It is OK to delay the write-out of these until an fsync, and not bother
+   if a checkpoint happens.
+   So add that to th TODO list - item 66.
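
The delayed write-out described in the 28Feb2011 note (TODO item 66) can be sketched roughly as follows. This is only an illustration under stated assumptions, not LaFS code: every name here (`update_queue`, `uq_add`, `uq_fsync`, `uq_checkpoint`, `uq_cross_rename`, `MAX_PENDING`) is hypothetical, "writing to the log" is modelled by a counter, and a cross-directory rename takes the simpler option from the note, which is to always flush whatever was queued before it.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none are taken from the LaFS source. */
#define MAX_PENDING 16

enum upd_type { UPD_LINK, UPD_UNLINK, UPD_RENAME };

struct dir_update {
    enum upd_type type;
    unsigned long dir;   /* directory holding the name */
    unsigned long inum;  /* inode the name refers to */
};

struct update_queue {
    struct dir_update pending[MAX_PENDING];
    size_t npending;     /* updates held in memory only */
    size_t nwritten;     /* updates written out (stand-in for log writes) */
};

/* Write out every queued update; here that just bumps a counter. */
void uq_flush(struct update_queue *q)
{
    assert(q != NULL);
    q->nwritten += q->npending;
    q->npending = 0;
}

/* Queue an update instead of writing it immediately. */
void uq_add(struct update_queue *q, enum upd_type type,
            unsigned long dir, unsigned long inum)
{
    assert(q != NULL);
    if (q->npending == MAX_PENDING)
        uq_flush(q);     /* assumption: a full queue is simply flushed */
    q->pending[q->npending].type = type;
    q->pending[q->npending].dir = dir;
    q->pending[q->npending].inum = inum;
    q->npending++;
}

/* fsync before a checkpoint: only now are the updates written. */
void uq_fsync(struct update_queue *q)
{
    uq_flush(q);
}

/* A checkpoint makes the queued updates redundant, so drop them. */
void uq_checkpoint(struct update_queue *q)
{
    assert(q != NULL);
    q->npending = 0;
}

/* Cross-directory rename: flush everything queued so far, then queue
 * the rename itself (recorded against the target directory). */
void uq_cross_rename(struct update_queue *q, unsigned long src_dir,
                     unsigned long dst_dir, unsigned long inum)
{
    (void)src_dir;       /* a fuller version would record both ends */
    uq_flush(q);
    uq_add(q, UPD_RENAME, dst_dir, inum);
}
```

In this sketch, updates that are followed by a checkpoint are never written at all, while an fsync that arrives first pays for exactly the updates queued since the last checkpoint or flush.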