git.neil.brown.name Git - LaFS.git/log
NeilBrown [Sun, 10 Oct 2010 23:05:12 +0000 (10:05 +1100)]
Use igrab_fs for I_Pinned handling.

We hold references on inodes when the InoIdx block is pinned.
This is needed for cleaning to make sure the inode doesn't disappear.

But we also need the superblock to be held in this context.
So use lafs_igrab_fs

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 10 Oct 2010 22:58:36 +0000 (09:58 +1100)]
Use igrab_fs to hold refcounts on the inodes of orphans.

We currently hold a refcount on the inodes of dir orphans.
We need the filesystem (super_block) as well.

Also, while we don't really need a similar refcount for inode orphans,
it doesn't hurt.

So simplify the tracking of whether we need to take such a refcount,
use iget_fs to grab the super_block as well, and also take a ref in
lafs_add_orphans, which was missing.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 3 Oct 2010 09:33:16 +0000 (20:33 +1100)]
Fix lafs_iget_fs for subset filesystems.

This requires splitting code out from the s_get function so
we can just get a super_block given the parent inode.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 3 Oct 2010 09:04:27 +0000 (20:04 +1100)]
Return ref on sb as well as ino from lafs_iget_fs

As the sb might not be mounted, we need to hold a reference.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 11:56:29 +0000 (21:56 +1000)]
roll forward clean up

More validation
improved mem allocation
general clean up

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 11:23:41 +0000 (21:23 +1000)]
README/comment update

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 11:15:39 +0000 (21:15 +1000)]
Be more robust in face of read errors on orphan list.

This isn't *very* robust, but we shouldn't BUG now.  Worst case
is we lose some orphans and some orphan slots.  fsck will have
to deal with that.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 06:56:55 +0000 (16:56 +1000)]
orphan: replace some pointless tests with BUGs.

All orphan file blocks always have a reference and are Valid,
so lots of testing is not needed.

Also, make sure this really is true when reading the orphan file
at mount time.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 05:13:25 +0000 (15:13 +1000)]
Combine dirty_iblock with setting Realloc on iblock.

This makes it easier to do the right thing on iblocks that
we have just split ... not that we would expect that when cleaning,
but lots of things are possible, and elegant code is good.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 04:31:21 +0000 (14:31 +1000)]
Don't use I_ICredit for UnincCredit when cleaning.

ICredit is only to be used when dirtying a block, so any setting of
UnincCredit for cleaning must get the credit from elsewhere.  If no
such credit is available, fall back on dirtying the block.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 02:51:50 +0000 (12:51 +1000)]
Minor comments etc

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 02:41:00 +0000 (12:41 +1000)]
Use wait_on_bit / wake_bit to get more wait_queues

... rather than having just one wait queue for all IO.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 01:52:16 +0000 (11:52 +1000)]
README update and comment fix.

Indeed, there is nothing we can do about errors during truncate,
except ignore them.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 01:46:04 +0000 (11:46 +1000)]
use little-endian bit operations for inode usage map.

Must not use host-endian here, so use generic 'le' operations.
Also protect all operations with i_mutex.  Even if the bitops
were atomic, we need the locking when punching a hole in the file,
or adding a new block.

Signed-off-by: NeilBrown <neilb@suse.de>
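To illustrate the little-endian bit addressing that the commit above relies on, here is a minimal user-space sketch. It assumes the usage map is viewed as a byte array; the names le_test_bit, le_set_bit and le_clear_bit are hypothetical and are not the kernel helpers LaFS calls. The point is only that bit nr always lives in byte nr/8 at position nr%8, whatever the host byte order. The i_mutex locking mentioned in the message is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (not the kernel API): bit 'nr' is stored in
 * byte nr / 8 at bit position nr % 8, independent of host endianness. */
int le_test_bit(size_t nr, const unsigned char *map)
{
	assert(map != NULL);
	return (map[nr >> 3] >> (nr & 7)) & 1;
}

void le_set_bit(size_t nr, unsigned char *map)
{
	assert(map != NULL);
	map[nr >> 3] |= (unsigned char)(1u << (nr & 7));
}

void le_clear_bit(size_t nr, unsigned char *map)
{
	assert(map != NULL);
	map[nr >> 3] &= (unsigned char)~(1u << (nr & 7));
}
```

Because the map is indexed byte by byte, the on-disk layout is the same on big-endian and little-endian hosts.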
NeilBrown [Sat, 2 Oct 2010 00:59:32 +0000 (10:59 +1000)]
Add proper locking to inode_handle_orphan

When walking the indexblock looking for things to purge
we need to hold the inode private_lock.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 00:42:34 +0000 (10:42 +1000)]
Improve handling of snapshot name.

Name stored in fileset inode is now variable length and empty
on subordinate filesets.  Snapshots have space for a name depending
on how much space was allocated when fs was created.

Name is only used for snapshots.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 1 Oct 2010 12:40:13 +0000 (22:40 +1000)]
Fix calculation of table_size

I was confused about which table I was sizing.
This is the table of which there are several in the segusage files.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 1 Oct 2010 12:38:48 +0000 (22:38 +1000)]
Minor formatting improvements.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 24 Sep 2010 01:43:42 +0000 (11:43 +1000)]
Fix bug in sort_block

It doesn't handle 2 blocks the same - which isn't a big deal, but it
is best to have the code 'right'

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 12:09:00 +0000 (22:09 +1000)]
Update filesys mtime

Do this when inode is dirtied.  Maybe this isn't the perfect time,
but it is fairly good for now.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 11:59:35 +0000 (21:59 +1000)]
use i_mtime for file-set update time.

No need to have a separate field in the inode.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 11:46:46 +0000 (21:46 +1000)]
Process orphan file during mount.

We need to read the orphan file to set nextfree, and then to
add all the blocks to the orphan list.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:42:24 +0000 (14:42 +1000)]
README update

NeilBrown [Sun, 19 Sep 2010 04:39:31 +0000 (14:39 +1000)]
roll-forward: update youth block for new segments.

Any new segments found during roll-forward need their youth value set.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:23:18 +0000 (14:23 +1000)]
Update youth during seg_apply_all

If we added blocks to a segment in seg_apply_all, make sure the youth
has been updated properly.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:16:00 +0000 (14:16 +1000)]
split out set_youth.

Setting of the youth value is now a separate function.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:05:41 +0000 (14:05 +1000)]
Generalise segunused to seg_pop

And use seg_pop more broadly,
and re-arrange some code.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 13:00:48 +0000 (23:00 +1000)]
Introduce DelayYouth

When writing out the accounting blocks we need to not update the youth
block if we happen to start a new segment.
We already do that at unmount time, so generalise it with a new flag.

Signed-Off-By: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:40:54 +0000 (22:40 +1000)]
Use segsum rather than separate dblock for youth in lafs_free_get

As segsum now always has a youthblk for ss==0, we don't need to
separately find a data block.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:25:42 +0000 (22:25 +1000)]
segment.c: allow the else branch to just fall-through

The 'then' branch does a goto, so we don't need the else.
This patch is mostly just re-indenting.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:24:36 +0000 (22:24 +1000)]
segments: swap then/else branches.

Swap 'then' and 'else' branches, inverting condition

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 06:38:46 +0000 (16:38 +1000)]
Factor out seg_add_new

We had multiple places that add entries to the segtracker.  Now we
only have one.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 17 Sep 2010 06:00:15 +0000 (16:00 +1000)]
add_block_address: be careful with physaddr == 0

An address of '0' is not consecutive with an address of '1' and while
it is very unlikely to ever be a problem, make sure we don't try to
combine those addresses into a range in the uninc table.

Also remove a comment about a possible problem that doesn't seem to be
a real problem.

Signed-off-by: NeilBrown <neilb@suse.de>
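The physaddr == 0 rule in the commit above can be sketched as a range-merge check. This is only an illustration under stated assumptions: struct extent and extent_try_append are hypothetical names, not the LaFS uninc table code, and treating 0 as "no physical block" is an assumption about why 0 must never join a run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range record: 'len' consecutive blocks starting at 'phys'. */
struct extent {
	uint64_t phys;
	uint32_t len;
};

/* Return 1 if 'phys' was merged onto the end of 'e', 0 if the caller
 * must start a new entry.  Address 0 is assumed to mean "no block",
 * so it is never treated as adjacent to address 1. */
int extent_try_append(struct extent *e, uint64_t phys)
{
	assert(e != NULL);
	if (phys == 0 || e->phys == 0)
		return 0;
	if (phys != e->phys + e->len)
		return 0;
	e->len++;
	return 1;
}
```

Without the explicit zero test, an entry {phys = 0, len = 1} followed by address 1 would pass the arithmetic adjacency check and be merged by mistake.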
NeilBrown [Fri, 17 Sep 2010 05:47:34 +0000 (15:47 +1000)]
Simplify test in lafs_refile

If index block has non-zero pending_cnt, then it must be
dirty or realloc, so we can drop a test here.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 17 Sep 2010 05:38:10 +0000 (15:38 +1000)]
lafs_refile: lots of locking and refcount changes.

See the comments added to lafs.h for more detail.

There are really several changes and fixes in here but I don't think
it is worth the effort to separate them all out.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 05:20:36 +0000 (15:20 +1000)]
re-indent lafs_file.

This patch is simply a re-indent.  No code change.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 05:19:50 +0000 (15:19 +1000)]
lafs_refile: group all functionality that requires refcnt==0

There are multiple silly (racy) tests on refcnt==dec.  Combine all
these tests into a single 'if' which is run only when the refcnt
reaches zero.

This patch just changes the code without fixing the indentation.  That
makes it easier to review.

Signed-off-by: NeilBrown <neilb@suse.de>
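The idea in the commit above, doing all last-reference work inside a single test on the decrement result, can be sketched in user space as follows. It swaps in C11 atomics for whatever locking and atomic primitives lafs_refile really uses, and struct blk, blk_put and on_lru are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

struct blk {
	atomic_int refcnt;
	int on_lru;	/* hypothetical state only the last holder may change */
};

/* Drop one reference; returns 1 if this call released the last one. */
int blk_put(struct blk *b)
{
	/* atomic_fetch_sub returns the value held before the decrement */
	int old = atomic_fetch_sub(&b->refcnt, 1);

	assert(old > 0);
	if (old != 1)
		return 0;
	/* Everything that requires refcnt == 0 is grouped here. */
	b->on_lru = 0;
	return 1;
}
```

Testing the value returned by the decrement, rather than re-reading the counter afterwards, is what avoids the racy comparisons the message calls silly.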
NeilBrown [Wed, 15 Sep 2010 04:57:02 +0000 (14:57 +1000)]
lafs_refile: factor out check for unpinnable data blocks.

Data blocks can be unpinned while refcounts are still present.

Factor that case out into a stand-alone test.

We still test for datablocks being unpinnable in the 'refcnt==0'
case as PinPending might have only just been cleared.

This leaves a large bulk of tests all of which depend on
refcnt reaching zero which can be improved in a subsequent patch.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 04:34:23 +0000 (14:34 +1000)]
Small lafs_refile clean up

We don't need the 'onlru' variable any more.
And we shouldn't really delete data blocks from the lru at that
point, as they could be on a cluster or io-pending list.
Index blocks are safe as their refcount is zero, so they can only
be on the leaf lru.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 03:58:39 +0000 (13:58 +1000)]
Don't hold counted reference while on leafs list.

Doing so makes the refcount calculations in lafs_refile messy.

We don't drop a block while it is dirty etc anyway so there is no loss

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 03:56:02 +0000 (13:56 +1000)]
Hold reference while erasing blocks in lafs_invalidate_page

else we sometimes violate assertions about having non-zero refcount

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 02:57:07 +0000 (12:57 +1000)]
Take a ref on dblock whenever the InoIdx block has a ->parent link.

This will be needed to ensure the dblock stays referenced when we stop
holding a reference to blocks on the 'leafs' lists.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 07:01:21 +0000 (17:01 +1000)]
getiref_locked fixes

The #defines were a bit wrong here, so it would not have compiled
properly without DEBUG_REF

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 05:44:41 +0000 (15:44 +1000)]
Change readpage to use ->chain instead of ->lru

->lru is rather overused, and ->chain is certainly never used on
non-B_Valid blocks, so convert readpage to use chain instead.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 04:22:31 +0000 (14:22 +1000)]
lafs_refile - split out set_lru

Setting the lru is not dependent on other changes so it can be moved to
the top and outside the lock.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 03:00:20 +0000 (13:00 +1000)]
lafs_refile: separate out consistency checking

lafs_refile is too big and clumsy.  Time to tidy up.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 01:50:34 +0000 (11:50 +1000)]
README update and some comment fixes.

Still working through the TODO list.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 01:10:25 +0000 (11:10 +1000)]
Make sure checkpoint happens after a timeout.

If anything has been written since last check point, ensure another
checkpoint happens within 30 seconds.

This is partly to ensure that index blocks don't remain dirty forever.

Signed-off-by: NeilBrown <neilb@suse.de>
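A rough model of the timeout rule in the commit above: the first write after a checkpoint arms a 30-second deadline, later writes do not push it back, and a completed checkpoint disarms it. The 30-second figure is from the commit message; struct cp_state and the cp_* functions are hypothetical, and time is passed in as plain seconds rather than kernel jiffies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHECKPOINT_INTERVAL_SEC 30	/* value taken from the commit message */

struct cp_state {
	int dirty_since_cp;	/* something was written since the last checkpoint */
	uint64_t deadline;	/* in seconds; meaningful only while dirty_since_cp */
};

/* Called on every write: only the first write after a checkpoint arms the timer. */
void cp_note_write(struct cp_state *s, uint64_t now)
{
	assert(s != NULL);
	if (!s->dirty_since_cp) {
		s->dirty_since_cp = 1;
		s->deadline = now + CHECKPOINT_INTERVAL_SEC;
	}
}

/* Should a checkpoint be forced at time 'now'? */
int cp_due(const struct cp_state *s, uint64_t now)
{
	assert(s != NULL);
	return s->dirty_since_cp && now >= s->deadline;
}

/* Called when a checkpoint completes. */
void cp_done(struct cp_state *s)
{
	assert(s != NULL);
	s->dirty_since_cp = 0;
}
```

Arming on the first write only is what bounds how long any block, including an index block, can stay dirty.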
NeilBrown [Mon, 13 Sep 2010 08:15:46 +0000 (18:15 +1000)]
tidy lafs_get_flushable a bit.

There is some code in the wrong place - probably a hangover from a
previous arrangement before we made lafs_is_leaf a function.
No function change here.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 07:58:13 +0000 (17:58 +1000)]
Remove old comment

that issue is already resolved - commit f90959e6f492b6

NeilBrown [Mon, 13 Sep 2010 07:33:54 +0000 (17:33 +1000)]
Log symlink creation

We cannot include it in an update, so just make sure it goes in the
next write cluster.  This will be before any sync or fsync and
roll-forward should pick it up, so all is OK

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 07:13:49 +0000 (17:13 +1000)]
Remove lafs_space_use

it is identical to lafs_space_return, and no-one uses it
anyway.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:44:05 +0000 (15:44 +1000)]
cluster_allocate - minor code rearrangement.

extract common code and general clean up

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:28:08 +0000 (15:28 +1000)]
Get SegRef on parent of phys==0 blocks.

We still need those parents to be segrefed!

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:07:27 +0000 (15:07 +1000)]
Remove pinning of iblock in place of dblock

We now allow both iblock and dblock to be pinned at the same time.
So when pinning the inode dblock, just do it and don't go bothering
the inode iblock.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 04:39:18 +0000 (14:39 +1000)]
Refinements for triggering checkpoint when we are low on space.

This is getting messy but seems to work.

Not sure now on the difference between CleanerBlocks and
EmergencyPending.
I guess the one makes sure the cleaner does what it can and then
triggers a checkpoint.
The other prepares for EmergencyClean to be set after the next
checkpoint.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 12 Sep 2010 23:46:36 +0000 (09:46 +1000)]
Clean up error returns in lafs_reserve_block

We only want to consider EAGAIN if lafs_prealloc returns an error.
When other calls return an error, we want to pass exactly that error
back.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 15 Aug 2010 08:39:28 +0000 (18:39 +1000)]
Never set SegRef on an InoIdx block

An InoIdx block doesn't have a uptodate ->physaddr, that is only
updated in the data block.  So setting SegRef on it is pointless.

Instead, when we find an InoIdx block while setting SegRef, use the
data block instead.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 15 Aug 2010 08:30:36 +0000 (18:30 +1000)]
Wait for a checkpoint before returning ENOSPC

If we seem to run out of space, it is worth waiting for
a checkpoint as that might free up some space.  So add
an extra step to the sequence leading from 'no space' to 'ENOSPC'.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 15 Aug 2010 05:08:49 +0000 (15:08 +1000)]
clean.c - assorted tidy-ups

Change some magic constants into named constants, and
improve some comments.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 12:25:38 +0000 (22:25 +1000)]
Check if dirblock can be orphan before making it one.

Only certain sorts of deletions can make a directory block
into an orphan - check them out before committing resources.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:50:14 +0000 (21:50 +1000)]
flush orphans when renaming to empty directory just like rmdir

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:37:56 +0000 (21:37 +1000)]
Cleaner: add a memory barrier to ensure we see i_size promptly.

There is a possible race that we need a barrier to protect against.
I think... or hope.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:11:12 +0000 (21:11 +1000)]
Cleaner: remove pointless signed compare.

bcnt can never be negative, so don't pretend.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:09:40 +0000 (21:09 +1000)]
Allow cleaner to skip new filesystems.

If we can tell that a subset-filesystem is too new to
match what we find in the write-cluster, we can skip it
quickly.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:05:55 +0000 (21:05 +1000)]
When cleaner finds a block beyond EOF, ignore whole descriptor.

All other blocks in descriptor must also be beyond EOF,
so ignore them all at once.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:04:08 +0000 (21:04 +1000)]
Update trunc_gen whenever we truncate a file to zero.

That allows some minor optimisations in the cleaner to work.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:55:23 +0000 (20:55 +1000)]
cleaner_parse: use truncate number to avoid looking at old inodes.

That is why we have the truncnumber in the cluster head after all.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:52:11 +0000 (20:52 +1000)]
Allow cleaner_parse to request multiple inodes at once.

Currently cleaner_parse stops when it hits an inode that it cannot
load immediately.  This reduces the opportunities for parallelism.

Instead allow up to 16 -EAGAINs from inode lookups.
This requires that we mark headers for inodes which failed, and
always start again from the beginning of the cluster head.
We already reduce the bcnt to 0, so for inodes that can be
found, we won't lookup the blocks twice.

Signed-off-by: NeilBrown <neilb@suse.de>
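The restart-from-the-top scheme in the commit above might look roughly like this. It is a simplified model, not cleaner_parse itself: struct hdr, parse_pass and the stand-in lookup_odd_unloaded are hypothetical; only the limit of 16, the per-header failure mark, and the use of bcnt == 0 to skip finished headers come from the message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PENDING 16	/* limit taken from the commit message */

struct hdr {
	int ino;
	int bcnt;	/* blocks still to examine; 0 once the header is handled */
	int failed;	/* inode lookup returned -EAGAIN on the latest pass */
};

/* Hypothetical lookup: 0 on success, -EAGAIN if the inode is not loaded yet. */
typedef int (*lookup_fn)(int ino);

/* Stand-in lookup for demonstration: odd inode numbers are "not loaded". */
int lookup_odd_unloaded(int ino)
{
	return (ino & 1) ? -EAGAIN : 0;
}

/* One pass from the start of the cluster head; returns the number of
 * inodes still pending, so the caller knows whether to run again. */
int parse_pass(struct hdr *h, int n, lookup_fn lookup)
{
	int pending = 0;

	assert(h != NULL && lookup != NULL);
	for (int i = 0; i < n; i++) {
		if (h[i].bcnt == 0)
			continue;	/* handled on an earlier pass */
		if (lookup(h[i].ino) == -EAGAIN) {
			h[i].failed = 1;
			if (++pending >= MAX_PENDING)
				break;	/* enough loads in flight */
			continue;
		}
		h[i].failed = 0;
		h[i].bcnt = 0;	/* blocks processed; skipped next time */
	}
	return pending;
}
```

A caller would repeat parse_pass once the pending inodes have had time to load, and stop when it returns 0.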
NeilBrown [Sat, 14 Aug 2010 10:34:45 +0000 (20:34 +1000)]
Refactor try_clean

It is a very big function - change it to 3 moderate sized functions.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:08:53 +0000 (20:08 +1000)]
Give to_clean.ss a meaningful name.

It is a flag set when we have a valid segment address.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 08:17:48 +0000 (18:17 +1000)]
Keep track of 'seq' number in cleaner

When reading cluster-heads, track the seq number, both for validation
and, later, for optimisation.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 08:04:47 +0000 (18:04 +1000)]
README update

NeilBrown [Sat, 14 Aug 2010 06:48:39 +0000 (16:48 +1000)]
Clean up interaction between cleaner and checkpoint.

If a checkpoint is wanted, the cleaner shouldn't start any more work.
If the cleaner or segscan is active a checkpoint cannot start, but
when they complete they should wake the checkpoint process.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 06:38:14 +0000 (16:38 +1000)]
Release stray B_Async blocks if we find them.

There could still be some stray index blocks...
maybe fix that later.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:41:59 +0000 (15:41 +1000)]
Combine cleaning and orphan list_heads.

A datablock is very rarely both an orphan and requiring cleaning, so
having two list_heads is a waste.

If it is an orphan it will have full parent linkage and addresses already
so it will be handled promptly and removed from the cleaning list.

So arrange that if a block wants to be both, it is preferentially on
the cleaning list, and when removed from the cleaning list it gets
added back to the pending_orphan list in case it needs processing.

Note that only directory and inode blocks can ever be orphans so some
optimisation of spinlocks is possible.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:16:37 +0000 (15:16 +1000)]
Change when orphan blocks are refcounted.

Count while B_Orphan is set, rather than while on a list.

This gives us some freedom to do different things with the list,
and ensures that we never lose the flag by the block disappearing.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:01:34 +0000 (15:01 +1000)]
Use a new flag to identify blocks being processed by the cleaner.

This will help a future patch which will unify cleaning and orphans
list_heads, and make it clear when a refcount is being held.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 04:43:11 +0000 (14:43 +1000)]
Apply single-exit pattern in try_clean

This removes a lot of duplications

Signed-off-by: NeilBrown <neilb@suse.de>
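For readers unfamiliar with the single-exit pattern named in the commit above, a generic C sketch follows. It is not try_clean; process and its arguments are made up for illustration. Every failure path jumps to one label where cleanup happens once, which is how the pattern removes duplicated release code.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on failure; every path leaves through 'out'. */
int process(size_t n, int fail_midway)
{
	int err = -1;
	char *a = NULL, *b = NULL;

	assert(n > 0);
	a = malloc(n);
	if (!a)
		goto out;
	b = malloc(n);
	if (!b)
		goto out;
	if (fail_midway)
		goto out;	/* simulated error after both allocations */
	err = 0;
out:
	free(b);	/* free(NULL) is a no-op, so partial setup is fine */
	free(a);
	return err;
}
```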
NeilBrown [Sat, 14 Aug 2010 03:36:36 +0000 (13:36 +1000)]
Implement youth decay.

 - After a checkpoint, check if we are close enough to the end
   of youth space to need a decay.
 - when we record a new youth number, un-decay it if the block hasn't
   been decayed yet (and convert endian properly)
 - Change scan_seg to update free_block/free_dev atomically in just
   one place, and do a block worth of decay at that point.
   As part of this, the youth block is only released at one place now.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 02:08:38 +0000 (12:08 +1000)]
Use the right value of creation_age of subsets.

It should be cluster seq number.  This never wraps and
is used to compare against write cluster to trim searches of new
filesets early.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 01:16:04 +0000 (11:16 +1000)]
Ensure lafs_orphan_release doesn't block too much

Make sure orphan->i_mutex isn't held for long
periods, and ensure that orphan_abort doesn't block in
erase_dblock.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:58:57 +0000 (21:58 +1000)]
README update

NeilBrown [Fri, 13 Aug 2010 11:56:49 +0000 (21:56 +1000)]
Tune checkpoint freq by segments, not blocks.

If nothing else does, we should force a checkpoint
every few segments rather than every so-many blocks.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:48:36 +0000 (21:48 +1000)]
Separate thread management from the cleaning.

The thread does a lot more than just 'clean' so don't call it the
'cleaner' any more - just the 'thread' or 'lafsd'.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:32:11 +0000 (21:32 +1000)]
roll: don't update index if block address hasn't changed.

This is quite possible if the block was pushed out in the previous
phase, and could save some work

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 04:06:58 +0000 (14:06 +1000)]
Create a backing_dev_info for a lafs filesystem.

As a lafs filesystem can span multiple devices, we need our own
bdi to handle congestion notification and unplugging.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 09:36:06 +0000 (19:36 +1000)]
Give subset objects their own operations

And make sure they 'stat' like the sort of directory
that can be used to create them.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:43:37 +0000 (16:43 +1000)]
Add missing set_anon_super for subset mounts

oops - missed that.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:26:55 +0000 (16:26 +1000)]
Add test code for subset mounts.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:26:30 +0000 (16:26 +1000)]
Maybe add a bug-on in dirty_dblock

I think we want this bug_on, but it doesn't quite work
yet - leave it as a reminder of '15ca/' in README

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:23:28 +0000 (16:23 +1000)]
Better handling of changing a Directory into an InodeFile

- actually change the type !!!
- make sure the on-disk block gets a proper index update.

Note that the checkpointing before creating things in the FS is
important for this to be correct.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:14:27 +0000 (16:14 +1000)]
Various fixes for lafs_get_subset

- make sure root directory is created if it doesn't exist
- also create inode usage map.
- hold a ref on the inode while the fs is mounted.
- free the sb_key at unmount.
- set s_bdi from the prime_sb

NeilBrown [Fri, 13 Aug 2010 06:09:53 +0000 (16:09 +1000)]
lafs_get_subset: balance locks properly

We drop the mutex outside the 'if' so we must take it outside
the 'if' too - which is safer as well.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:06:06 +0000 (16:06 +1000)]
Fix lafs_put_super for subset mounts.

- we still need a checkpoint - though not a final one - to ensure
  that all dirty blocks from the fileset are written.

- When it isn't the root for a snapshot, we don't want to put
  the root inode - the root inode will be in the main filesystem,
  not in this one.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 04:00:56 +0000 (14:00 +1000)]
Improve lafs_iget_fs

Allow getting inodes in other filesystems.

This isn't quite perfect yet though.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 03:59:10 +0000 (13:59 +1000)]
Set PinPending in flush_data_to_inode.

Should always have this set when we pin a block.
It keeps the block pinned until it is dirtied.
As lafs_pin_block does a refile at the end, it can drop the Pinned
state as soon as it is set.

Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 03:56:57 +0000 (13:56 +1000)]
iget_my_inode - fix for case of ino == NULL

igrab doesn't handle NULL inodes, so we must.

Signed-off-by: NeilBrown <neilb@suse.de>
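The NULL guard described in the commit above can be modelled in user space like this. Both struct inode and igrab_model are stand-ins invented for the sketch (the real igrab lives in the kernel and does more than bump a counter); igrab_or_null is a hypothetical name for the wrapper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct inode; only a counter is modelled. */
struct inode {
	int i_count;
};

/* Stand-in for igrab(): like the real one, it must not be given NULL. */
struct inode *igrab_model(struct inode *ino)
{
	assert(ino != NULL);
	ino->i_count++;
	return ino;
}

/* Wrapper in the spirit of the fix: handle NULL before calling igrab. */
struct inode *igrab_or_null(struct inode *ino)
{
	if (!ino)
		return NULL;
	return igrab_model(ino);
}
```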
NeilBrown [Fri, 13 Aug 2010 03:55:46 +0000 (13:55 +1000)]
lafs_write_end: set new file size correctly.

We were setting the size to the start of the write, not the end!!

Signed-off-by: NeilBrown <neilb@suse.de>
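The bug fixed in the commit above comes down to one line of arithmetic, shown here as a standalone sketch. size_after_write is a hypothetical helper, not the body of lafs_write_end; pos is the offset where the write started and copied is the number of bytes written.

```c
#include <assert.h>
#include <stdint.h>

/* New file size after writing 'copied' bytes at offset 'pos'. */
int64_t size_after_write(int64_t old_size, int64_t pos, int64_t copied)
{
	int64_t end;

	assert(pos >= 0 && copied >= 0);
	end = pos + copied;	/* the end of the write, not its start */
	return end > old_size ? end : old_size;
}
```

Using pos alone in place of end would leave the bytes just written beyond the recorded end of file.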
NeilBrown [Wed, 11 Aug 2010 23:42:56 +0000 (09:42 +1000)]
Discard filesys field from lafs_inode

i_sb can be used just as well.

Signed-off-by: NeilBrown <neilb@suse.de>