git.neil.brown.name Git - LaFS.git/log
NeilBrown [Sat, 2 Oct 2010 04:31:21 +0000 (14:31 +1000)]
Don't use I_ICredit for UnincCredit when cleaning.
ICredit is only to be used when dirtying a block, so any setting of
UnincCredit for cleaning must get the credit from elsewhere. If no
such credit is available, fall back on dirtying the block.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 02:51:50 +0000 (12:51 +1000)]
Minor comments etc
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 02:41:00 +0000 (12:41 +1000)]
Use wait_on_bit / wake_bit to get more wait_queues
... rather than having just one wait queue for all IO.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 01:52:16 +0000 (11:52 +1000)]
README update and comment fix.
Indeed, there is nothing we can do about errors during truncate,
except ignore them.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 01:46:04 +0000 (11:46 +1000)]
use little-endian bit operations for inode usage map.
Must not use host-endian here, so use generic 'le' operations.
Also protect all operations with i_mutex. Even if the bitops
were atomic, we need the locking when punching a hole in the file,
or adding a new block.
Signed-off-by: NeilBrown <neilb@suse.de>
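As a rough illustration of what "little-endian bit operations" means for an on-disk bitmap, here is a minimal userspace sketch. The helper names (`le_bitmap_set`, `le_bitmap_clear`, `le_bitmap_test`) are hypothetical and are not the functions used in LaFS. The sketch also leaves out locking, which the commit says is done with i_mutex.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Little-endian bitmap layout: bit 'nr' lives in byte nr/8 at bit
 * position nr%8, whatever the byte order of the host CPU.  Addressing
 * the map byte by byte keeps the on-disk format portable.
 */
static inline void le_bitmap_set(uint8_t *map, size_t nr)
{
	map[nr >> 3] |= (uint8_t)(1u << (nr & 7));
}

static inline void le_bitmap_clear(uint8_t *map, size_t nr)
{
	map[nr >> 3] &= (uint8_t)~(1u << (nr & 7));
}

static inline int le_bitmap_test(const uint8_t *map, size_t nr)
{
	return (map[nr >> 3] >> (nr & 7)) & 1;
}
```

A host-endian variant that operates on whole machine words would place bit 9 in a different byte on big-endian and little-endian hosts, which is presumably the problem the commit avoids.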
NeilBrown [Sat, 2 Oct 2010 00:59:32 +0000 (10:59 +1000)]
Add proper locking to inode_handle_orphan
When walking the indexblock looking for things to purge
we need to hold the inode private_lock.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 2 Oct 2010 00:42:34 +0000 (10:42 +1000)]
Improve handling of snapshot name.
Name stored in fileset inode is now variable length and empty
on subordinate filesets. Snapshots have space for a name depending
on how much space was allocated when fs was created.
Name is only used for snapshots.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 1 Oct 2010 12:40:13 +0000 (22:40 +1000)]
Fix calculation of table_size
I was confused about which table I was sizing.
This is the table of which there are several in the segusage files.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 1 Oct 2010 12:38:48 +0000 (22:38 +1000)]
Minor formatting improvements.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 24 Sep 2010 01:43:42 +0000 (11:43 +1000)]
Fix bug in sort_block
It doesn't handle 2 blocks the same - which isn't a big deal, but it
is best to have the code 'right'
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 12:09:00 +0000 (22:09 +1000)]
Update filesys mtime
Do this when inode is dirtied. Maybe this isn't the perfect time,
but it is fairly good for now.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 11:59:35 +0000 (21:59 +1000)]
use i_mtime for file-set update time.
No need to have a separate field in the inode.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 11:46:46 +0000 (21:46 +1000)]
Process orphan file during mount.
We need to read the orphan file to set nextfree, and then to
add all the blocks to the orphan list.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:42:24 +0000 (14:42 +1000)]
README update
NeilBrown [Sun, 19 Sep 2010 04:39:31 +0000 (14:39 +1000)]
roll-forward: update youth block for new segments.
Any new segments found during roll-forward need their youth value set.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:23:18 +0000 (14:23 +1000)]
Update youth during seg_apply_all
If we added blocks to a segment in seg_apply_all, make sure the youth
has been updated properly.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:16:00 +0000 (14:16 +1000)]
split out set_youth.
Setting of the youth value is now a separate function.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 19 Sep 2010 04:05:41 +0000 (14:05 +1000)]
Generalise segunused to seg_pop
And use seg_pop more broadly,
and re-arrange some code.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 13:00:48 +0000 (23:00 +1000)]
Introduce DelayYouth
When writing out the accounting blocks we need to not update the youth
block if we happen to start a new segment.
We already do that at unmount time, so generalise it with a new flag.
Signed-Off-By: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:40:54 +0000 (22:40 +1000)]
Use segsum rather than separate dblock for youth in lafs_free_get
As segsum now always has a youthblk for ss==0, we don't need to
separately find a data block.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:25:42 +0000 (22:25 +1000)]
segment.c: allow the else branch to just fall-through
The 'then' branch does a goto, so we don't need the else.
This patch is mostly just re-indenting.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 12:24:36 +0000 (22:24 +1000)]
segments: swap then/else branches.
Swap 'then' and 'else' branches, inverting condition
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 18 Sep 2010 06:38:46 +0000 (16:38 +1000)]
Factor out seg_add_new
We had multiple places that add entries to the segtracker. Now we
only have one.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 17 Sep 2010 06:00:15 +0000 (16:00 +1000)]
add_block_address: be careful with physaddr == 0
An address of '0' is not consecutive with an address of '1' and while
it is very unlikely to ever be a problem, make sure we don't try to
combine those addresses into a range in the uninc table.
Also remove a comment about a possible problem that doesn't seem to be
a real problem.
Signed-off-by: NeilBrown <neilb@suse.de>
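The check described above might look something like the following sketch. It assumes that a physical address of 0 means "no address" (the commit does not say this explicitly), and `addr_extends_range` is a hypothetical name, not the code in add_block_address.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: may 'next' extend a range whose last address is
 * 'last'?  Address 0 is assumed to be special, so it never takes part
 * in a range even though 0 and 1 are numerically adjacent.
 */
static bool addr_extends_range(uint64_t last, uint64_t next)
{
	if (last == 0 || next == 0)
		return false;
	return next == last + 1;
}
```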
NeilBrown [Fri, 17 Sep 2010 05:47:34 +0000 (15:47 +1000)]
Simplify test in lafs_refile
If index block has non-zero pending_cnt, then it must be
dirty or realloc, so we can drop a test here.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 17 Sep 2010 05:38:10 +0000 (15:38 +1000)]
lafs_refile: lots of locking and refcount changes.
See the comments added to lafs.h for more detail.
There are really several changes and fixes in here but I don't think
it is worth the effort to separate them all out.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 05:20:36 +0000 (15:20 +1000)]
re-indent lafs_file.
This patch is simply a re-indent. No code change.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 05:19:50 +0000 (15:19 +1000)]
lafs_refile: group all functionality that requires refcnt==0
There are multiple silly (racy) tests on refcnt==dec. Combine all
these tests into a single 'if' which is run only when the refcnt
reaches zero.
This patch just changes the code without fixing the indentation. That
makes it easier to review.
Signed-off-by: NeilBrown <neilb@suse.de>
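The idea of grouping everything that depends on the refcount reaching zero can be sketched in userspace with C11 atomics. This is only an illustration: `struct blk`, `blk_put` and the `on_leaf_list` field are hypothetical, and the real lafs_refile also takes locks that are not modelled here.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct blk {			/* hypothetical stand-in for a block */
	atomic_int refcnt;
	bool on_leaf_list;	/* placeholder for "zero-refcount" state */
};

/*
 * Drop 'dec' references.  The result of the single atomic subtraction
 * decides whether we reached zero, so all work that requires
 * refcnt == 0 sits in one branch rather than in several separate
 * (and racy) comparisons.  Returns true if the count reached zero.
 */
static bool blk_put(struct blk *b, int dec)
{
	int old = atomic_fetch_sub(&b->refcnt, dec);

	if (old != dec)
		return false;	/* other references remain */

	/* refcnt is now zero: do all zero-only work here */
	b->on_leaf_list = false;
	return true;
}
```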
NeilBrown [Wed, 15 Sep 2010 04:57:02 +0000 (14:57 +1000)]
lafs_refile: factor out check for unpinnable data blocks.
Data blocks can be unpinned while refcounts are still present.
Factor that case out into a stand-alone test.
We still test for datablocks being unpinnable in the 'refcnt==0'
case as PinPending might have only just been cleared.
This leaves a large bulk of tests all of which depend on
refcnt reaching zero which can be improved in a subsequent patch.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 04:34:23 +0000 (14:34 +1000)]
Small lafs_refile clean up
We don't need the 'onlru' variable any more.
And we shouldn't really delete data blocks from the lru at that
point, as they could be on a cluster or io-pending list.
Index blocks are safe as their refcount is zero, so they can only
be on the leaf lru.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 03:58:39 +0000 (13:58 +1000)]
Don't hold counted reference while on leafs list.
Doing so makes the refcount calculations in lafs_refile messy.
We don't drop a block while it is dirty etc anyway so there is no loss
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 03:56:02 +0000 (13:56 +1000)]
Hold reference while erasing blocks in lafs_invalidate_page
else we sometimes violate assertions about having non-zero refcount
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 15 Sep 2010 02:57:07 +0000 (12:57 +1000)]
Take a ref on dblock whenever the InoIdx block has a ->parent link.
This will be needed to ensure the dblock stays referenced when we stop
holding a reference to blocks on the 'leafs' lists.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 07:01:21 +0000 (17:01 +1000)]
getiref_locked fixes
The #defines were a bit wrong here, so it would not have compiled
properly without DEBUG_REF
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 05:44:41 +0000 (15:44 +1000)]
Change readpage to use ->chain instead of ->lru
->lru is rather overused, and ->chain is certainly never used on
non-B_Valid blocks, so convert readpage to use chain instead.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 04:22:31 +0000 (14:22 +1000)]
lafs_refile - split out set_lru
Setting the lru is not dependent on other changes so it can be moved to
the top and outside the lock.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 03:00:20 +0000 (13:00 +1000)]
lafs_refile: separate out consistency checking
lafs_refile is too big and clumsy. Time to tidy up.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 01:50:34 +0000 (11:50 +1000)]
README update and some comment fixes.
Still working through the TODO list.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 14 Sep 2010 01:10:25 +0000 (11:10 +1000)]
Make sure checkpoint happens after a timeout.
If anything has been written since last check point, ensure another
checkpoint happens within 30 seconds.
This is partly to ensure that index blocks don't remain dirty forever.
Signed-off-by: NeilBrown <neilb@suse.de>
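A minimal sketch of the timeout rule, assuming times are tracked in whole seconds. The function name, its parameters and the constant name are hypothetical; only the 30 second figure comes from the commit message.

```c
#include <stdbool.h>

#define CHECKPOINT_TIMEOUT_SECS 30	/* figure from the commit message */

/*
 * Hypothetical predicate: a checkpoint is due once something has been
 * written since the last checkpoint and at least 30 seconds have
 * passed since that first write.
 */
static bool checkpoint_due(bool written_since_checkpoint,
			   long first_write_secs, long now_secs)
{
	if (!written_since_checkpoint)
		return false;
	return now_secs - first_write_secs >= CHECKPOINT_TIMEOUT_SECS;
}
```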
NeilBrown [Mon, 13 Sep 2010 08:15:46 +0000 (18:15 +1000)]
tidy lafs_get_flushable a bit.
There is some code in the wrong place - probably a hangover from a
previous arrangement before we made lafs_is_leaf a function.
No function change here.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 07:58:13 +0000 (17:58 +1000)]
Remove old comment
that issue is already resolved - commit
f90959e6f492b6
NeilBrown [Mon, 13 Sep 2010 07:33:54 +0000 (17:33 +1000)]
Log symlink creation
We cannot include it in an update, so just make sure it goes in the
next write cluster. This will be before any sync or fsync and
roll-forward should pick it up, so all is OK
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 07:13:49 +0000 (17:13 +1000)]
Remove lafs_space_use
it is identical to lafs_space_return, and no-one uses it
anyway.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:44:05 +0000 (15:44 +1000)]
cluster_allocate - minor code rearrangement.
extract common code and general clean up
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:28:08 +0000 (15:28 +1000)]
Get SegRef on parent of phys==0 blocks.
We still need those parents to be segrefed!
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 05:07:27 +0000 (15:07 +1000)]
Remove pinning of iblock in place of dblock
We now allow both iblock and dblock to be pinned at the same time.
So when pinning the inode dblock, just do it and don't go bothering
the inode iblock.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 13 Sep 2010 04:39:18 +0000 (14:39 +1000)]
Refinements for triggering checkpoint when we are low on space.
This is getting messy but seems to work.
Not sure now on the difference between CleanerBlocks and
EmergencyPending.
I guess the one makes sure the cleaner does what it can and then
triggers a checkpoint.
The other prepares for EmergencyClean to be set after the next
checkpoint.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 12 Sep 2010 23:46:36 +0000 (09:46 +1000)]
Clean up error returns in lafs_reserve_block
We only want to consider EAGAIN if lafs_prealloc returns an error.
When other calls return an error, we want to pass exactly that error
back.
Signed-off-by: NeilBrown <neilb@suse.de>
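One way to read the rule above is sketched below. `reserve_status` and its parameters are hypothetical, and the `may_retry` flag is an assumption about when EAGAIN would be chosen. The commit only says that EAGAIN is considered for prealloc failures and that other errors are passed back unchanged.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical combination of results inside a reserve function.
 * 'setup_err' is the result of any call other than the preallocation,
 * 'prealloc_err' is the result of the preallocation itself.
 */
static int reserve_status(int setup_err, int prealloc_err, bool may_retry)
{
	if (setup_err < 0)
		return setup_err;	/* pass exactly that error back */
	if (prealloc_err < 0)		/* only here is EAGAIN considered */
		return may_retry ? -EAGAIN : prealloc_err;
	return 0;
}
```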
NeilBrown [Sun, 15 Aug 2010 08:39:28 +0000 (18:39 +1000)]
Never set SegRef on an InoIdx block
An InoIdx block doesn't have an up-to-date ->physaddr, that is only
updated in the data block. So setting SegRef on it is pointless.
Instead, when we find an InoIdx block while setting SegRef, use the
data block instead.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 15 Aug 2010 08:30:36 +0000 (18:30 +1000)]
Wait for a checkpoint before returning ENOSPC
If we seem to run out of space, it is worth waiting for
a checkpoint as that might free up some space. So add
an extra step to the sequence leading from 'no space' to 'ENOSPC'.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sun, 15 Aug 2010 05:08:49 +0000 (15:08 +1000)]
clean.c - assorted tidy-ups
Change some magic constants into named constants, and
improve some comments.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 12:25:38 +0000 (22:25 +1000)]
Check if dirblock can be orphan before making it one.
Only certain sorts of deletions can make a directory block
into an orphan - check them out before committing resources.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:50:14 +0000 (21:50 +1000)]
flush orphans when renaming to empty directory just like rmdir
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:37:56 +0000 (21:37 +1000)]
Cleaner: add a memory barrier to ensure we see i_size promptly.
There is a possible race that we need a barrier to protect against.
I think... or hope.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:11:12 +0000 (21:11 +1000)]
Cleaner: remove pointless signed compare.
bcnt can never be negative, so don't pretend.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:09:40 +0000 (21:09 +1000)]
Allow cleaner to skip new filesystems.
If we can tell that a subset-filesystem is too new to
match what we find in the write-cluster, we can skip it
quickly.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:05:55 +0000 (21:05 +1000)]
When cleaner finds a block beyond EOF, ignore whole descriptor.
All other blocks in descriptor must also be beyond EOF,
so ignore them all at once.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 11:04:08 +0000 (21:04 +1000)]
Update trunc_gen whenever we truncate a file to zero.
That allows some minor optimisations in the cleaner to work.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:55:23 +0000 (20:55 +1000)]
cleaner_parse: use truncate number to avoid looking at old inodes.
That is why we have the truncnumber in the cluster head after all.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:52:11 +0000 (20:52 +1000)]
Allow cleaner_parse to request multiple inodes at once.
Currently cleaner_parse stops when it hits an inode that it cannot
load immediately. This reduces the opportunities for parallelism.
Instead allow up to 16 -EAGAINs from inode lookups.
This requires that we mark headers for inodes which failed, and
always start again from the beginning of the cluster head.
We already reduce the bcnt to 0, so for inodes that can be
found, we won't lookup the blocks twice.
Signed-off-by: NeilBrown <neilb@suse.de>
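The restart-from-the-beginning scheme can be sketched roughly as follows. Everything here apart from the limit of 16 and the use of bcnt == 0 to mark finished headers is an assumption: `struct hdr`, `parse_pass`, the `failed` field and the demo lookup stub are hypothetical and much simpler than the real cleaner_parse.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_PENDING 16		/* limit from the commit message */

struct hdr {			/* hypothetical per-inode header */
	int bcnt;		/* blocks still to look up; 0 when done */
	int failed;		/* set when the inode lookup said -EAGAIN */
};

typedef int (*lookup_fn)(size_t idx);	/* returns 0 or -EAGAIN */

/* Demo stub: pretend inodes at odd positions are not loaded yet. */
static int lookup_odd_busy(size_t idx)
{
	return (idx & 1) ? -EAGAIN : 0;
}

/*
 * One pass over the cluster head, always starting from the beginning.
 * Headers already handled have bcnt == 0 and are skipped, so their
 * blocks are not looked up twice.  Returns how many headers are still
 * waiting for their inode.
 */
static int parse_pass(struct hdr *h, size_t n, lookup_fn lookup)
{
	int pending = 0;

	for (size_t i = 0; i < n; i++) {
		if (h[i].bcnt == 0)
			continue;
		if (lookup(i) == -EAGAIN) {
			h[i].failed = 1;
			if (++pending >= MAX_PENDING)
				break;	/* enough requests in flight */
			continue;
		}
		h[i].failed = 0;
		h[i].bcnt = 0;	/* blocks handled */
	}
	return pending;
}
```

Calling `parse_pass` again once the outstanding inodes have loaded would then pick up only the headers still marked as failed.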
NeilBrown [Sat, 14 Aug 2010 10:34:45 +0000 (20:34 +1000)]
Refactor try_clean
It is a very big function - change it to 3 moderate sized functions.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 10:08:53 +0000 (20:08 +1000)]
Give to_clean.ss a meaningful name.
It is a flag set when we have a valid segment address.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 08:17:48 +0000 (18:17 +1000)]
Keep track of 'seq' number in cleaner
When reading cluster-heads, track the seq number, both for validation
and, later, for optimisation.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 08:04:47 +0000 (18:04 +1000)]
README update
NeilBrown [Sat, 14 Aug 2010 06:48:39 +0000 (16:48 +1000)]
Clean up interaction between cleaner and checkpoint.
If a checkpoint is wanted, the cleaner shouldn't start any more work.
If the cleaner or segscan is active a checkpoint cannot start, but
when they complete they should wake the checkpoint process.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 06:38:14 +0000 (16:38 +1000)]
Release stray B_Async blocks if we find them.
There could still be some stray index blocks...
maybe fix that later.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:41:59 +0000 (15:41 +1000)]
Combine cleaning and orphan list_heads.
A datablock is very rarely both an orphan and requiring cleaning, so
having two list_heads is a waste.
If it is an orphan it will have full parent linkage and addresses already
so it will be handled promptly and removed from the cleaning list.
So arrange that if a block wants to be both, it is preferentially on
the cleaning list, and when removed from the cleaning list it gets
added back to the pending_orphan list in case it needs processing.
Note that only directory and inode blocks can ever be orphans so some
optimisation of spinlocks is possible.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:16:37 +0000 (15:16 +1000)]
Change when orphan blocks are refcounted.
Count while B_Orphan is set, rather than while on a list.
This gives us some freedom to do different things with the list,
and ensures that we never lose the flag by the block disappearing.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 05:01:34 +0000 (15:01 +1000)]
Use a new flag to identify blocks being processed by the cleaner.
This will help a future patch which will unify cleaning and orphans
list_heads, and make it clear when a refcount is being held.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 04:43:11 +0000 (14:43 +1000)]
Apply single-exit pattern in try_clean
This removes a lot of duplications
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 03:36:36 +0000 (13:36 +1000)]
Implement youth decay.
- After a checkpoint, check if we are close enough to the end
of youth space to need a decay.
- when we record a new youth number, un-decay it if the block hasn't
been decayed yet (and convert endian properly)
- Change scan_seg to update free_block/free_dev atomically in just
one place, and do a block worth of decay at that point.
As part of this, the youth block is only released at one place now.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 02:08:38 +0000 (12:08 +1000)]
Use the right value of creation_age of subsets.
It should be cluster seq number. This never wraps and
is used to compare against write cluster to trim searches of new
filesets early.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Sat, 14 Aug 2010 01:16:04 +0000 (11:16 +1000)]
Ensure lafs_orphan_release doesn't block too much
Make sure orphan->i_mutex isn't held for long
periods, and ensure that orphan_abort doesn't block in
erase_dblock.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:58:57 +0000 (21:58 +1000)]
README update
NeilBrown [Fri, 13 Aug 2010 11:56:49 +0000 (21:56 +1000)]
Tune checkpoint freq by segments, not blocks.
If nothing else does, we should force a checkpoint
every few segments rather than every so-many blocks.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:48:36 +0000 (21:48 +1000)]
Separate thread management from the cleaning.
The thread does a lot more than just 'clean' so don't call it the
'cleaner' any more - just the 'thread' or 'lafsd'.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 11:32:11 +0000 (21:32 +1000)]
roll: don't update index if block address hasn't changed.
This is quite possible if the block was pushed out in the previous
phase, and could save some work
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 04:06:58 +0000 (14:06 +1000)]
Create a backing_dev_info for a lafs filesystem.
As a lafs filesystem can span multiple devices, we need our own
bdi to handle congestion notification and unplugging.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 09:36:06 +0000 (19:36 +1000)]
Give subset objects their own operations
And make sure they 'stat' like the sort of directory
that can be used to create them.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:43:37 +0000 (16:43 +1000)]
Add missing set_anon_super for subset mounts
oops - missed that.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:26:55 +0000 (16:26 +1000)]
Add test code for subset mounts.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:26:30 +0000 (16:26 +1000)]
Maybe add a bug-on in dirty_dblock
I think we want this bug_on, but it doesn't quite work
yet - leave it as a reminder of '15ca/' in README
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:23:28 +0000 (16:23 +1000)]
Better handling of changing a Directory into an InodeFile
- actually change the type !!!
- make sure the on-disk block gets a proper index update.
Note that the checkpointing before creating things in the FS is
important for this to be correct.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:14:27 +0000 (16:14 +1000)]
Various fixes for lafs_get_subset
- make sure root directory is created if it doesn't exist
- also create inode usage map.
- hold a ref on the inode while the fs is mounted.
- free the sb_key at unmount.
- set s_bdi from the prime_sb
NeilBrown [Fri, 13 Aug 2010 06:09:53 +0000 (16:09 +1000)]
lafs_get_subset: balance locks properly
We drop the mutex outside the 'if' so we must take it outside
the 'if' too - which is safer as well.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 06:06:06 +0000 (16:06 +1000)]
Fix lafs_put_super for subset mounts.
- we still need a checkpoint - though not a final one - to ensure
that all dirty blocks from the fileset are written.
- If it isn't the root for a snapshot, we don't want to put
the root inode - the root inode will be in the main filesystem,
not in this one.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 04:00:56 +0000 (14:00 +1000)]
Improve lafs_iget_fs
Allow getting inodes in other filesystem.
This isn't quite perfect yet though.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 03:59:10 +0000 (13:59 +1000)]
Set PinPending in flush_data_to_inode.
Should always have this set when we pin a block.
It keeps the block pinned until it is dirtied.
As lafs_pin_block does a refile at the end, it can drop the Pinned
state as soon as it is set.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 03:56:57 +0000 (13:56 +1000)]
iget_my_inode - fix for case of ino == NULL
igrab doesn't handle NULL inodes, so we must.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Fri, 13 Aug 2010 03:55:46 +0000 (13:55 +1000)]
lafs_write_end: set new file size correctly.
We were setting the size to the start of the write, not the end!!
Signed-off-by: NeilBrown <neilb@suse.de>
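The bug described above amounts to using the write position rather than the position plus the bytes copied. A minimal sketch of the intended calculation follows. `size_after_write` is a hypothetical helper, not the code in lafs_write_end.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the file size after a write of 'copied' bytes
 * at offset 'pos'.  The size grows to the END of the write
 * (pos + copied) and is never reduced by a write inside the file.
 */
static int64_t size_after_write(int64_t old_size, int64_t pos,
				int64_t copied)
{
	int64_t end = pos + copied;

	return end > old_size ? end : old_size;
}
```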
NeilBrown [Wed, 11 Aug 2010 23:42:56 +0000 (09:42 +1000)]
Discard filesys field from lafs_inode
i_sb can be used just as well.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 11 Aug 2010 22:57:55 +0000 (08:57 +1000)]
Change filesys arg of lafs_new_inode to struct super_block
It is more direct in most cases to use a super_block rather
than a filesys inode.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 Aug 2010 05:31:12 +0000 (15:31 +1000)]
Make lafs_new_inode work when given an explicit inode number.
In this case imni->mb isn't set, so we have to cope with that.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Tue, 10 Aug 2010 05:16:03 +0000 (15:16 +1000)]
Choose_free_inum: never return number below 16
They are for internal use.
Also fix a missing B_PinPending setting.
Signed-off-by: NeilBrown <neilb@suse.de>
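A sketch of a free-number search that skips the reserved range. It assumes a byte-addressed bitmap in which a set bit means "in use", which may not match the polarity of the real inode usage map. `first_free_inum` and `FIRST_USER_INUM` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_USER_INUM 16	/* numbers below this are for internal use */

/*
 * Hypothetical search: return the lowest inode number >= 16 whose bit
 * is clear in 'map' (set bit assumed to mean "in use"), or -1 if no
 * number below 'nbits' is free.
 */
static long first_free_inum(const uint8_t *map, size_t nbits)
{
	for (size_t nr = FIRST_USER_INUM; nr < nbits; nr++)
		if (!((map[nr >> 3] >> (nr & 7)) & 1))
			return (long)nr;
	return -1;
}
```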
NeilBrown [Mon, 9 Aug 2010 11:17:11 +0000 (21:17 +1000)]
Add 'filesys' arg to lafs_new_inode
This allows it to be called with dir == NULL - when creating an inode
that isn't in a directory.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 11:06:34 +0000 (21:06 +1000)]
Make dir arg to lafs_new_inode optional.
After all, some inodes will be created without a directory (root and
other special inodes).
Make inodbp optional too.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 10:43:15 +0000 (20:43 +1000)]
Revise handling of filesystem inconsistency: nlink == 0
Our handling wasn't really correct, and made the less-safe
assumption.
So change it to simply increment the linkcount.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 10:26:25 +0000 (20:26 +1000)]
checkpoint must wait for both dblock and iblock of root to change phase.
Only waiting for iblock isn't enough - dblock might still be in the
old phase, which gets rather confusing.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 10:24:20 +0000 (20:24 +1000)]
Unpin data blocks from previous phase before allowing them to be dirty.
While checkpointing will unpin PinPending blocks, it might not
manage to do it before the block gets Dirtied again.
So before we Pin the block - which is a required precursor to dirtying
them, unpin the block.
Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Mon, 9 Aug 2010 10:20:16 +0000 (20:20 +1000)]
Be more careful about waking cleaner in cluster_end_io.
If done was set as well as wake, we didn't wake the cleaner
so *FlushNeeded wouldn't necessarily be effective.
Signed-off-by: NeilBrown <neilb@suse.de>