Thinking through reservation for writeout ... again.

We would like to be able to deny or delay all
updates in case space is not available.
In general, 'deny' is appropriate if the block in the
file hasn't been allocated yet, and 'delay' is appropriate
if the block has been allocated, as writing will
eventually free something up.
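A rough sketch of that deny/delay rule, just to pin it down. The names
(Verdict, on_update) and the two boolean inputs are made up for
illustration; the real decision would be driven by the filesystem's own
space accounting.

```python
from enum import Enum


class Verdict(Enum):
    PROCEED = "proceed"
    DENY = "deny"
    DELAY = "delay"


def on_update(block_allocated: bool, space_available: bool) -> Verdict:
    """Decide what to do with an update to one block (hypothetical helper)."""
    if space_available:
        return Verdict.PROCEED
    # No space right now. If the block already has storage, writing will
    # eventually free something up, so waiting is worthwhile. If it has
    # never been allocated, waiting gains nothing, so refuse.
    return Verdict.DELAY if block_allocated else Verdict.DENY
```

So an update to a never-allocated block on a full filesystem is refused,
while a rewrite of an existing block just waits.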
Sometimes we need to reserve the right to write
a block quite a lot later. We might reserve that
right several times. Each time might happen in
a different checkpoint and the space is used multiple
times. Further: index blocks may need to be written
in order to write out a block. So for each reservation
we take against a block, we really don't know
how many storage blocks might be needed. This

Every write-reservation we take on a block
preallocates 4 storage blocks, 2 each for
this phase and next phase, one for the block
and one to contribute towards indexes.
Note that this is index growth. Pre-existing
index blocks need some preallocation of their
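The arithmetic for one write-reservation, as a minimal sketch. SpacePool
and reserve_write are hypothetical names, and the pool is a plain counter
standing in for whatever the real free-space accounting is.

```python
from dataclasses import dataclass

# Two phases are covered by each reservation: this one and the next.
PHASES = ("this", "next")
# Per phase: one storage block for the block itself, one towards indexes.
PER_PHASE = {"data": 1, "index": 1}

BLOCKS_PER_RESERVATION = len(PHASES) * sum(PER_PHASE.values())  # 2 * 2 = 4


@dataclass
class SpacePool:
    """Toy model of free space; not the real accounting structure."""
    free_blocks: int
    reserved: int = 0

    def reserve_write(self) -> bool:
        """Take one write-reservation, or fail without changing anything."""
        if self.free_blocks < BLOCKS_PER_RESERVATION:
            return False
        self.free_blocks -= BLOCKS_PER_RESERVATION
        self.reserved += BLOCKS_PER_RESERVATION
        return True
```

With 10 free blocks, two reservations succeed (8 blocks held back) and a
third fails, even though 2 blocks are still nominally free.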
When a block passes through a phase change and is written
out, we know there is space to write out once more,
but we might need to write out multiple times.
We force allocation and essentially block all further
new data writes until some cleaning has happened.

Writes that happen through system calls such as directory updates and
'write' can easily be blocked until enough space is available
(except for index block space).

Writes due to dirty mmapped memory are a little harder.

When a write request is made for a block that cannot
immediately be allocated more space, we unmap it,
thus requiring nopage to bring it back in.
We block in the nopage operation, so when the filesystem
is very full, we might be pausing a lot waiting for
checkpoints and cleaning to happen.
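The mmap path above, modelled as a tiny state machine. This is only a
sketch: MappedBlock and its methods are invented names, and where the
real nopage operation would sleep, this single-threaded model returns
"wait" instead.

```python
class MappedBlock:
    """Toy model of one mmapped block; not kernel code."""

    def __init__(self) -> None:
        self.mapped = True

    def write_request(self, can_allocate: bool) -> None:
        """A write request arrived for this block.

        If more space cannot be allocated immediately, unmap the block so
        that the next access has to go through the fault path.
        """
        if not can_allocate:
            self.mapped = False

    def fault(self, can_allocate: bool) -> str:
        """Stand-in for nopage: 'mapped' on success, 'wait' if it would block."""
        if self.mapped:
            return "mapped"
        if not can_allocate:
            # The real operation would block here until a checkpoint or
            # cleaning frees space.
            return "wait"
        self.mapped = True
        return "mapped"
```

A caller that keeps getting "wait" corresponds to a process pausing in the
fault handler while the filesystem is very full.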
Question: how much waiting is allowed between allocating

We might have to read from disk.
We might kmalloc which could mean writing some things to disk.
If we could limit this to one phase change, we could be more
comfortable about all the preallocation.
But how would we enforce that? The second phase change
would have to wait for allocations in the previous to be used.

Biggest problem seems to be index blocks, which might need to be
written every checkpoint.

I guess we do just block everything until we can allocate

So actually doing a write either refreshes the allocation,
or unmaps the block and makes sure any future attempt
to map or write it blocks.
But some reservations might have already happened.
We need to allow them to commit. When?
We need to know precisely what is being waited for,
and ensure that once we hit problems things start to fail
so we back out or commit quickly.

--------------------------------
We don't have multiple locks on a datablock. It is either locked for

If we want a datablock to be written in a particular checkpoint

If that fails, either give up, or

We give up if there are any new-block allocations. We only retry
if all blocks we try to write have been allocated space previously.

If we don't care when a datablock is written we
Try to lock outside the checkpoint, blocking if appropriate
On success we allow writes to happen, either via syscall

When a flush or whatever finally writes the block we either
relock or unmap the block thus blocking future writes.
As 'locking' reserves enough space for 2 writes we don't have to
unmap before writing unless a previous refill failed.
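One reading of the "2 writes per lock" rule, as a sketch. LockedBlock,
credits and refill_failed are hypothetical names, and the model ignores
how the block gets mapped again afterwards.

```python
class LockedBlock:
    """Toy model of a locked datablock holding space for two writes."""

    WRITES_PER_LOCK = 2  # locking reserves enough space for 2 writes

    def __init__(self) -> None:
        self.credits = self.WRITES_PER_LOCK
        self.refill_failed = False
        self.mapped = True

    def writeout(self, can_refill: bool) -> None:
        """A flush (or whatever) writes the block out once."""
        if self.refill_failed:
            # The previous relock failed, so this write uses the last
            # reserved space: unmap first so further writes must block.
            self.mapped = False
        self.credits -= 1
        if can_refill:
            # Relock: top the reservation back up to two writes.
            self.credits = self.WRITES_PER_LOCK
            self.refill_failed = False
        else:
            self.refill_failed = True
```

The point of the second credit is that a single failed refill does not
force an unmap; only the write after a failed refill does.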
(*) If we take a snapshot then updating existing blocks may not be
possible - no amount of cleaning will free up space until a snapshot
is dropped. I guess that is primarily a sysadmin problem. If space
runs out that badly, snapshots must be dropped before progress is