2 Credits allow us to be sure that we can write out all dirty blocks
3 without exhausting the log.
5 Before a block can be marked as 'Dirty' it must hold two credits, one
6 for itself and one to assist with writing index blocks (which may
8 Further, every ancestor must also have the same two credits.
10 If a data block is not Pinned to the current phase, then it and
11 every ancestor must also have a pair of credits for the next phase.
12 If these credits are hard to come by, we can still Dirty a data block
13 by pinning it. This requires a checkpoint lock.
15 Each block has 5 credit flags.
16 2 for this phase (Credit and ICredit)
17 2 for next phase (NCredit and NICredit)
18 1 to hold the ICredit between allocation and incorporation.
20 When we might want to dirty a data block (e.g. PinPending or just
21 before actually dirtying it), we allocate the needed credits. If
22 credits aren't available (i.e. the filesystem is nearly full, or needs
23 cleaning) that might block (if the update is expected to free up space
24 and we are eating into the first reserve) or it might fail (e.g. if we
25 are allocating new space).
27 When we release a data block, we free all the credits. Keeping proper
28 track of all credits is important.
30 When we write a block, the Credit is used to pay for that writeout.
31 The ICredit is moved to UnincCredit until incorporation.
32 If the block is dirtied again, before incorporation, two new credits
33 will be needed. If another writeout happens while UnincCredit is
34 still set, the ICredit is simply returned.
36 When incorporation happens, if no splitting is needed, the UnincCredits
37 from all new blocks are simply returned.
38 If splitting is needed, the UnincCredits are used to fill the Credit
39 requirements of any new block. However there might not be enough.
41 In the worst case, a split is required with just one UnincCredit.
42 This can provide the required Credit, but no ICredit.
43 In this case we need to steal Next-phase credits from lower levels in
44 the tree by pinning data blocks to the current phase....
46 If everything below is pinned to the current phase, we can take our
47 own next-phase credits.... unless we are on a phase-change.
49 Note: Old-phase index blocks are never incorporated if they have
52 Hmmm.. ICredit only needed on index blocks which do not have room for
54 Thus if there are 3 or more children, we get enough credits from them.
55 If there is only 1 or 2, we won't split so don't need ICredit...
59 Index block marked dirty when first child allocated.
60 So it needs to have Credit when not dirty/realloc.
64 1/ All incorporations work.
66 Inherited credits propagate up filling gaps.
69 2/ Incorporation cause an h-split
71 worst case: We have an index block which is full of large extents.
72 One block in the middle of one extent is relocated. This one
73 block caries an ICredit.
75 The index block has an ICredit too, so there are two ICredits.
76 One dirties the new block. One is made available to parent.
77 This pattern propagates up.
79 So: when a set of children are incorporated, at least half of them
80 must have ICredits. These are used to split a block.
81 If a double-split is needed, there must be plenty of credits.
84 3/ Incorporation requires a v-split
86 This is when the inode is full and we need to create an
88 This block is dirtied by the ICredit if the inode block.
89 Which will be filled in again when incorporation happens below.
91 Worst case is that inode is full of large extents. A single block is
92 added. The vsplit happens.
93 The ICredit from the single block becomes the ICredit for the inode.
97 If we incorporate a given index block multiple times in one checkpoint
98 it will lose all it's credits, as we only refresh them when new blocks are
99 dirtied, or on phase change..... no, that's not right.
101 When we allocate any block, we pin all the parents.
102 When we unpin a parent (or re-pin to next phase) we claim the N* credits.
103 If we cannot refill the N* credits, we pin children to the current phase.
105 1/ Steal their N* credits which may give us enough credit to continue, or
106 2/ ensure that after this block is written, there will be no
114 Ok, maybe I have it now... Let's be as detailed as possible.
116 Every data block has 2 bits signifying pre-allocation. Each bit
117 reflects one credit (block of space). When the block is dirtied
118 or locked, one bit is cleared to reflect the space about to be
119 used. The other is moved to 'Uninccredit'.
121 When Uninc is cleared by incorporation, Unincredit is available
122 for index splitting etc. Note that by this time the block could
123 be pre-allocated or dirty again, which means more blocks
126 Every index block has 4 bits signifying pre-allocation credits.
127 There are two for each phase. One is to write out the block. One is to
128 allow for splitting/incorporation. The second is not required
129 when the index block has room for 3 new allocations (30 bytes of
132 The primary rule is that if a datablock is locked or dirty, then
133 all index blocks up to the root must have at least preallocation
134 in the current phase, and two if there may not be room for 3 more
138 Before a datablocks can be locked or dirtied, it must have
139 preallocation, and all indexblocks above it must have
140 preallocation in both phases. If not, either fail or block.
142 Before an out-of-phase leaf index block can have incorporation,
143 either it must have no dirty child, or it and all index blocks
144 above it must be have all prealloc bits set. i.e. if their
145 is a failure setting some unset bit, then flush all children.
146 We know new children cannot become dirty until the bits are set.
148 Before any in-phase index block can have incorporation, it and
149 all parents must have both bits set. If there is a failure
150 setting any bits, we wait for a phase-change (which probably
153 When incorporation happens there is at least one block provided by
154 every two being incorporated, and each index block being
155 incorporated into has at least one block, or two if it doesn't have
156 room for 4 more indicies.
158 If all indicies fit, then the one that the block owns is used to
159 write it, and the one from the at-least-one child is made available
160 to the parent. Any others are discarded (in the out-of-phase case).
162 If a split is required the next-phase bits are shared. If there
163 aren't 2, then there are no children, and none are needed. If
164 there are, then they go one-each as both blocks will have some
167 If a split is required and there are 4 or more children, then there
168 is one block that the iblock owns, and 2 or more from the children.
169 These are used to write out the two blocks and pass one up. If
170 there are more children incorporated, their bits are given to the
171 next phase as they might then be needed, and possibly passed up as
174 If a split is required and there are 3 or fewer children, then
175 there are two blocks of space that the iblock owns, and 1 or more
176 from the children. These are enough to write out the two blocks and
177 pass 1 set bits on allset bits on allup.
179 When an inode 'splits' (vertically), we leave all bits with the
180 lower block as the inode will be written using the space allocated
181 to the data block. However when there is excess credits after an
182 incorporation, they should be given to the parent incase another
183 depth increase is needed.
185 So, what bits do we need:
186 - 1 - does Uninc carry a credit - sometimes it won't. - B_UnincCredit
187 - 2 - this-phase credits B_Credit B_ICredit
188 - 2 - next-phase credits B_NCredit B_NICredit
190 When we dirty or lock a block, the credits from the appropriate
191 phase move into the dirty/lock bit and the uninc_credit bit.
192 On writing, the dirty bit is cleared
193 On incorporation, the uninc credit bit is used.
195 The next-phase credit bits are never used for data blocks.
197 --------------------------------
199 What causes us to flush everything at a phase change when short of
202 When an index block changes phase, if we cannot allocate credits for
203 the new NextPhase, we Pin all data blocks below, so at the next
204 phase change everything gets written out.
205 --------------------------------
207 Credit and ICredit are set before block is dirty.
208 They are moved to Dirty and UnincCredit when block is dirtied
209 These are then cleared when block is written/incorporated.
210 NCredit/NICredit are only needed for Index blocks.
211 They are transferred to Credit and ICredit at phase change.
212 They then need to be refreshed. If that fails, we need to pin
213 all descendents to the next phase.
215 An InoIdx block doesn't get written out or incorporated. Only the
216 matching data block does.
217 So on phase-flip of an InoIdx, we move Dirty/UnincCredit to the
218 data block and Pin it to the old phase.
219 Once Pinned, metadata updates stay in the inode and don't flush
220 through to the data block. When unpinned, if I_Dirty is set, we
221 flush the inode to the data block