==== this is for locking rules etc, not for discussion ===============
b->refcnt counts the references to a block. These include:
- ->parent references from children
- The flags: Async, Uninc
- presence on an ->orphans list
- orphan block which needs to be held on to
- The presence of inode->iblock implies a counted reference through
- presence on lists: leaf (phase,clean,account), cleaning, cluster
- ref from sibling with PrimaryRef set
- reference from segment summary record
- if a block has SegRef set, then it owns a reference on a segsum
- every segment in 'table' owns a reference on a segsum block
If you have a reference to a block, you can always get another reference.
A hashtable lookup holding the hashtable spinlock can find an indexblock.
A reference from page->private under mapping->private_lock can find a datablock.
A ->parent link can be followed
under i_mutex (for InoIdx, inode->dblock->inode->i_mutex)
A ->siblings list can be walked under ->private_lock
inode->dblock is permanent - it only goes when the dblock is destroyed
If we have a reference to the iblock, then ->dblock can be followed,
otherwise we need private_lock
inode->iblock can take a reference under ->private_lock or i_mutex.
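A minimal userspace sketch of the hashtable rule above: the lookup takes its reference while the lock is still held, and the table's own pointer is not counted. Everything here (struct iblock, hash_table, hash_lock, the model_lock helpers, the addr key) is made up for illustration and is not the real code. Locks are modelled as flags that only check the discipline in a single thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-threaded model; not the real filesystem API. */
struct model_lock { int held; };

static void model_lock_take(struct model_lock *l) { assert(!l->held); l->held = 1; }
static void model_lock_drop(struct model_lock *l) { assert(l->held); l->held = 0; }

struct iblock {
	int refcnt;            /* counted references */
	unsigned long addr;    /* hypothetical lookup key */
	struct iblock *next;   /* hash chain; the table's pointer is not counted */
};

#define HASH_SIZE 8
static struct iblock *hash_table[HASH_SIZE];
static struct model_lock hash_lock;

/* If you already hold a reference you can always take another. */
static struct iblock *iblock_getref(struct iblock *b)
{
	assert(b->refcnt > 0);
	b->refcnt++;
	return b;
}

static void hash_insert(struct iblock *b)
{
	model_lock_take(&hash_lock);
	b->next = hash_table[b->addr % HASH_SIZE];
	hash_table[b->addr % HASH_SIZE] = b;
	model_lock_drop(&hash_lock);
}

/* Find a block and take the reference before dropping the lock,
 * so nothing can free the block between lookup and use. */
static struct iblock *hash_lookup(unsigned long addr)
{
	struct iblock *b;

	model_lock_take(&hash_lock);
	for (b = hash_table[addr % HASH_SIZE]; b; b = b->next)
		if (b->addr == addr) {
			b->refcnt++;   /* in this model a 0 -> 1 step is allowed */
			break;
		}
	model_lock_drop(&hash_lock);
	return b;
}
```

The same shape would apply to page->private under mapping->private_lock for datablocks: find the pointer and bump the count inside the lock.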
When you drop a reference:
For index blocks, if refcnt hits zero the block must go on the freelist.
For data blocks, just drop the reference - memory sweep will find it.
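The drop rule could be sketched like this. It is a userspace model under assumptions: struct blk, is_index, the freelist pointer and blk_putref are invented names, and the locking needed around the freelist is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a block; field names are illustrative only. */
struct blk {
	int refcnt;
	int is_index;          /* stands in for the B_Index flag */
	int on_freelist;
	struct blk *free_next; /* link on the index block freelist */
};

static struct blk *freelist;

/* Drop one reference. An index block that reaches zero goes on the
 * freelist; a data block is simply left for a later memory sweep. */
static void blk_putref(struct blk *b)
{
	assert(b->refcnt > 0);
	if (--b->refcnt > 0)
		return;
	if (b->is_index) {
		b->free_next = freelist;
		freelist = b;
		b->on_freelist = 1;
	}
}
```

Real code would have to make the decrement and the freelist insertion atomic with respect to lookups; this sketch does not show that.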
If NULL, can be set under mapping->private_lock.
If refcnt==0, ->parent can be set to NULL under same lock that allows
us to take a reference
Can be changed (for index block split/combine) under i_mutex and ->private_lock
Can be manipulated under mapping->private_lock if it is permitted
Can be walked under ->private_lock or i_mutex
When ->parent is NULL blk->siblings is empty for data blocks, and
blk->siblings is on LAFSI(inode)->free_index for index blocks.
Note that ->free_index is protected by hash_lock, not ->private_lock.
Set or cleared under ->private_lock. Refcounted when ->iblock set.
So if we hold a ref on iblock - possibly indirect - then can access
Can be set to or from NULL under ->private_lock, otherwise need to be
IOLocked to change it.
If we hold a ref to a child with parent, then we can access iblock with
Note that this reference isn't counted. It is similar to a reference
held by the index hash table.
->lru is used for the global list of free indexblocks, and for
phase_leafs, clean_leafs, account_leaf, cluster list, pending io list
Membership on all but first is a counted reference.
So if ->lru is not empty then checking the refcnt can determine if
the block is on the freelist (0) or another list (non-zero).
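The ->lru test above amounts to a two-step check. A small sketch, assuming a boolean lru_linked that stands in for "->lru is not empty"; the enum and function names are invented here:

```c
#include <assert.h>

/* Hypothetical classification of where ->lru currently links a block. */
enum lru_state { LRU_NONE, LRU_FREELIST, LRU_COUNTED_LIST };

struct lblock {
	int refcnt;
	int lru_linked;   /* models a non-empty ->lru */
};

/* Membership of every list except the freelist holds a counted
 * reference, so a linked block with refcnt 0 must be on the freelist. */
static enum lru_state lru_classify(const struct lblock *b)
{
	if (!b->lru_linked)
		return LRU_NONE;
	return b->refcnt == 0 ? LRU_FREELIST : LRU_COUNTED_LIST;
}
```

The check is only meaningful while whatever protects ->lru and refcnt is held; the sketch ignores that.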
Once a block gets on a (non-freelist) lru list it stays there until
the list owner explicitly removes it. These lists are protected by
Note that a block can get on the wrong list in a number of ways due
to various races. This doesn't matter. It will eventually be found
A block first appears on a *_leafs list protected by fs->lock.
It is eventually allocated by a clean/flush thread and moves to
the cluster list, which is protected by a cluster lock.
Then it moves to a pending_blocks list protected by fs->lock again.
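The list progression just described can be written down as a tiny table: which list comes next, and which lock covers each list. The enum names are invented, and LIST_DONE is a hypothetical end state, since these notes do not say what follows pending_blocks.

```c
#include <assert.h>

/* Hypothetical names for the lists a block passes through. */
enum blk_list { LIST_LEAFS, LIST_CLUSTER, LIST_PENDING, LIST_DONE };
enum list_lock { LOCK_FS, LOCK_CLUSTER, LOCK_NONE };

/* Lock that protects each list, per the notes above. */
static enum list_lock lock_for(enum blk_list l)
{
	switch (l) {
	case LIST_LEAFS:   return LOCK_FS;      /* *_leafs: fs->lock */
	case LIST_CLUSTER: return LOCK_CLUSTER; /* cluster list: cluster lock */
	case LIST_PENDING: return LOCK_FS;      /* pending_blocks: fs->lock */
	default:           return LOCK_NONE;    /* assumed: off all lists */
	}
}

/* Order in which a block moves between the lists. */
static enum blk_list next_list(enum blk_list l)
{
	switch (l) {
	case LIST_LEAFS:   return LIST_CLUSTER;
	case LIST_CLUSTER: return LIST_PENDING;
	default:           return LIST_DONE;
	}
}
```

A move between lists covered by different locks needs both ends handled; how the real code orders that is not shown here.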
phase_leafs is protected by fs->lock for walking and manipulation.
Entries are added under ->private_lock.
If we "know" we hold a reference to all possible entries (as cleaner
might) we can safely walk the list, else use private_lock.
B_Index - never changes, distinguishes index blocks from data blocks
B_Root - is root block - there is no parent.
B_InoIdx - flags the inode's index block. Can be changed when ->iblock can
B_Valid - has been loaded or initialised. Never cleared
B_Dirty - For data block, the content has changed and not been written out
    For index blocks this is set as soon as an address is ready to be
B_Realloc - as above for cleaning. Should change that to use dirty and have
    a flag for 'new data' which isn't set by cleaner.
B_UnincCredit - have a credit to be passed to parent on incorporation
B_Uninc - on an uninc list, so phys addr shouldn't change
B_Pinned - pinned to the phase determined by B_Phase1.
    will be considered for writing in that phase.
    must be on leaf list unless has children, or not dirty
B_Async - cleaner thread wants this when it is unlocked or written
B_Segref - own a reference on the segsum for physaddr
B_Credit B_ICredit B_NCredit B_NICredit
B_IOLock - used for various locking.
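As an illustration of how B_Pinned and B_Phase1 combine, here is a sketch with the flags as bit masks. The bit values are invented for this example and only a few flags are shown; pinned_phase is a hypothetical helper, not something these notes define.

```c
#include <assert.h>

/* Flag names come from the list above; the bit positions are assumptions. */
enum {
	B_Index  = 1 << 0,
	B_Dirty  = 1 << 1,
	B_Pinned = 1 << 2,
	B_Phase1 = 1 << 3,
};

/* Phase a block is pinned to: 0 or 1, or -1 if it is not pinned.
 * B_Phase1 only has meaning while B_Pinned is set. */
static int pinned_phase(unsigned long flags)
{
	if (!(flags & B_Pinned))
		return -1;
	return (flags & B_Phase1) ? 1 : 0;
}
```

For example, pinned_phase(B_Pinned | B_Phase1) gives 1, while B_Phase1 on its own is treated as not pinned.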
B_PinPending - data block should stay pinned even if not dirty.
    Some thread is working on changing it.
    Cleaner will just dirty it to clean. writepage will skip
    should need iolock to set this.
    For index blocks we don't need this and just use refcnt.
    but there are more refs held on data blocks.
    I think we mainly need this for writepage and cleaner to ignore
------------------------
Normally data blocks are not phase-flipped. They simply
For inode data blocks, when the index block flips, the pinning
and reservations are transferred to the datablock which is
then written and un-pinned.
-------------------------
--------------------------
my_inode points from datablock in inode file to the inode.
LAFSI(inode)->dblock points back.
These are not reference counted, rather whichever object is destroyed
first breaks both links.
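A sketch of the "whichever dies first breaks both links" rule, using two invented structs. The pointers mirror my_inode and dblock, but the struct and function names are hypothetical, and the lock that must cover the unlink (presumably private_lock) is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the inode and its datablock. */
struct m_inode;
struct m_dblock { struct m_inode *my_inode; };
struct m_inode  { struct m_dblock *dblock; };

/* Set up the uncounted pair of back-pointers. */
static void link_pair(struct m_dblock *db, struct m_inode *ino)
{
	db->my_inode = ino;
	ino->dblock = db;
}

/* Destroying either object clears both pointers, so the survivor
 * never holds a dangling link. */
static void destroy_inode(struct m_inode *ino)
{
	if (ino->dblock) {
		ino->dblock->my_inode = NULL;
		ino->dblock = NULL;
	}
}

static void destroy_dblock(struct m_dblock *db)
{
	if (db->my_inode) {
		db->my_inode->dblock = NULL;
		db->my_inode = NULL;
	}
}
```

Because neither pointer is counted, a reader has to check for NULL after taking the lock, or hold some other reference that keeps the far end alive.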
if ->iblock is set, then a ref is held on ->block. So if we hold
a ref on a block with a parent, then we can access ->inode->dblock
Otherwise we need private_lock
We only clear ->my_inode when the refcount on the block reaches
zero, so if we have a refcount on the dblock, and my_inode is not NULL,
we can dereference it safely.
to find iblock from dblock - that should be locked - sometimes just testing
to find inode that might need to have inode_fillblock called
to find inode to truncate in orphan handling
But wait: we don't destroy an inode with a dblock until dblock
refcount reaches 0. So if we hold a dblock, it is always safe to