| 
						
					 | 
				
			
			 | 
			 | 
			
				@ -11,130 +11,268 @@ When taking the i_rwsem on multiple non-directory objects, we
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				always acquire the locks in order by increasing address.  We'll call
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				that "inode pointer" order in the following.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				For our purposes all operations fall in 5 classes:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				1) read access.  Locking rules: caller locks directory we are accessing.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				The lock is taken shared.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Primitives
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				==========
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				2) object creation.  Locking rules: same as above, but the lock is taken
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				exclusive.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				For our purposes all operations fall in 6 classes:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				3) object removal.  Locking rules: caller locks parent, finds victim,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				locks victim and calls the method.  Locks are exclusive.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				1. read access.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				4) rename() that is _not_ cross-directory.  Locking rules: caller locks
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the parent and finds source and target.  Then we decide which of the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				source and target need to be locked.  Source needs to be locked if it's a
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				non-directory; target - if it's a non-directory or about to be removed.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Take the locks that need to be taken, in inode pointer order if need
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				to take both (that can happen only when both source and target are
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				non-directories - the source because it wouldn't be locked otherwise
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				and the target because mixing directory and non-directory is allowed
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				only with RENAME_EXCHANGE, and that won't be removing the target).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				After the locks had been taken, call the method.  All locks are exclusive.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the directory we are accessing (shared)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				5) link creation.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				2. object creation.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock parent
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* check that source is not a directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock source
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* call the method.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the directory we are accessing (exclusive)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				All locks are exclusive.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				3. object removal.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				6) cross-directory rename.  The trickiest in the whole bunch.  Locking
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the parent (exclusive)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* find the victim
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the victim (exclusive)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				4. link creation.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the parent (exclusive)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* check that the source is not a directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the source (exclusive; probably could be weakened to shared)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				5. rename that is _not_ cross-directory.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the parent (exclusive)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* find the source and target
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* decide which of the source and target need to be locked.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  The source needs to be locked if it's a non-directory, target - if it's
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  a non-directory or about to be removed.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* take the locks that need to be taken (exlusive), in inode pointer order
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  if need to take both (that can happen only when both source and target
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  are non-directories - the source because it wouldn't need to be locked
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  otherwise and the target because mixing directory and non-directory is
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  allowed only with RENAME_EXCHANGE, and that won't be removing the target).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				6. cross-directory rename.  The trickiest in the whole bunch.  Locking rules:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the filesystem
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock parents in "ancestors first" order. If one is not ancestor of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  the other, lock the parent of source first.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* find source and target.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* if old parent is equal to or is a descendent of target
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  fail with -ENOTEMPTY
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* if new parent is equal to or is a descendent of source
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  fail with -ELOOP
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* Lock subdirectories involved (source before target).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* Lock non-directories involved, in inode pointer order.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* call the method.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* if the parents don't have a common ancestor, fail the operation.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the parents in "ancestors first" order (exclusive). If neither is an
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  ancestor of the other, lock the parent of source first.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* find the source and target.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* verify that the source is not a descendent of the target and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					  target is not a descendent of source; fail the operation otherwise.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the subdirectories involved (exclusive), source before target.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					* lock the non-directories involved (exclusive), in inode pointer order.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				All ->i_rwsem are taken exclusive.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				The rules above obviously guarantee that all directories that are going
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				to be read, modified or removed by method will be locked by the caller.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				The rules above obviously guarantee that all directories that are going to be
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				read, modified or removed by method will be locked by caller.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Splicing
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				========
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				There is one more thing to consider - splicing.  It's not an operation
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				in its own right; it may happen as part of lookup.  We speak of the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				operations on directory trees, but we obviously do not have the full
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				picture of those - especially for network filesystems.  What we have
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				is a bunch of subtrees visible in dcache and locking happens on those.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Trees grow as we do operations; memory pressure prunes them.  Normally
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				that's not a problem, but there is a nasty twist - what should we do
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				when one growing tree reaches the root of another?  That can happen in
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				several scenarios, starting from "somebody mounted two nested subtrees
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				from the same NFS4 server and doing lookups in one of them has reached
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the root of another"; there's also open-by-fhandle stuff, and there's a
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				possibility that directory we see in one place gets moved by the server
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				to another and we run into it when we do a lookup.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				For a lot of reasons we want to have the same directory present in dcache
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				only once.  Multiple aliases are not allowed.  So when lookup runs into
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a subdirectory that already has an alias, something needs to be done with
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				dcache trees.  Lookup is already holding the parent locked.  If alias is
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a root of separate tree, it gets attached to the directory we are doing a
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				lookup in, under the name we'd been looking for.  If the alias is already
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a child of the directory we are looking in, it changes name to the one
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				we'd been looking for.  No extra locking is involved in these two cases.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				However, if it's a child of some other directory, the things get trickier.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				First of all, we verify that it is *not* an ancestor of our directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				and fail the lookup if it is.  Then we try to lock the filesystem and the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				current parent of the alias.  If either trylock fails, we fail the lookup.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				If trylocks succeed, we detach the alias from its current parent and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				attach to our directory, under the name we are looking for.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Note that splicing does *not* involve any modification of the filesystem;
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				all we change is the view in dcache.  Moreover, holding a directory locked
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				exclusive prevents such changes involving its children and holding the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				filesystem lock prevents any changes of tree topology, other than having a
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				root of one tree becoming a child of directory in another.  In particular,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				if two dentries have been found to have a common ancestor after taking
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the filesystem lock, their relationship will remain unchanged until
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the lock is dropped.  So from the directory operations' point of view
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				splicing is almost irrelevant - the only place where it matters is one
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				step in cross-directory renames; we need to be careful when checking if
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				parents have a common ancestor.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Multiple-filesystem stuff
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				=========================
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				For some filesystems a method can involve a directory operation on
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				another filesystem; it may be ecryptfs doing operation in the underlying
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				filesystem, overlayfs doing something to the layers, network filesystem
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				using a local one as a cache, etc.  In all such cases the operations
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				on other filesystems must follow the same locking rules.  Moreover, "a
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				directory operation on this filesystem might involve directory operations
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				on that filesystem" should be an asymmetric relation (or, if you will,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				it should be possible to rank the filesystems so that directory operation
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				on a filesystem could trigger directory operations only on higher-ranked
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				ones - in these terms overlayfs ranks lower than its layers, network
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				filesystem ranks lower than whatever it caches on, etc.)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Deadlock avoidance
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				==================
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				If no directory is its own ancestor, the scheme above is deadlock-free.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Proof:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				[XXX: will be updated once we are done massaging the lock_rename()]
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					First of all, at any moment we have a linear ordering of the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					objects - A < B iff (A is an ancestor of B) or (B is not an ancestor
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				        of A and ptr(A) < ptr(B)).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				There is a ranking on the locks, such that all primitives take
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				them in order of non-decreasing rank.  Namely,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					That ordering can change.  However, the following is true:
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  * rank ->i_rwsem of non-directories on given filesystem in inode pointer
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    order.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  * put ->i_rwsem of all directories on a filesystem at the same rank,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    lower than ->i_rwsem of any non-directory on the same filesystem.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  * put ->s_vfs_rename_mutex at rank lower than that of any ->i_rwsem
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    on the same filesystem.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  * among the locks on different filesystems use the relative
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    rank of those filesystems.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				(1) if object removal or non-cross-directory rename holds lock on A and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    attempts to acquire lock on B, A will remain the parent of B until we
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    acquire the lock on B.  (Proof: only cross-directory rename can change
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    the parent of object and it would have to lock the parent).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				For example, if we have NFS filesystem caching on a local one, we have
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				(2) if cross-directory rename holds the lock on filesystem, order will not
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    change until rename acquires all locks.  (Proof: other cross-directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    renames will be blocked on filesystem lock and we don't start changing
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    the order until we had acquired all locks).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  1. ->s_vfs_rename_mutex of NFS filesystem
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  2. ->i_rwsem of directories on that NFS filesystem, same rank for all
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  3. ->i_rwsem of non-directories on that filesystem, in order of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				     increasing address of inode
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  4. ->s_vfs_rename_mutex of local filesystem
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  5. ->i_rwsem of directories on the local filesystem, same rank for all
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				  6. ->i_rwsem of non-directories on local filesystem, in order of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				     increasing address of inode.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				(3) locks on non-directory objects are acquired only after locks on
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    directory objects, and are acquired in inode pointer order.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    (Proof: all operations but renames take lock on at most one
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    non-directory object, except renames, which take locks on source and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				    target in inode pointer order in the case they are not directories.)
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				It's easy to verify that operations never take a lock with rank
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				lower than that of an already held lock.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Now consider the minimal deadlock.  Each process is blocked on
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				attempt to acquire some lock and already holds at least one lock.  Let's
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				consider the set of contended locks.  First of all, filesystem lock is
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				not contended, since any process blocked on it is not holding any locks.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Thus all processes are blocked on ->i_rwsem.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Suppose deadlocks are possible.  Consider the minimal deadlocked
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				set of threads.  It is a cycle of several threads, each blocked on a lock
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				held by the next thread in the cycle.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				By (3), any process holding a non-directory lock can only be
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				waiting on another non-directory lock with a larger address.  Therefore
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the process holding the "largest" such lock can always make progress, and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				non-directory objects are not included in the set of contended locks.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Since the locking order is consistent with the ranking, all
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				contended locks in the minimal deadlock will be of the same rank,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				i.e. they all will be ->i_rwsem of directories on the same filesystem.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Moreover, without loss of generality we can assume that all operations
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				are done directly to that filesystem and none of them has actually
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				reached the method call.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Thus link creation can't be a part of deadlock - it can't be
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				blocked on source and it means that it doesn't hold any locks.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				In other words, we have a cycle of threads, T1,..., Tn,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				and the same number of directories (D1,...,Dn) such that
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Any contended object is either held by cross-directory rename or
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				has a child that is also contended.  Indeed, suppose that it is held by
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				operation other than cross-directory rename.  Then the lock this operation
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				is blocked on belongs to child of that object due to (1).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					T1 is blocked on D1 which is held by T2
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				It means that one of the operations is cross-directory rename.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Otherwise the set of contended objects would be infinite - each of them
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				would have a contended child and we had assumed that no object is its
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				own descendent.  Moreover, there is exactly one cross-directory rename
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				(see above).
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					T2 is blocked on D2 which is held by T3
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Consider the object blocking the cross-directory rename.  One
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				of its descendents is locked by cross-directory rename (otherwise we
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				would again have an infinite set of contended objects).  But that
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				means that cross-directory rename is taking locks out of order.  Due
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				to (2) the order hadn't changed since we had acquired filesystem lock.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				But locking rules for cross-directory rename guarantee that we do not
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				try to acquire lock on descendent before the lock on ancestor.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Contradiction.  I.e.  deadlock is impossible.  Q.E.D.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					...
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
					Tn is blocked on Dn which is held by T1.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Each operation in the minimal cycle must have locked at least
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				one directory and blocked on attempt to lock another.  That leaves
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				only 3 possible operations: directory removal (locks parent, then
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				child), same-directory rename killing a subdirectory (ditto) and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				cross-directory rename of some sort.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				There must be a cross-directory rename in the set; indeed,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				if all operations had been of the "lock parent, then child" sort
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				we would have Dn a parent of D1, which is a parent of D2, which is
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a parent of D3, ..., which is a parent of Dn.  Relationships couldn't
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				have changed since the moment directory locks had been acquired,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				so they would all hold simultaneously at the deadlock time and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				we would have a loop.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Since all operations are on the same filesystem, there can't be
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				more than one cross-directory rename among them.  Without loss of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				generality we can assume that T1 is the one doing a cross-directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				rename and everything else is of the "lock parent, then child" sort.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				In other words, we have a cross-directory rename that locked
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Dn and blocked on attempt to lock D1, which is a parent of D2, which is
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a parent of D3, ..., which is a parent of Dn.  Relationships between
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				D1,...,Dn all hold simultaneously at the deadlock time.  Moreover,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				cross-directory rename does not get to locking any directories until it
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				has acquired filesystem lock and verified that directories involved have
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a common ancestor, which guarantees that ancestry relationships between
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				all of them had been stable.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Consider the order in which directories are locked by the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				cross-directory rename; parents first, then possibly their children.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Dn and D1 would have to be among those, with Dn locked before D1.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Which pair could it be?
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				It can't be the parents - indeed, since D1 is an ancestor of Dn,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				it would be the first parent to be locked.  Therefore at least one of the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				children must be involved and thus neither of them could be a descendent
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				of another - otherwise the operation would not have progressed past
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				locking the parents.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				It can't be a parent and its child; otherwise we would've had
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				a loop, since the parents are locked before the children, so the parent
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				would have to be a descendent of its child.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				It can't be a parent and a child of another parent either.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Otherwise the child of the parent in question would've been a descendent
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				of another child.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				That leaves only one possibility - namely, both Dn and D1 are
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				among the children, in some order.  But that is also impossible, since
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				neither of the children is a descendent of another.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				That concludes the proof, since the set of operations with the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				properties requiered for a minimal deadlock can not exist.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Note that the check for having a common ancestor in cross-directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				rename is crucial - without it a deadlock would be possible.  Indeed,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				suppose the parents are initially in different trees; we would lock the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				parent of source, then try to lock the parent of target, only to have
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				an unrelated lookup splice a distant ancestor of source to some distant
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				descendent of the parent of target.   At that point we have cross-directory
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				rename holding the lock on parent of source and trying to lock its
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				distant ancestor.  Add a bunch of rmdir() attempts on all directories
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				in between (all of those would fail with -ENOTEMPTY, had they ever gotten
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the locks) and voila - we have a deadlock.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Loop avoidance
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				==============
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				These operations are guaranteed to avoid loop creation.  Indeed,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the only operation that could introduce loops is cross-directory rename.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Since the only new (parent, child) pair added by rename() is (new parent,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				source), such loop would have to contain these objects and the rest of it
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				would have to exist before rename().  I.e. at the moment of loop creation
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				rename() responsible for that would be holding filesystem lock and new parent
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				would have to be equal to or a descendent of source.  But that means that
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				new parent had been equal to or a descendent of source since the moment when
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				we had acquired filesystem lock and rename() would fail with -ELOOP in that
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				case.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Suppose after the operation there is a loop; since there hadn't been such
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				loops before the operation, at least on of the nodes in that loop must've
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				had its parent changed.  In other words, the loop must be passing through
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the source or, in case of exchange, possibly the target.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Since the operation has succeeded, neither source nor target could have
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				been ancestors of each other.  Therefore the chain of ancestors starting
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				in the parent of source could not have passed through the target and
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				vice versa.  On the other hand, the chain of ancestors of any node could
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				not have passed through the node itself, or we would've had a loop before
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the operation.  But everything other than source and target has kept
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				the parent after the operation, so the operation does not change the
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				chains of ancestors of (ex-)parents of source and target.  In particular,
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				those chains must end after a finite number of steps.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				Now consider the loop created by the operation.  It passes through either
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				source or target; the next node in the loop would be the ex-parent of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				target or source resp.  After that the loop would follow the chain of
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				ancestors of that parent.  But as we have just shown, that chain must
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				end after a finite number of steps, which means that it can't be a part
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				of any loop.  Q.E.D.
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				While this locking scheme works for arbitrary DAGs, it relies on
 | 
			
		
		
	
		
			
				 | 
				 | 
			
			 | 
			 | 
			
				ability to check that directory is a descendent of another object.  Current
 | 
			
		
		
	
	
		
			
				
					| 
						
					 | 
				
			
			 | 
			 | 
			
				
 
 |