mirror of https://github.com/torvalds/linux.git (synced 2025-11-04 10:40:15 +02:00)

doc: Update control-dependencies section of memory-barriers.txt
This commit adds consistency to examples, formatting, and a couple of
additional warnings.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
parent 526914a0ae
commit c8241f8553

1 changed file with 38 additions and 32 deletions

@@ -640,6 +640,10 @@ See also the subsection on "Cache Coherency" for a more thorough example.
 CONTROL DEPENDENCIES
 --------------------
 
+Control dependencies can be a bit tricky because current compilers do
+not understand them.  The purpose of this section is to help you prevent
+the compiler's ignorance from breaking your code.
+
 A load-load control dependency requires a full read memory barrier, not
 simply a data dependency barrier to make it work correctly.  Consider the
 following bit of code:
@@ -667,14 +671,15 @@ for load-store control dependencies, as in the following example:
 
 	q = READ_ONCE(a);
 	if (q) {
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 	}
 
-Control dependencies pair normally with other types of barriers.  That
-said, please note that READ_ONCE() is not optional! Without the
-READ_ONCE(), the compiler might combine the load from 'a' with other
-loads from 'a', and the store to 'b' with other stores to 'b', with
-possible highly counterintuitive effects on ordering.
+Control dependencies pair normally with other types of barriers.
+That said, please note that neither READ_ONCE() nor WRITE_ONCE()
+are optional! Without the READ_ONCE(), the compiler might combine the
+load from 'a' with other loads from 'a'.  Without the WRITE_ONCE(),
+the compiler might combine the store to 'b' with other stores to 'b'.
+Either can result in highly counterintuitive effects on ordering.
 
 Worse yet, if the compiler is able to prove (say) that the value of
 variable 'a' is always non-zero, it would be well within its rights
@@ -682,7 +687,7 @@ to optimize the original example by eliminating the "if" statement
 as follows:
 
 	q = a;
-	b = p;  /* BUG: Compiler and CPU can both reorder!!! */
+	b = 1;  /* BUG: Compiler and CPU can both reorder!!! */
 
 So don't leave out the READ_ONCE().
 
@@ -692,11 +697,11 @@ branches of the "if" statement as follows:
 	q = READ_ONCE(a);
 	if (q) {
 		barrier();
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 		do_something();
 	} else {
 		barrier();
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 		do_something_else();
 	}
 
@@ -705,12 +710,12 @@ optimization levels:
 
 	q = READ_ONCE(a);
 	barrier();
-	WRITE_ONCE(b, p);  /* BUG: No ordering vs. load from a!!! */
+	WRITE_ONCE(b, 1);  /* BUG: No ordering vs. load from a!!! */
 	if (q) {
-		/* WRITE_ONCE(b, p); -- moved up, BUG!!! */
+		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
 		do_something();
 	} else {
-		/* WRITE_ONCE(b, p); -- moved up, BUG!!! */
+		/* WRITE_ONCE(b, 1); -- moved up, BUG!!! */
 		do_something_else();
 	}
 
@@ -723,10 +728,10 @@ memory barriers, for example, smp_store_release():
 
 	q = READ_ONCE(a);
 	if (q) {
-		smp_store_release(&b, p);
+		smp_store_release(&b, 1);
 		do_something();
 	} else {
-		smp_store_release(&b, p);
+		smp_store_release(&b, 1);
 		do_something_else();
 	}
 
@@ -735,10 +740,10 @@ ordering is guaranteed only when the stores differ, for example:
 
 	q = READ_ONCE(a);
 	if (q) {
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 		do_something();
 	} else {
-		WRITE_ONCE(b, r);
+		WRITE_ONCE(b, 2);
 		do_something_else();
 	}
 
@@ -751,10 +756,10 @@ the needed conditional.  For example:
 
 	q = READ_ONCE(a);
 	if (q % MAX) {
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 		do_something();
 	} else {
-		WRITE_ONCE(b, r);
+		WRITE_ONCE(b, 2);
 		do_something_else();
 	}
 
@@ -763,7 +768,7 @@ equal to zero, in which case the compiler is within its rights to
 transform the above code into the following:
 
 	q = READ_ONCE(a);
-	WRITE_ONCE(b, p);
+	WRITE_ONCE(b, 1);
 	do_something_else();
 
 Given this transformation, the CPU is not required to respect the ordering
@@ -776,10 +781,10 @@ one, perhaps as follows:
 	q = READ_ONCE(a);
 	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
 	if (q % MAX) {
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 		do_something();
 	} else {
-		WRITE_ONCE(b, r);
+		WRITE_ONCE(b, 2);
 		do_something_else();
 	}
 
@@ -812,30 +817,28 @@ not necessarily apply to code following the if-statement:
 
 	q = READ_ONCE(a);
 	if (q) {
-		WRITE_ONCE(b, p);
+		WRITE_ONCE(b, 1);
 	} else {
-		WRITE_ONCE(b, r);
+		WRITE_ONCE(b, 2);
 	}
-	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from "a". */
+	WRITE_ONCE(c, 1);  /* BUG: No ordering against the read from 'a'. */
 
 It is tempting to argue that there in fact is ordering because the
 compiler cannot reorder volatile accesses and also cannot reorder
-the writes to "b" with the condition.  Unfortunately for this line
-of reasoning, the compiler might compile the two writes to "b" as
+the writes to 'b' with the condition.  Unfortunately for this line
+of reasoning, the compiler might compile the two writes to 'b' as
 conditional-move instructions, as in this fanciful pseudo-assembly
 language:
 
 	ld r1,a
-	ld r2,p
-	ld r3,r
 	cmp r1,$0
-	cmov,ne r4,r2
-	cmov,eq r4,r3
+	cmov,ne r4,$1
+	cmov,eq r4,$2
 	st r4,b
 	st $1,c
 
 A weakly ordered CPU would have no dependency of any sort between the load
-from "a" and the store to "c".  The control dependencies would extend
+from 'a' and the store to 'c'.  The control dependencies would extend
 only to the pair of cmov instructions and the store depending on them.
 In short, control dependencies apply only to the stores in the then-clause
 and else-clause of the if-statement in question (including functions
@@ -843,7 +846,7 @@ invoked by those two clauses), not to code following that if-statement.
 
 Finally, control dependencies do -not- provide transitivity.  This is
 demonstrated by two related examples, with the initial values of
-x and y both being zero:
+'x' and 'y' both being zero:
 
 	CPU 0                     CPU 1
 	=======================   =======================
@@ -915,6 +918,9 @@ In summary:
   (*) Control dependencies do -not- provide transitivity.  If you
       need transitivity, use smp_mb().
 
+  (*) Compilers do not understand control dependencies.  It is therefore
+      your job to ensure that they do not break your code.
+
 
 SMP BARRIER PAIRING
 -------------------