mirror of
				https://github.com/torvalds/linux.git
				synced 2025-11-04 10:40:15 +02:00 
			
		
		
		
	
				Henry reported that rt_mutex_adjust_prio_check() has an ordering
problem and puts the lie to the comment in [7]. Sharing the sort key
between lock->waiters and owner->pi_waiters *does* create problems,
since unlike what the comment claims, holding [L] is insufficient.
Notably, consider:
	A
      /   \
     M1   M2
     |     |
     B     C
That is, task A owns both M1 and M2, B and C block on them. In this
case a concurrent chain walk (B & C) will modify their resp. sort keys
in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L]
is meaningless, they're different Ls.
This then gives rise to a race condition between [7] and [11], where
the requeue of pi_waiters will observe an inconsistent tree order.
	B				C
  (holds M1->wait_lock,		(holds M2->wait_lock,
   holds B->pi_lock)		 holds A->pi_lock)
  [7]
  waiter_update_prio();
  ...
  [8]
  raw_spin_unlock(B->pi_lock);
  ...
  [10]
  raw_spin_lock(A->pi_lock);
				[11]
				rt_mutex_enqueue_pi();
				// observes inconsistent A->pi_waiters
				// tree order
Fixing this means either extending the range of the owner lock from
[10-13] to [6-13], with the immediate problem that this means [6-8]
hold both blocked and owner locks, or duplicating the sort key.
Since the locking in chain walk is horrible enough without having to
consider pi_lock nesting rules, duplicate the sort key instead.
By giving each tree their own sort key, the above race becomes
harmless, if C sees B at the old location, then B will correct things
(if they need correcting) when it walks up the chain and reaches A.
Fixes: 
		
	
					 | 
			||
|---|---|---|
| .. | ||
| irqflag-debug.c | ||
| lock_events.c | ||
| lock_events.h | ||
| lock_events_list.h | ||
| lockdep.c | ||
| lockdep_internals.h | ||
| lockdep_proc.c | ||
| lockdep_states.h | ||
| locktorture.c | ||
| Makefile | ||
| mcs_spinlock.h | ||
| mutex-debug.c | ||
| mutex.c | ||
| mutex.h | ||
| osq_lock.c | ||
| percpu-rwsem.c | ||
| qrwlock.c | ||
| qspinlock.c | ||
| qspinlock_paravirt.h | ||
| qspinlock_stat.h | ||
| rtmutex.c | ||
| rtmutex_api.c | ||
| rtmutex_common.h | ||
| rwbase_rt.c | ||
| rwsem.c | ||
| semaphore.c | ||
| spinlock.c | ||
| spinlock_debug.c | ||
| spinlock_rt.c | ||
| test-ww_mutex.c | ||
| ww_mutex.h | ||
| ww_rt_mutex.c | ||