forked from mirrors/linux
kernel/drivers/net/ethernet/intel
Przemek Kitszel 120f28a6f3 iavf: get rid of the crit lock
Get rid of the crit lock.
That frees us from the error-prone logic of try_locks.

Thanks to netdev_lock() by Jakub it is now easy, and in most cases we were
already protected by it. Where that was not the case, replace the crit lock
with the netdev lock.

Lockdep reports that we should cancel the work under crit_lock [splat1],
and that is the scheme we have mostly followed since [1] by Slawomir.
But even when that is done we still get into deadlocks [splat2]. So instead
we should look at the bigger problem, namely the "weird locking/scheduling"
of the iavf. The first step to fix that is to remove the crit lock.
I will follow up with a -next series that simplifies scheduling/tasks.

Cancel the work without holding the netdev lock (weird unlock+lock scheme)
to fix [splat2] (which would have been totally ugly if we had kept
the crit lock).
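
The unlock+lock scheme can be illustrated with a small userspace analogy.
This is only a sketch using POSIX threads, not the driver code: struct
demo_adapter, demo_worker(), demo_start() and demo_remove() are hypothetical
names, dev_lock stands in for the netdev lock, and pthread_join() stands in
for a synchronous work cancel. The point it shows is that the waiter must
drop the lock the worker takes before waiting for the worker to finish.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; not the real iavf structure. */
struct demo_adapter {
	pthread_mutex_t dev_lock;	/* plays the role of the netdev lock */
	pthread_t worker;		/* plays the role of the watchdog work */
	bool stop;			/* protected by dev_lock */
	int ticks;			/* protected by dev_lock */
};

/* The worker takes dev_lock on every iteration, like a work item would. */
static void *demo_worker(void *arg)
{
	struct demo_adapter *a = arg;

	for (;;) {
		bool stop;

		pthread_mutex_lock(&a->dev_lock);
		stop = a->stop;
		if (!stop)
			a->ticks++;
		pthread_mutex_unlock(&a->dev_lock);

		if (stop)
			break;
		sched_yield();
	}
	return NULL;
}

/* Initialize the lock and start the worker. Returns 0 on success. */
static int demo_start(struct demo_adapter *a)
{
	pthread_mutex_init(&a->dev_lock, NULL);
	a->stop = false;
	a->ticks = 0;
	return pthread_create(&a->worker, NULL, demo_worker, a);
}

/* Tear down: unlock, wait for the worker, lock again. Returns 0 on success. */
static int demo_remove(struct demo_adapter *a)
{
	int err;

	pthread_mutex_lock(&a->dev_lock);
	a->stop = true;			/* request the stop under the lock */
	/*
	 * Drop the lock before waiting. The worker needs dev_lock to observe
	 * the stop request, so joining while holding it would deadlock.
	 */
	pthread_mutex_unlock(&a->dev_lock);

	err = pthread_join(a->worker, NULL);

	pthread_mutex_lock(&a->dev_lock);
	a->ticks = -1;			/* final teardown, again under the lock */
	pthread_mutex_unlock(&a->dev_lock);

	pthread_mutex_destroy(&a->dev_lock);
	return err;
}
```

Calling demo_start() followed by demo_remove() on the same structure runs
the worker for a while and then stops it cleanly. Moving the pthread_join()
call between the first lock/unlock pair would reproduce the kind of
dependency that lockdep complains about in the splats below.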

Extend protected part of iavf_watchdog_task() to include scheduling
more work.

Note that the removed comment in iavf_reset_task() was misplaced:
it belonged inside the removed if condition, so it's gone now.

[splat1] - w/o this patch - The deadlock during VF removal:
     WARNING: possible circular locking dependency detected
     sh/3825 is trying to acquire lock:
      ((work_completion)(&(&adapter->watchdog_task)->work)){+.+.}-{0:0}, at: start_flush_work+0x1a1/0x470
          but task is already holding lock:
      (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_remove+0xd1/0x690 [iavf]
          which lock already depends on the new lock.

[splat2] - when cancelling work under crit lock, w/o this series,
	   see [2] for the band aid attempt
    WARNING: possible circular locking dependency detected
    sh/3550 is trying to acquire lock:
    ((wq_completion)iavf){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x26/0x90
        but task is already holding lock:
    (&dev->lock){+.+.}-{4:4}, at: iavf_remove+0xa6/0x6e0 [iavf]
        which lock already depends on the new lock.

[1] fc2e6b3b13 ("iavf: Rework mutexes for better synchronisation")
[2] https://github.com/pkitszel/linux/commit/52dddbfc2bb60294083f5711a158a

Fixes: d1639a1731 ("iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies")
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-03 09:48:03 -07:00
e1000 e1000: Hold RTNL when e1000_down can be called 2024-11-13 10:30:21 -08:00
e1000e net: e1000e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() 2025-04-11 11:58:58 -07:00
fm10k treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
i40e i40e: fix MMIO write access to an invalid page in i40e_clear_hw 2025-04-11 11:58:57 -07:00
iavf iavf: get rid of the crit lock 2025-06-03 09:48:03 -07:00
ice ice: fix rebuilding the Tx scheduler tree for large queue counts 2025-05-30 13:54:43 -07:00
idpf idpf: avoid mailbox timeout delays during reset 2025-05-30 13:54:52 -07:00
igb igb: Get rid of spurious interrupts 2025-04-29 15:13:43 -07:00
igbvf treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
igc Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue 2025-05-01 17:51:31 -07:00
ixgbe ipsec-next-2025-05-23 2025-05-26 18:32:48 +02:00
ixgbevf xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free} 2025-04-16 11:01:41 +02:00
libeth module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
libie module: Convert symbol namespace to string literal 2024-12-02 11:34:44 -08:00
e100.c treewide: Switch/rename to timer_delete[_sync]() 2025-04-05 10:30:12 +02:00
Kconfig igc: add support for frame preemption verification 2025-04-18 09:16:58 -07:00
Makefile net: intel: introduce {, Intel} Ethernet common library 2024-04-24 11:06:25 -07:00