linux/drivers/infiniband/hw/hns
Junxian Huang 9747c0c779 RDMA/hns: Fix mbox timing out by adding retry mechanism
If a QP is modified to the error state and a flush CQE process is triggered,
the subsequent QP destruction mbox can still be posted successfully but
will be blocked in HW until the flush CQE process finishes. This causes
further mbox posting to time out in the driver. The blocking time depends
on the QP depth. In an extreme case where the SQ depth and RQ depth
are both 32K, the blocking time can reach about 135ms.

This patch adds a retry mechanism for mbox posting. On each try, FW
waits 15ms for HW to complete the previous mbox and otherwise returns a
timeout error code to the driver. Allowing for other time consumed in FW,
the driver makes up to 8 tries at posting an mbox, with a 5ms gap before
each retry, which raises the overall timeout limit to a sufficient value.
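
The retry scheme described above can be sketched in user-space C as follows. This is only an illustration under stated assumptions, not the code from hns_roce_hw_v2.c: the names mbox_post_with_retry, mbox_worst_case_wait_ms, fake_post, and the MBOX_* macros are hypothetical, and the 15ms FW wait is modeled as a constant rather than real firmware behavior. Only the figures (8 tries, 5ms gap, 15ms per try) come from the commit message. The helper also works out the nominal budget: 8 * 15ms + 7 * 5ms = 155ms, which exceeds the 135ms worst case quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Figures from the commit message; the macro names are hypothetical. */
#define MBOX_POST_TRY_MAX  8   /* total tries per mbox post */
#define MBOX_RETRY_GAP_MS  5   /* gap before each retry */
#define MBOX_FW_WAIT_MS    15  /* FW wait for HW on each try */

/* Hypothetical callback: returns 0 on success, -ETIMEDOUT on FW timeout. */
typedef int (*mbox_post_fn)(void *ctx);

static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Post an mbox, retrying only while the post reports a timeout. */
static int mbox_post_with_retry(mbox_post_fn post, void *ctx)
{
	int ret = -ETIMEDOUT;

	for (int attempt = 0; attempt < MBOX_POST_TRY_MAX; attempt++) {
		if (attempt > 0)
			sleep_ms(MBOX_RETRY_GAP_MS); /* gap before each retry */

		ret = post(ctx);
		if (ret != -ETIMEDOUT)
			break; /* success, or an error that retrying will not fix */
	}

	return ret;
}

/* Nominal worst-case wait: every try times out, gaps between tries. */
static int mbox_worst_case_wait_ms(void)
{
	return MBOX_POST_TRY_MAX * MBOX_FW_WAIT_MS +
	       (MBOX_POST_TRY_MAX - 1) * MBOX_RETRY_GAP_MS;
}

/* Demo stub: times out until the counter in ctx reaches zero. */
static int fake_post(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -ETIMEDOUT;
	}

	return 0;
}
```

Calling mbox_post_with_retry(fake_post, &n) with n = 3 succeeds on the fourth try, while a stub that keeps timing out gets -ETIMEDOUT back after the eighth try.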

Fixes: 0425e3e6e0 ("RDMA/hns: Support flush cqe for hip08 in kernel space")
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Link: https://patch.msgid.link/20250208105930.522796-1-huangjunxian6@hisilicon.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-02-09 04:57:15 -05:00
hns_roce_ah.c RDMA/hns: Fix ah error counter in sw stat not increasing 2024-09-16 11:21:58 +03:00
hns_roce_alloc.c RDMA/hns: Remove unused parameters and variables 2024-04-16 15:06:47 +03:00
hns_roce_cmd.c RDMA/hns: Support SW stats with debugfs 2023-11-19 14:55:43 +02:00
hns_roce_cmd.h RDMA/hns: Append SCC context to the raw dump of QPC 2024-03-07 11:26:10 +02:00
hns_roce_common.h RDMA/hns: Remove support for HIP06 2022-01-05 15:50:56 -04:00
hns_roce_cq.c RDMA/hns: Fix cpu stuck caused by printings during reset 2024-10-30 14:13:55 +02:00
hns_roce_db.c RDMA/hns: Remove support for HIP06 2022-01-05 15:50:56 -04:00
hns_roce_debugfs.c RDMA/hns: Modify debugfs name 2024-10-30 14:13:55 +02:00
hns_roce_debugfs.h RDMA/hns: Support SW stats with debugfs 2023-11-19 14:55:43 +02:00
hns_roce_device.h RDMA/hns: Fix different dgids mapping to the same dip_idx 2024-11-14 04:56:14 -05:00
hns_roce_hem.c RDMA/hns: Fix mapping error of zero-hop WQE buffer 2024-12-23 09:58:30 -05:00
hns_roce_hem.h RDMA/hns: Use complete parentheses in macros 2024-04-16 15:06:47 +03:00
hns_roce_hw_v2.c RDMA/hns: Fix mbox timing out by adding retry mechanism 2025-02-09 04:57:15 -05:00
hns_roce_hw_v2.h RDMA/hns: Fix mbox timing out by adding retry mechanism 2025-02-09 04:57:15 -05:00
hns_roce_main.c RDMA/hns: Fix different dgids mapping to the same dip_idx 2024-11-14 04:56:14 -05:00
hns_roce_mr.c RDMA/hns: Fix mapping error of zero-hop WQE buffer 2024-12-23 09:58:30 -05:00
hns_roce_pd.c RDMA/hns: Support SW stats with debugfs 2023-11-19 14:55:43 +02:00
hns_roce_qp.c RDMA/hns: Fix different dgids mapping to the same dip_idx 2024-11-14 04:56:14 -05:00
hns_roce_restrack.c RDMA/hns: Append SCC context to the raw dump of QPC 2024-03-07 11:26:10 +02:00
hns_roce_srq.c RDMA/hns: Fix cpu stuck caused by printings during reset 2024-10-30 14:13:55 +02:00
Kconfig RDMA/hns: Clean up the legacy CONFIG_INFINIBAND_HNS 2025-01-06 08:41:06 -05:00
Makefile RDMA/hns: Clean up the legacy CONFIG_INFINIBAND_HNS 2025-01-06 08:41:06 -05:00