mirror of https://github.com/torvalds/linux.git (synced 2025-11-04 02:30:34 +02:00)

When the following test runs for several hours, the problem occurs.
Server:
    rds-stress -r 1.1.1.16 -D 1M
Client:
    rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M -T 30
The client then reports no traffic:
"
Starting up....
tsks   tx/s   rx/s  tx+rx K/s    mbi K/s    mbo K/s tx us/c   rtt us cpu %
  1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
  1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
  1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
  1      0      0       0.00       0.00       0.00    0.00 0.00 -1.00
"
From the vmcore, we can see that clean_list is NULL.
From the source code, rds_mr_flushd calls rds_ib_mr_pool_flush_worker.
Then rds_ib_mr_pool_flush_worker calls
"
 rds_ib_flush_mr_pool(pool, 0, NULL);
"
Then in function
"
int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
                         int free_all, struct rds_ib_mr **ibmr_ret)
"
ibmr_ret is NULL.
In the source code,
"
...
list_to_llist_nodes(pool, &unmap_list, &clean_nodes, &clean_tail);
if (ibmr_ret)
        *ibmr_ret = llist_entry(clean_nodes, struct rds_ib_mr, llnode);
/* more than one entry in llist nodes */
if (clean_nodes->next)
        llist_add_batch(clean_nodes->next, clean_tail, &pool->clean_list);
...
"
When ibmr_ret is NULL, llist_entry is not executed, so the first node
(clean_nodes) is not handed back to any caller. Yet only clean_nodes->next,
not clean_nodes itself, is added to clean_list.
So clean_nodes is discarded and can never be used again.
The workqueue runs periodically, so more and more nodes are discarded
until clean_list is finally NULL.
Then this problem occurs.
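
The leak described above can be reproduced in a small userspace sketch. This is
not the kernel code: struct node, struct pool, add_batch, return_cleaned and
simulate_flushes are hypothetical stand-ins for struct llist_node, the pool's
clean_list, llist_add_batch and the tail of rds_ib_flush_mr_pool. The "fixed"
branch is only one plausible way to keep the head node (consume it only when
the caller asked for it) and is an assumption, not necessarily the change this
commit makes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct llist_node. */
struct node {
	struct node *next;
};

/* Hypothetical stand-in for the part of the MR pool that holds clean_list. */
struct pool {
	struct node *clean_list;
};

/* Single-threaded analogue of llist_add_batch(): push first..last onto the pool. */
static void add_batch(struct node *first, struct node *last, struct pool *pool)
{
	last->next = pool->clean_list;
	pool->clean_list = first;
}

/* Count the nodes still reachable from clean_list. */
static int clean_count(const struct pool *pool)
{
	int n = 0;

	for (const struct node *p = pool->clean_list; p; p = p->next)
		n++;
	return n;
}

/*
 * Return a cleaned chain to the pool.
 * fixed == 0: follows the quoted logic, the head is always skipped.
 * fixed != 0: assumed fix, the head is skipped only when it is handed to the caller.
 */
static void return_cleaned(struct pool *pool, struct node *clean_nodes,
			   struct node *clean_tail, struct node **ret, int fixed)
{
	assert(pool && clean_nodes && clean_tail);

	if (!fixed) {
		if (ret)
			*ret = clean_nodes;
		/* more than one entry in the chain */
		if (clean_nodes->next)
			add_batch(clean_nodes->next, clean_tail, pool);
		return;
	}

	if (ret) {
		*ret = clean_nodes;
		clean_nodes = clean_nodes->next;
	}
	if (clean_nodes)
		add_batch(clean_nodes, clean_tail, pool);
}

/*
 * Run `rounds` background flushes (ret == NULL), each returning a chain of
 * `len` nodes, and report how many nodes end up on clean_list.
 */
static int simulate_flushes(int rounds, int len, int fixed)
{
	static struct node storage[64];
	struct pool pool = { NULL };
	int used = 0;

	assert(len >= 1 && rounds >= 0 && rounds * len <= 64);

	for (int r = 0; r < rounds; r++) {
		struct node *first = &storage[used];
		struct node *last = &storage[used + len - 1];

		/* Build a NULL-terminated chain of len nodes. */
		for (int i = 0; i < len; i++)
			storage[used + i].next = (i + 1 < len) ? &storage[used + i + 1] : NULL;
		used += len;

		return_cleaned(&pool, first, last, NULL, fixed);
	}
	return clean_count(&pool);
}
```

With four flushes of three nodes each, simulate_flushes(4, 3, 0) leaves 8 nodes
on clean_list (one lost per flush), while simulate_flushes(4, 3, 1) keeps all
12. When every flush cleans a single node, the quoted logic returns nothing at
all, which matches the pool draining over several hours.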
Fixes: 
		
	
| Name |
|---|
| af_rds.c |
| bind.c |
| cong.c |
| connection.c |
| ib.c |
| ib.h |
| ib_cm.c |
| ib_fmr.c |
| ib_frmr.c |
| ib_mr.h |
| ib_rdma.c |
| ib_recv.c |
| ib_ring.c |
| ib_send.c |
| ib_stats.c |
| ib_sysctl.c |
| info.c |
| info.h |
| Kconfig |
| loop.c |
| loop.h |
| Makefile |
| message.c |
| page.c |
| rdma.c |
| rdma_transport.c |
| rdma_transport.h |
| rds.h |
| rds_single_path.h |
| recv.c |
| send.c |
| stats.c |
| sysctl.c |
| tcp.c |
| tcp.h |
| tcp_connect.c |
| tcp_listen.c |
| tcp_recv.c |
| tcp_send.c |
| tcp_stats.c |
| threads.c |
| transport.c |