LS-DYNA mpp-dyna Cardinal: Remote access error on mlx5_0:1, RDMA_READ

Category: 
Resolution: Unresolved

You may encounter the following error when running mpp-dyna jobs on multiple nodes:

[c0054:22206:0:22206] ib_mlx5_log.c:179  Remote access error on mlx5_0:1/IB (synd 0x13 vend 0x88 hw_synd 0/0)
[c0054:22206:0:22206] ib_mlx5_log.c:179  RC QP 0xef8 wqe[365]: RDMA_READ s-- [rva 0x32a5cb38 rkey 0x20000] [va 0x319d3bf0 len 10200 lkey 0x2e5f98] [rqpn 0xfb8 dlid=2285 sl=0 port=1 src_path_bits=0]
forrtl: error (76): Abort trap signal
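
To check whether a failed job hit this particular error, the job output can be scanned for the "Remote access error" line. The following is a minimal sketch, not an official diagnostic tool: the function name `find_remote_access_errors` and the regular expression are illustrative assumptions based only on the message format shown above, and other software versions may format the line differently.

```python
import re

# Assumed pattern, derived from the single example line in this notice.
# Group names (host, pid, device, port) are illustrative labels.
REMOTE_ACCESS_RE = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+):\d+:\d+\]\s+ib_mlx5_log\.c:\d+\s+"
    r"Remote access error on (?P<device>\w+):(?P<port>\d+)/IB"
)


def find_remote_access_errors(lines):
    """Return a (host, pid, device, port) tuple for each matching line."""
    hits = []
    for line in lines:
        match = REMOTE_ACCESS_RE.search(line)
        if match:
            hits.append(
                (
                    match["host"],
                    int(match["pid"]),
                    match["device"],
                    int(match["port"]),
                )
            )
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python script.py <job output file>
    with open(sys.argv[1], errors="replace") as handle:
        for host, pid, device, port in find_remote_access_errors(handle):
            print(f"node={host} pid={pid} device={device} port={port}")
```

The reported node names can help when describing the failure in a support ticket, since the cause is not yet known.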

Cause of the Error

The cause of this error is currently unknown.

Affected versions

mpp-dyna versions 11 and 13, when running on multiple nodes.

Workaround

There is no known workaround yet.