[1737742661.094513] [cn0244:260472:0] ib_mlx5_dv.c:430 UCX ERROR mlx5dv_devx_obj_create(QP) failed on mlx5_0, syndrome 0x2c4154: Remote I/O error
[cn0244:260472] pml_ucx.c:421 Error: ucp_ep_create(proc=47) failed: Input/output error
[cn0244:260472] pml_ucx.c:472 Error: Failed to resolve UCX endpoint for rank 19
[LOG_CAT_COMMPATTERNS] isend failed in comm_allreduce_pml at iterations 1
@keisukefukuda @yshestakov @pathscale @khamidouche @yuq
Why is this kind of error occurring? Any help would be appreciated.