device: Reserved 64 bits for device identifier if using non-standard HMEM interface. This field is ignored unless the iface field is valid. Otherwise, the device field is determined by the value specified through iface.
cuda: For FI_HMEM_CUDA, this is equivalent to CUdevice (int).
However, MPICH uses attr->device, which is obtained from cudaPointerGetAttributes.
I am not familiar with the difference between the handle and the device id, but the documentation of cuDeviceGet seems to suggest there is one:
Returns a handle to a compute device.
Parameters
device - Returned device handle
ordinal - Device number to get handle for
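To make the distinction concrete, here is a minimal sketch of translating an ordinal (such as the device value reported by cudaPointerGetAttributes) into a handle before it is stored in the cuda field. It is not MPICH's code. The names device_handle_t, handle_lookup_fn, ordinal_to_handle and fake_lookup are hypothetical and exist only so the sketch compiles without the CUDA toolkit; in real code the handle type would be CUdevice and the lookup would wrap cuDeviceGet.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so this sketch builds without <cuda.h>.
 * In real code, device_handle_t would be CUdevice (an int). */
typedef int device_handle_t;
typedef int (*handle_lookup_fn)(device_handle_t *handle, int ordinal);

/* Translate a device ordinal into the handle that the regattr cuda field
 * is documented to hold. Returns 0 on success, otherwise the nonzero
 * error code from the lookup; *out is left untouched on failure. */
static int ordinal_to_handle(int ordinal, handle_lookup_fn lookup,
                             device_handle_t *out)
{
    assert(lookup != NULL && out != NULL);

    device_handle_t handle;
    int rc = lookup(&handle, ordinal);
    if (rc != 0)
        return rc;

    *out = handle;
    return 0;
}

/* Illustrative lookup in which handle != ordinal, to show that the two
 * values should not be assumed interchangeable. This does not model how
 * any real driver assigns handles. */
static int fake_lookup(device_handle_t *handle, int ordinal)
{
    if (ordinal < 0 || ordinal >= 4)
        return 1;
    *handle = 100 + ordinal;
    return 0;
}

/* With CUDA available, the lookup could be a thin wrapper (an assumption,
 * not taken from MPICH):
 *
 *   static int cuda_lookup(device_handle_t *h, int ordinal) {
 *       return cuDeviceGet(h, ordinal) == CUDA_SUCCESS ? 0 : 1;
 *   }
 */
```

Whether the handle and the ordinal actually differ in value on a given system is a separate question. The sketch only shows where the translation step would sit if the documented contract is followed.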
raffenet added a commit to raffenet/mpich that referenced this issue on Oct 2, 2024:
Libfabric docs say that the value of the cuda field in the regattr
struct is the device handle gotten from cuDeviceGet, not the
ordinal. Fixes pmodels#7148.
raffenet added a commit to raffenet/mpich that referenced this issue on Oct 9, 2024:
Libfabric docs say that the value of the cuda field in the regattr
struct is the device handle gotten from cuDeviceGet, not the
ordinal. Fixes pmodels#7148.
Hi all,
when taking a closer look at #7140 I realized that MPICH seems not to use the correct handle for fi_mr_regattr. The documentation specifies the device and cuda fields as quoted at the top of this issue.