prov/efa: avoid gdr_pin/gdr_map for dmabuf mrs

efa_mr_hmem_setup previously called ofi_hmem_dev_register for every FI_HMEM_CUDA registration, regardless of whether FI_MR_DMABUF was set in flags. When gdrcopy is enabled, this means deconstructing the fi_mr_dmabuf into a struct iovec from its {base, offset, len} 3-tuple, then passing the resulting iovec to gdr_pin followed by gdr_map.

A dmabuf cannot be exported by the nvidia module without an implicit promise that the address space is already reserved and mapped in the current pid, with appropriate size and alignment, and that all pages/ranges backing it can be made available to an importer. All of these requirements are enforced by the cuda APIs used to acquire a dmabuf.

At best, the calls to libgdrcopy here are unnecessary for dmabufs. At worst, the pgprots set by gdrdrv are different enough from the ones set up by cuda proper to cause issues, or the redundant mappings become costly for the driver to maintain.

Prior to this patch, apps could only prevent these gdr_map calls on dmabuf arguments by disabling gdrcopy entirely through environment variables before launch. But apps may wish to use fi_mr_regattr with dmabuf arguments in the default case, while still reserving the right to call fi_mr_regattr with iov arguments on the same domain, where the gdr flow may still be desired. This patch makes that possible.

Signed-off-by: Nicholas Sielicki <[email protected]>
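
For illustration only, the behavior described above can be sketched as a standalone C fragment. It is not the provider code: `SKETCH_MR_DMABUF`, `struct sketch_dmabuf`, `sketch_iov_from_dmabuf` and `sketch_should_dev_register` are hypothetical names invented here, the flag value is a placeholder rather than the real FI_MR_DMABUF bit, and the base-plus-offset arithmetic is an assumption about how the {base, offset, len} 3-tuple maps onto an iovec.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Placeholder bit for this sketch only; the real FI_MR_DMABUF value
 * is defined by the libfabric headers. */
#define SKETCH_MR_DMABUF (1ULL << 0)

/* Hypothetical mirror of the {base, offset, len} 3-tuple carried by a
 * dmabuf registration request. */
struct sketch_dmabuf {
	void *base_addr;
	uint64_t offset;
	size_t len;
};

/* Assumed conversion used by the old flow: the iovec starts at
 * base + offset and spans len bytes. */
static inline struct iovec sketch_iov_from_dmabuf(const struct sketch_dmabuf *d)
{
	struct iovec iov;

	iov.iov_base = (char *)d->base_addr + d->offset;
	iov.iov_len = d->len;
	return iov;
}

/* Gate applied after this patch: take the gdrcopy registration path
 * only for iov-based registrations, and only when gdrcopy is enabled. */
static inline bool sketch_should_dev_register(uint64_t flags, bool gdrcopy_enabled)
{
	return !(flags & SKETCH_MR_DMABUF) && gdrcopy_enabled;
}
```

With this gate, a dmabuf registration never reaches the gdr_pin/gdr_map flow, while an iov registration on the same domain still does whenever gdrcopy is enabled.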
pmodels · Oct 10, 2024 · 8708b5c
1 parent 2c7d786
Showing 1 changed file with 2 additions and 7 deletions.
diff --git a/prov/efa/src/efa_mr.c b/prov/efa/src/efa_mr.c
@@ -184,12 +184,6 @@ static int efa_mr_hmem_setup(struct efa_mr *efa_mr,
 {
 	int err;
 	struct iovec mr_iov = {0};
-
-	if (flags & FI_MR_DMABUF)
-		ofi_mr_get_iov_from_dmabuf(&mr_iov, attr->dmabuf, 1);
-	else
-		mr_iov = *attr->mr_iov;
-
 	efa_mr->peer.flags = flags;
 
 	if (attr->iface == FI_HMEM_SYSTEM) {
@@ -227,7 +221,8 @@ static int efa_mr_hmem_setup(struct efa_mr *efa_mr,
 		efa_mr->needs_sync = true;
 		efa_mr->peer.device.cuda = attr->device.cuda;
 
-		if (cuda_is_gdrcopy_enabled()) {
+		if (!(flags & FI_MR_DMABUF) && cuda_is_gdrcopy_enabled()) {
+			mr_iov = *attr->mr_iov;
 			err = ofi_hmem_dev_register(FI_HMEM_CUDA, mr_iov.iov_base, mr_iov.iov_len,
 						    (uint64_t *)&efa_mr->peer.hmem_data);
 			efa_mr->peer.flags |= OFI_HMEM_DATA_DEV_REG_HANDLE;