Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
prov/efa: avoid gdr_pin/gdr_map for dmabuf mrs
efa_mr_hmem_setup previously always called ofi_hmem_dev_register on all FI_HMEM_CUDA calls, regardless of the presence of FI_MR_DMABUF in flags. When gdrcopy is enabled, this means deconstructing the fi_mr_dmabuf into a struct iovec from its {base, offset, len} 3-tuple, then passing the resulting iovec to gdr_pin followed by gdr_map. a dmabuf cannot be exported by the nvidia module without an implicit promise that the address space is already reserved and mapped in the current pid, of appropriate size and alignment, and that all pages/ranges backing it can be made available to an importer. All requirements are enforced by the cuda APIs used to acquire one. At best, calls to libgdrcopy here are unnecessary for dmabufs, and at worst the pgprots set by gdrdrv are different enough from the ones setup by cuda proper to cause issues, or the redundant mappings become costly for the driver to maintain. Prior to this patch, apps can only prevent these gdr_map calls on dmabuf arguments by disabling gdrcopy entirely through environment variables before launch. But apps may wish to use fi_mr_regattr with dmabuf arguments in the default case, while still reserving the right to call fi_mr_regattr with iov arguments on the same domain, where the gdr flow may still be desired in the latter case. This makes that possible. Signed-off-by: Nicholas Sielicki <[email protected]>
- Loading branch information