From 781093e3b5857cd3e543b6138f2c830eace70c84 Mon Sep 17 00:00:00 2001 From: Jianxin Xiong Date: Fri, 15 Mar 2024 10:38:44 -0700 Subject: [PATCH] v1.21.0rc1 Signed-off-by: Jianxin Xiong --- AUTHORS | 18 ++ Makefile.am | 2 +- NEWS.md | 546 ++++++++++++++++++++++++++++++++++++++- README.md | 67 ++--- configure.ac | 2 +- fabtests/configure.ac | 2 +- include/rdma/fabric.h | 2 +- include/windows/config.h | 2 +- man/fi_mlx.7.md | 20 -- man/fi_netdir.7.md | 116 --------- man/fi_provider.7.md | 4 + man/man7/fi_mlx.7 | 16 -- man/man7/fi_netdir.7 | 112 -------- 13 files changed, 600 insertions(+), 309 deletions(-) delete mode 100644 man/fi_mlx.7.md delete mode 100644 man/fi_netdir.7.md delete mode 100644 man/man7/fi_mlx.7 delete mode 100644 man/man7/fi_netdir.7 diff --git a/AUTHORS b/AUTHORS index 0928a5e4d49..e5ef4f08911 100644 --- a/AUTHORS +++ b/AUTHORS @@ -8,6 +8,7 @@ Alex McKinley Alex McKinley Alexia Ingerson alexia.ingerson +Amir Shehata Amir Shehata Amith Abraham Ana Guerrero López @@ -17,6 +18,7 @@ Andrew Friedley Andrey Lobanov Anthony Zinger Ao Li +Archana Venkatesha Arun C Ilango arun ilango Arun Ilango @@ -28,6 +30,7 @@ AWS ParallelCluster user AWS ParallelCluster user aws-ceenugal <123417666+aws-ceenugal@users.noreply.github.com> Ben Lynam +Ben Lynam Ben Lynam Ben Menadue Ben Turrubiates @@ -43,6 +46,7 @@ Brian J. 
Murrell Brian Li bwilsoncn Casey Carter +chadkoster-hpe Chang Hyun Park Charles J Archer Charles King @@ -52,10 +56,13 @@ Chen Zhao Chenwei Zhang Chien Tin Tung Chris Dolan +Chris Taylor Chuck Fossen +Cody Mann Coni Gehler ct-clmsn Dardo D Kleiner +dariuszsciebura <93722774+dariuszsciebura@users.noreply.github.com> Darryl Abbate Dave Goodell David Noel @@ -66,6 +73,8 @@ Dmitry Durnov Dmitry Gladkov Doug Oucharek Edgar Gabriel +Elias Kozah +Elias Kozah Eric Raut Erik Paulson Erik Paulson @@ -76,6 +85,7 @@ Evgeny Leksikov Ezra Kissel Firas Jahjah Frank Zago +Franz Pöschel fullerdj Gal Pressman Gengbin Zheng @@ -110,7 +120,9 @@ Jeff Squyres Jerome Berryhill Jerome Boyd Berryhill Jerome Soumagne +Jessie Yang Jiakun Yan +Jianshui Yu Jianxin Xiong jianxin.xiong Jie Zhang @@ -118,6 +130,7 @@ Jim Snow Jingyin Tang Jithin Jose Joe Doyle +Joe Nemeth Johannes Ziegenbalg John Biddiscombe John Byrne @@ -139,6 +152,7 @@ kseager Kyle Gerheiser Latchesar Ionkov Leena Radeke +Lindsay Reiser Lisanna Dettwyler Lisanna Dettwyler Lukasz Dorau @@ -163,6 +177,7 @@ mmubarak Mohan Gandhi muttormark Neil Spruit +Nicholas Sielicki Nicolas Morey-Chaisemartin Nikhil Nanal nikhilnanal @@ -204,6 +219,7 @@ Robert Wespetal Rohit Zambre Ryan Hankins Ryan Hankins +Rémi Dehenne Sai Sunku Sannikov, Alexander Sayantan Sur @@ -237,7 +253,9 @@ Thomas Gillis Thomas Huber Thomas Huber Thomas Smith +thomasgillis Thorsten Schütt +Tim Hu Tim Thompson <80290075+timothom64@users.noreply.github.com> Tim Thompson Todd Rimmer diff --git a/Makefile.am b/Makefile.am index b386d107b25..8571a583c18 100644 --- a/Makefile.am +++ b/Makefile.am @@ -220,7 +220,7 @@ src_libfabric_la_LIBADD = src_libfabric_la_DEPENDENCIES = libfabric.map if !EMBEDDED -src_libfabric_la_LDFLAGS += -version-info 24:0:23 +src_libfabric_la_LDFLAGS += -version-info 25:0:24 endif src_libfabric_la_LDFLAGS += -export-dynamic \ $(libfabric_version_script) diff --git a/NEWS.md b/NEWS.md index 152c685de3e..7848b17a410 100644 --- a/NEWS.md +++ b/NEWS.md 
@@ -6,35 +6,579 @@ bug fixes (and other actions) for each version of Libfabric since version 1.0. New major releases include all fixes from minor releases with earlier release dates. -v1.20.0, Fri Nov 17, 2023 +v1.21.0, Fri Mar 22, 2024 ======================== ## Core +## BGQ + +Removed. + +## CXI + +New provider supporting Cray's Slingshot network. + ## EFA +## GNI + +Removed. + ## Hooks +## NETDIR + +Removed. The functionality is integrated into the verbs provider. + ## OPX ## Peer ## PSM3 +## RSTREAM + +Removed. + +## RXM + +## SHM + +## TCP + +## UCX + +## Util + +## Verbs + + +## Fabtests + + +v1.20.1, Mon Jan 22, 2024 +========================= + +## Core + +- hmem/ze: Change the library name passed to dlopen +- hmem/ze: map device id to physical device +- hmem/ze: skip duplicate initialization +- hmem/ze: dynamically allocate device resources based on number of devices +- hmem/ze: fix hmem_ze_copy_engine variable look up +- hmem/ze: Increase ZE_MAX_DEVICES to 32 +- man: Fix typo in fi_getinfo man page +- Fix compiler warning when compiling with ICX +- man: Fix fi_rxm.7 and fi_collective.3 man pages +- man: Update EFA docs for FI_EFA_INTER_MIN_READ_WRITE_SIZE + +## EFA + +- efa_rdm_ep_record_tx_op_submitted() rm peer lookup +- Remove peer lookup from efa_rdm_pke_sendv() +- Make handshake response use txe +- test: Only close SHM if SHM peer is Created +- Handshake code allocs txe via efa util +- Initialize txe.rma_iov_count to 0 +- Switch fi_addr to efa_rdm_peer in trigger_handshake +- Downgrade EFA Endpoint Creation WARN to INFO +- Init srx_ctx before use +- Clean up generic_send path +- Pass in efa_rdm_ep to efa_rdm_msg_generic_recv() +- Make recv path slightly more efficient +- re-org rma write to avoid duplicate checks +- Add missing sync_memops call to writedata +- use peer pointer from txe in read, write and send +- Pass in peer pointer to txe +- Get rid of noop instruction from empty #define +- Remove noop memset +- Fix the ibv cq error handling.
+- Don't do handshake for local read +- Fix a typo in configure.m4 +- Make runt_size aligned + +## NetDir + +- Add missing unlock in error path of nd_send_ack() + +## OPX + +- Initialize cq error data size + +## RXM + +- Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1 + +## SHM + +- Fix coverity issue about resource leak +- Adjust the order of smr_region fields. +- Allocate peer device fds dynamically + +## Util + +- Fix coverity issue about missing lock +- Implement timeout in util_wait_yield_run() +- Fix bug in util_cq startup error case +- util_mem_hooks: add missing parantheses + +## Verbs + +- Windows: Resolve regression in user data retrieval + +## Fabtests + +- efa: Close ibv device after use +- efa: Get device MR limit from ibv_query_device +- efa: Add simple unexpected test to MR exhaustion test +- pytest: add a new ssh connection error pattern + + +v1.19.1, Mon Jan 22, 2024 +========================= + +## Core + +- hmem/ze: Change the library name passed to dlopen +- hmem/ze: map device id to physical device +- hmem/ze: skip duplicate initialization +- hmem/ze: dynamically allocate device resources based on number of devices +- hmem/ze: fix hmem_ze_copy_engine variable look up +- hmem/ze: Increase ZE_MAX_DEVICES to 32 +- man: Fix typo in fi_getinfo man page +- Fix compiler warning when compiling with ICX +- man: Fix fi_rxm.7 and fi_collective.3 man pages +- man: Fix the fi_provider.7 man page for the man page converter +- hmem/synapseai: Refine the error handling and warning +- configure.ac Fix `--with-lttng` causing `yes/` to populate {CPP,LD}FLAGS +- hmem: Only initalize synapseai if device exists +- hmem/ze: fix incorrect device id in copy function +- configure.ac: Fix `with_synaposeai` typo + +## EFA + +- Fix the ibv cq error handling. 
+- Don't do handshake for local read +- Don't do handshake for local fi_write +- Make runt_size aligned +- Add pingpong test after exhausting MRs +- Introduce utilities to exhaust MRs on EFA device +- Add read nack protocol docs +- Receiver send NACK if runt read fails with ENOMR +- Sender switch to long CTS protocol if runt read fails with ENOMR +- Receiver send NACK if long read fails with ENOMR +- Update efa_rdm_rxe_map_remove to accept msg_id and addr +- Sender switch to long CTS protocol if long read fails with ENOMR +- Introduce new READ_NACK feature +- Do not abort on all deprecated env vars +- Allocate pke_vec, recv_wr_vec, sge_vec from heap +- Close shm resource when it is disabled in ep +- Disable RUNTING for Neuron +- Move cuda-sync-memops from MR to EP +- Do not insert shm av inside efa progress engine +- Fix coverity warning in efa_mr_reg_impl +- Fix typos in packet macros +- Adjust posted receive size to pkt_size +- RDMA write with immediate data completion bugfix +- Do not create SHM peer when SHM is disabled +- Use correct threading model for shm +- Restrict RDMA read to compatible EFA devices +- Add EFA device version to handshake +- Cleanup/fix some unit test code +- Touch up RDM protocol header, doc +- Fix efa device name matching +- Add missing locks in efa_cntr_wait. +- Fix the efa_env_initialize() call sequence. +- Fix a compilation warning +- Handle RNRs from RDMA writedata +- Add writedata RNR fabtest +- Correct typo in RMA context type + +## NetDir + +- Add missing unlock in error path of nd_send_ack() + +## RXM + +- Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1 + +## SHM + +- Fix coverity issue about resource leak +- Allocate peer device fds dynamically +- Add memory barrier before updating resp for atomic +- Use peer cntr inc ops in smr_progress_cmd +- Only increment tx cntr when inject rma succeeded. + +## TCP + +- Pass through rdm_ep flags to msg eps. 
+- Derive cq flags from op and msg flags +- Set FI_MULTI_RECV for last completed RX slice + +## UCX + +- Initialize ep_flush to 1 + +## Util + +- Fix coverity issue about missing lock +- Implement timeout in util_wait_yield_run() +- memhooks: Fix a bug when calculating mprotect region + +## Verbs + +- Windows: Resolve regression in user data retrieval +- Windows: Check error code from GetPrivateData +- Bug fix for matching domain name with device name + +## Fabtests + +- efa: Close ibv device after use +- efa: Get device MR limit from ibv_query_device +- efa: Add simple unexpected test to MR exhaustion test +- pytest: Add a new ssh connection error pattern +- Make ft_force_progress non-static +- memcopy-xe: Fix data verification error for device buffer +- dmabuf: Increase the number of NICs that can be tested +- cq_data: Relax CQ data validation to cq_data_size +- dmabuf: Handle partial read scenario for fi_xe_rdmabw test +- pytest/efa: Add cuda memory marker + + +v1.18.3, Mon Jan 22, 2024 +========================= + +## Core + +- hmem/ze: Change the library name passed to dlopen +- hmem/ze: map device id to physical device +- hmem/ze: skip duplicate initialization +- hmem/ze: dynamically allocate device resources based on number of devices +- hmem/ze: fix hmem_ze_copy_engine variable look up +- hmem/ze: Increase ZE_MAX_DEVICES to 32 +- man: Fix typo in fi_getinfo man page +- man: Fix fi_rxm.7 and fi_collective.3 man pages +- man: Fix the fi_provider.7 man page for the man page converter +- configure.ac Fix `--with-lttng` causing `yes/` to populate {CPP,LD}FLAGS +- hmem/ze: fix incorrect device id in copy function +- configure.ac: Fix `with_synaposeai` typo + +## EFA + +- Fix efa device name matching +- Add writedata RNR fabtest +- Handle RNRs from RDMA writedata + +## NetDir + +- Add missing unlock in error path of nd_send_ack() +- Release lock prior to returning from nd_send_ack + +## RXM + +- Fix data error with FI_OFI_RXM_USE_RNDV_WRITE=1 + +## SHM + +- Fix 
coverity issue about resource leak +- Allocate peer device fds dynamically + +## TCP + +- Pass through rdm_ep flags to msg eps. +- Derive cq flags from op and msg flags +- Set FI_MULTI_RECV for last completed RX slice + +## UCX + +- Initialize ep_flush to 1 + +## Util + +- Fix coverity issue about missing lock +- Implement timeout in util_wait_yield_run() +- memhooks: Fix a bug when calculating mprotect region + +## Verbs + +- Windows: Resolve regression in user data retrieval +- Windows: Check error code from GetPrivateData +- Bug fix for matching domain name with device name + +## Fabtests + +- rdm_tagged_peek: Fix race condition synchronization +- Make rdm_tagged_peek test more general +- Split cq_read and cq_readerr in ft_spin_for_comp +- sock_test: Do not use epoll if not available +- Use dummy ft_pin_core on macOS +- Avoid using memset function name +- Fix some header includes +- memcopy-xe: Fix data verification error for device buffer +- dmabuf: Increase the number of NICs that can be tested +- dmabuf: Handle partial read scenario for fi_xe_rdmabw test +- pytest/efa: add cuda memory marker + + +v1.20.0, Fri Nov 17, 2023 +========================= + +## Core + +- General bug fixes and code clean-up +- configure.ac: add extra check for 128 bit atomic support +- hmem/synapseai: Refine the error handling and warning +- Introduce FI_ENOMR +- hmem/cuda: fix a bug when calculating aligned size. +- Handle dmabuf for ofi_mr_cache* functions. +- Handle dmabuf flag in ofi_mr_attr_update +- Handle dmabuf for mr_map insert. 
+- man: Fix the description of virtual address when FI_MR_DMABUF is set +- man: Clarify the defition of FI_OPT_MIN_MULTI_RECV +- hmem/cuda: Add dmabuf fd ops functions +- include/ofi_atomic_queue: Properly align atomic values +- Define fi_av_set_user_id +- Support multiple auth keys per EP +- Simplify restricted-dl feature +- hmem: Only initalize synapseai if device exists +- Add "--enable-profile" option +- windows: Updated config.h +- Add environment variable for selective HMEM initialization +- Add restricted dlopen flag to configure options +- hmem: generalize the use of OFI_HMEM_DATA to non-cuda iface +- hmem: fail cuda_dev_register if gdrcopy is not enabled +- Add 1.7 ABI compat +- Define fi_domain_attr::max_ep_auth_key +- hmem: Add new op to hmem_ops for getting dmabuf fd +- hmem/cuda: Update cuda_gdrcopy_dev_register's signature +- mr_cache: Define ofi_mr_info::flags +- Add ABI compat for fi_cq_err_entry::src_addr +- Define fi_cq_err_entry::src_addr +- Add base_addr to fi_mr_dmabuf +- hmem: Set FI_HMEM_HOST_ALLOC for ze addr valid +- hmem: Support dev reg with FI_HMEM_ZE +- tostr: Added fi_tostr() for data type struct fi_cq_err_entry. 
+- hmem_ze: fix incorrect device id in copy function +- Introduce new profiling interface for low-level statistics +- hmem: Support dev reg with FI_HMEM_CUDA +- hmem: Support dev reg with FI_HMEM_ROCR +- hmem: Support dev reg with FI_HMEM_SYSTEM +- hmem: Define optimized HMEM memcpy APIs +- Implement memhooks atfork child handler +- hmem: Support ofi_hmem_get_base_addr with sys mem +- hmem: Add length field to ofi_hmem_get_base_addr +- mr_cache: Improve cache hit rate +- mr_cache: Purge dead regions in find +- mr_cache: Update find to remove invalid MR entries +- mr_cache: Update find with MM valid check +- Add direct support for dma-buf memory registration +- man/fi_tagged: Remove the peek for data ability +- indexer: Add byte idx abstraction +- Add missing FI_REMOTE_CQ_DATA for fi_inject_writedata +- Add configure flags for more sanitizers +- Fix fi_peer man page inconsistency +- include/fi_peer: Add cq_data to rx_entry, allow peer to modify on unexp +- Add XPMEM support + +## EFA + +- General bug fix and code clean-up +- Do not abort on all deprecated env vars +- Onboard fi_mr_dmabuf API in mem reg ops. 
+- Try registering cuda memory via dmabuf when checking p2p +- Introduce HAVE_EFA_DMABUF_MR macro in configure +- Add read nack protocol docs +- Receiver send NACK if runt read fails with ENOMR +- Sender switch to long CTS protocol if runt read fails with ENOMR +- Receiver send NACK if long read fails with ENOMR +- Update efa_rdm_rxe_map_remove to accept msg_id and addr +- Sender switch to long CTS protocol if long read fails with ENOMR +- Introduce new READ_NACK feature +- Use SHM's full inject size +- Add testing for small messages without inject +- Enable inject rdma write +- Use bounce buffer for 0 byte writes +- Onboard ofi_hmem_dev_register API +- Update cuda_gdrcopy_dev_register's signature +- Allocate pke_vec, recv_wr_vec, sge_vec from heap +- Close shm resource when it is disabled in ep +- Disable RUNTING for Neuron +- Move cuda-sync-memops from MR to EP +- Do not insert shm av inside efa progress engine +- Enable shm when FI_HMEM and FI_ATOMIC are requested +- Adjust posted receive size to pkt_size +- Do not create SHM peer when SHM is disabled +- Use correct threading model for shm +- Restrict RDMA read to compatible EFA devices +- Add EFA device version to handshake +- Add missing locks in efa_cntr_wait. +- Add writedata RNR fabtest +- Handle RNRs from RDMA writedata +- Check opt_len in efa_rdm_ep_getopt +- Use correct tx/rx op_flags for shm + +## Hooks + +- dmabuf: Initialize fd to supress compiler warning +- trace: Add log on FI_VAR_UNEXP_MSG_CNT when enabled. +- trace: Fixed trace log format on some attributes. 
+ +## OPX + +- Fix compiler warnings + +## PSM3 + +- Fix compiler warnings +- Update provider to sync with IEFS 11.5.1.1.1 + ## RXM +- Remove unused function +- Use gdrcopy in rma when emulating injection +- Use gdrcopy in eager send/recv +- Add hmem gdrcopy functions +- Remove unused dynamic rbuf support + ## SHM +- General bug fixes and cleanup +- Add ofi_buf_alloc error handling +- Only copy header + msg on unexpected path +- Add FI_HMEM atomic support +- Add memory barrier before updating resp for atomic +- Add more error output +- Reduce atomic locking with ofi_mr_map_verify +- Only increment tx cntr when inject rma succeeded. +- Use peer cntr inc ops in smr_progress_cmd +- Allow for inject protocol to buffer more unexpected messages +- Change pending fs to bufpool to allow it to grow +- Add unexpected SAR buffering +- Use generic acronym for shm cap +- Move CMA to use the p2p infrastructure +- Add p2p abstraction +- Load DSA dependency dynamically +- Replace tx_lock with ep_lock +- Calculate comp vars when writing completion +- Move progress_sar above progress_cmd +- Rename SAR status enum to be more clear +- Make SAR protocol handle 0 byte transfer. 
+- Move selection logic to smr_select_proto() + +## Sockets + +- Fix compiler warnings +- Fix provider name and api version in returned fi_info struct + ## TCP +- Add profiling interface support +- Pass through rdm_ep flags to msg eps +- Derive cq flags from op and msg flags +- Do not progress ep that is disconnected +- Set FI_MULTI_RECV for last completed RX slice +- Return an error if invalid sequence number received +- xnet_progress_rx() must only be called when connected +- Reset ep->rx_avail to 0 after RX queue is flushed +- Disable the EP if an error is detected for zero-copy +- Add debug tracking of transfer entries +- Negotiate support for rendezvous +- Add rendezvous protocol option +- Generalize xnet_send_ack +- Flatten protocol header definitions +- Remove unused dynamic rbuf support +- Define tcp specific protocol ops +- Remove unneeded and incorrect rx_entry init code + ## UCX +- Add FI_HMEM support +- Initialize ep_flush to 1 + ## Util +- General bug fixes +- memhooks: Fix a bug when calculating mprotect region +- Check the return value of ofi_genlock_init() +- Update checks for FI_AV_AUTH_KEY +- Define domain primary and secondary caps +- Add profiling util functions +- Update util_cq to support err_data +- Update ofi_cq_readerr to use new memcpy +- Update ofi_cq_err_memcpy to handle err_data +- Zero util cancel err entry +- Move FI_REMOTE/LOCAL_COMM to secondary caps +- Alter domain max_ep_auth_key +- Add domain checks for max_ep_auth_key +- Revert util_cntr->ep_list_lock to ofi_mutex +- Add NIC FID functions to ofi.h +- Add EP and domain auth key checking +- Add bounds checks to ibuf get +- Define dlist_first_entry_or_null +- Update util_getinfo to dup auth_key +- Revert util_av, util_cq and util_cntr to mutex +- Add missing calls to (de)initialize monitor's mutexes +- Avoid attempting to cleanup an uninitialized MR cache +- Rename ofi_mr_info fields +- Add rv64g support to memory hooks + ## Verbs +- Windows: Check error code from GetPrivateData +- 
Add missing lock to protect SRX +- Add synapseai dmabuf mr support +- Bug fix for matching domain name with device name +- Windows: Fetch rejected connection data +- Add support for DMA-buf memory registration +- Windows: Fix use-after-free in case of failure in fi_listen +- Windows: Map ND request type to ibverbs opcode +- Fix memory leak when creating EQ with unsupported wait object +- Track ep state to prevent duplicate shutdown events + ## Fabtests +- Update man page +- pytests/efa: onboard dmabuf argument for test_mr +- pytest: make do_dmabuf_reg_for_hmem an cmdline argument +- Bump Libfabric API version. +- mr_test: Add dmabuf support +- Introduce ft_get_dmabuf_from_iov +- unexpected_msg: Use ft_reg_mr to register memory +- pytest: Allow registering mr with dmabuf +- Add dmabuf support to ft_reg_mr +- Add dmabuf ops for cuda. +- Test max inject size +- Add FI_HMEM support to fi_rdm_rma_event and fi_rdm tests +- memcopy-xe: Fix data verification error for device buffer +- dmabuf-rdma: Increase the number of NICs that can be tested +- dmabuf-rdma: Remove redundant libze_ops definition +- fi-mr-reg-xe: Skip native dmabuf reg test for system memory +- Check if fi_info is returned correctly in case of FI_CONNREQ +- cq_data: relax CQ data validation to cq_data_size +- Add ZE host alloc function +- Use common device host buffer for check_buf +- hmem_ze: allocate one cq and cl on init +- fi-mr-reg-xe: Add testing for dmabuf registration +- scripts: use yaml safe_load +- macos: Fix build error with clang +- multinode: Use FI_DELIVERY_COMPLETE for 'barrier' +- Handle partial read scenario for fi_xe_rdmabw test For cross node tests +- pytest/efa: add cuda memory marker +- pytest/efa: Skip some configuration for unexp msg test on neuron. +- runfabtests.py: ignore error due to no tests are collected. 
+- pytest/efa: extend unexpected msg test range +- pytest/shm: extend unexpected msg test range +- pytest: Allow running shm fabtests in parallel +- unexpected_msg.c: Allow running the test with FI_DELIVERY_COMPLETE +- runfabtests.sh: run fi_unexpected_msg with data validation +- pytest/shm: Extend test_unexpected_message +- unexpected_msg: Make tx/rx_size large enough +- pytest/shm: Extend shm's rma bw test +- Update shm.exclude + + v1.19.0, Fri Sep 1, 2023 ======================== diff --git a/README.md b/README.md index 1e8335c4a75..b8b9d3f8286 100644 --- a/README.md +++ b/README.md @@ -131,22 +131,6 @@ A more comprehensive test package is available via the fabtests package. ## Providers -### gni - -*** - -The `gni` provider runs on Cray XC (TM) systems utilizing the user-space -Generic Network Interface (`uGNI`), which provides low-level access to -the Aries interconnect. The Aries interconnect is designed for -low-latency one-sided messaging and also includes direct hardware -support for common atomic operations and optimized collectives. - -See the `fi_gni(7)` man page for more details. - -#### Dependencies - -- The `gni` provider requires `gcc` version 4.9 or higher. - ### opx *** @@ -283,6 +267,9 @@ transport and translates OFI calls to appropriate verbs API calls. It uses librdmacm for communication management and libibverbs for other control and data transfer operations. +The verbs provider can also be built on Windows using the Microsoft Network +Direct SPI for network transport. + See the `fi_verbs(7)` man page for more details. #### Dependencies @@ -293,29 +280,8 @@ See the `fi_verbs(7)` man page for more details. If the libraries and header files are not in default paths, specify them in CFLAGS, LDFLAGS and LD_LIBRARY_PATH environment variables. -### Network Direct - -*** - -The Network Direct provider enables applications using OFI to be run over -any verbs hardware (Infiniband, iWarp, and RoCE). 
It uses the Microsoft Network -Direct SPI for network transport and provides a translation of OFI calls to -appropriate Network Direct API calls. -The Network Direct providers enables OFI-based applications to utilize -zero-copy data transfers between applications, kernel-bypass I/O generation and -one-sided data transfer operations on Microsoft Windows OS. -An application can use OFI with the Network Direct provider enabled on -Windows OS to expose the capabilities of the networking devices if the hardware -vendors of the devices implemented the Network Direct service provider interface -(SPI) for their hardware. - -See the `fi_netdir(7)` man page for more details. - -#### Dependencies - -- The Network Direct provider requires Network Direct SPI. If you are compiling - libfabric from source and want to enable Network Direct support, you will also - need the matching header files for the Network Direct SPI. +- Windows build requires Network Direct SPI. If you are compiling libfabric from + source, you will also need the matching header files for the Network Direct SPI. If the libraries and header files are not in default paths, specify them in the configuration properties of the VS project. @@ -378,3 +344,26 @@ It is possible to compile and link libfabric with windows applications. - choose C/C++ > General and add `\include` to "Additional include Directories" - choose Linker > Input and add `\x64\\libfabric.lib` to "Additional Dependencies" - depending on what you are building you may also need to copy `libfabric.dll` into the target folder of your own project. + +### cxi + +The CXI provider enables libfabric on Cray's Slingshot network. Slingshot is +comprised of the Rosetta switch and Cassini NIC. Slingshot is an +Ethernet-compliant network. However, the provider takes advantage of proprietary +extensions to support HPC applications. + +The CXI provider supports reliable, connection-less endpoint semantics.
It +supports two-sided messaging interfaces with message matching offloaded by the +Cassini NIC. It also supports one-sided RMA and AMO interfaces, light-weight +counting events, triggered operations (via the deferred work API), and +fabric-accelerated small reductions. + +See the `fi_cxi(7)` man page for more details. + +#### Dependencies + +- The CXI Provider requires Cassini's optimized HPC protocol which is only + supported in combination with the Rosetta switch. + +- The provider uses the libCXI library for control operations and a set of + Cassini-specific header files to enable direct hardware access in the data path. diff --git a/configure.ac b/configure.ac index 4119bd1f9fc..7cad98f1e17 100644 --- a/configure.ac +++ b/configure.ac @@ -9,7 +9,7 @@ dnl dnl Process this file with autoconf to produce a configure script. AC_PREREQ([2.60]) -AC_INIT([libfabric], [1.21.0a1], [ofiwg@lists.openfabrics.org]) +AC_INIT([libfabric], [1.21.0rc1], [ofiwg@lists.openfabrics.org]) AC_CONFIG_SRCDIR([src/fabric.c]) AC_CONFIG_AUX_DIR(config) AC_CONFIG_MACRO_DIR(config) diff --git a/fabtests/configure.ac b/fabtests/configure.ac index c8426bb87ce..0568f862a63 100644 --- a/fabtests/configure.ac +++ b/fabtests/configure.ac @@ -5,7 +5,7 @@ dnl dnl Process this file with autoconf to produce a configure script. 
AC_PREREQ(2.57) -AC_INIT([fabtests], [1.21.0a1], [ofiwg@lists.openfabrics.org]) +AC_INIT([fabtests], [1.21.0rc1], [ofiwg@lists.openfabrics.org]) AC_CONFIG_AUX_DIR(config) AC_CONFIG_MACRO_DIR(config) AC_CONFIG_HEADERS(config.h) diff --git a/include/rdma/fabric.h b/include/rdma/fabric.h index 0e08233be45..b0c8252da9c 100644 --- a/include/rdma/fabric.h +++ b/include/rdma/fabric.h @@ -84,7 +84,7 @@ extern "C" { #endif #define FI_MAJOR_VERSION 1 -#define FI_MINOR_VERSION 20 +#define FI_MINOR_VERSION 21 #define FI_REVISION_VERSION 0 enum { diff --git a/include/windows/config.h b/include/windows/config.h index 70f939d8926..80ee5cb3c08 100644 --- a/include/windows/config.h +++ b/include/windows/config.h @@ -256,7 +256,7 @@ #define PACKAGE_TARNAME PACKAGE /* Define to the version of this package. */ -#define PACKAGE_VERSION "1.21.0a1" +#define PACKAGE_VERSION "1.21.0rc1" /* Define to the full name and version of this package. */ #define PACKAGE_STRING PACKAGE_NAME " " PACKAGE_VERSION diff --git a/man/fi_mlx.7.md b/man/fi_mlx.7.md deleted file mode 100644 index 0c382356f88..00000000000 --- a/man/fi_mlx.7.md +++ /dev/null @@ -1,20 +0,0 @@ ---- -layout: page -title: fi_mlx(7) -tagline: Libfabric Programmer's Manual ---- -{% include JB/setup %} - -# NAME - -fi_mlx \- The MLX Fabric Provider - -# OVERVIEW - -The mlx provider was deprecated and removed in libfabric 1.9 -due to a lack of a maintainer. - -# SEE ALSO - -[`fabric`(7)](fabric.7.html), -[`fi_provider`(7)](fi_provider.7.html), diff --git a/man/fi_netdir.7.md b/man/fi_netdir.7.md deleted file mode 100644 index dccf4c72ec3..00000000000 --- a/man/fi_netdir.7.md +++ /dev/null @@ -1,116 +0,0 @@ ---- -layout: page -title: fi_netdir(7) -tagline: Libfabric Programmer's Manual ---- -{% include JB/setup %} - -# NAME - -fi_netdir \- The Network Direct Fabric Provider - -# OVERVIEW - -The Network Direct provider enables applications using OFI to be run over -any verbs hardware (Infiniband, iWarp and etc). 
It uses the Microsoft Network -Direct SPI for network transport and provides a translation of OFI calls to -appropriate Network Direct API calls. -The Network Direct providers allows to OFI-based applications utilize -zero-copy data transfers between applications, kernel-bypass I/O generation and -one-sided data transfer operations on Microsoft Windows OS. -An application is able to use OFI with Network Direct provider enabled on -Windows OS to expose the capabilities of the networking devices if the hardware -vendors of the devices implemented the Network Direct service provider interface -(SPI) for their hardware. - -# SUPPORTED FEATURES - -The Network Direct provider support the following features defined for the -libfabric API: - -*Endpoint types* -: The provider support the FI_EP_MSG endpoint types. - -*Memory registration modes* -: The provider implements the *FI_MR_BASIC* memory registration mode. - -*Data transfer operations* -: The following data transfer interfaces are supported for the following - endpoint types: *FI_MSG*, *FI_RMA*. See DATA TRANSFER OPERATIONS below - for more details. - -*Modes* -: The Network Direct provider requires applications to support - the following modes: - * FI_LOCAL_MR for all applications. - -*Addressing Formats* -: Supported addressing formats include FI_SOCKADDR, FI_SOCKADDR_IN, FI_SOCKADDR_IN6 - -*Progress* -: The Network Direct provider supports FI_PROGRESS_AUTO: Asynchronous operations - make forward progress automatically. - -*Operation flags* -: The provider supports FI_INJECT, FI_COMPLETION, FI_TRANSMIT_COMPLETE, - FI_INJECT_COMPLETE, FI_DELIVERY_COMPLETE, FI_SELECTIVE_COMPLETION - -*Completion ordering* -: RX/TX contexts: FI_ORDER_STRICT - -*Other supported features* -: Multiple input/output vector (IOV) is supported for FI_RMA read/write and - FI_MSG receive/transmit operations. - -# LIMITATIONS - -*Memory Regions* -: Only FI_MR_BASIC mode is supported. 
Adding regions via s/g list is - supported only up to a s/g list size of 1. No support for binding memory - regions to a counter. - -*Wait objects* -: Wait object and wait sets are not supported. - -*Resource Management* -: Application has to make sure CQs are not overrun as this cannot be detected - by the provider. - -*Unsupported Endpoint types* -: FI_EP_DGRAM, FI_EP_RDM - -*Other unsupported features* -: Scalable endpoints, FABRIC_DIRECT - -*Unsupported features specific to MSG endpoints* -: FI_SOURCE, FI_TAGGED, FI_CLAIM, fi_ep_alias, shared TX context, operations. - -# RUNTIME PARAMETERS - -The Network Direct provider checks for the following environment variables. - -### Variables specific to RDM endpoints - -*FI_NETDIR_INLINETHR* -: The size of the (default: 8 Kbyte): - * Transmitted data that can be inlined - * Preposted data for the unexpected receive queue - -*FI_NETDIR_PREPOSTCNT* -: The number of pre-registered buffers between the endpoints that are not require - internal ACK messages, must be a power of 2 (default: 8). - -*FI_NETDIR_PREPOSTBUFCNT* -: The number of preposted arrays of buffers, must be a power of 2 (default: 1). - -### Environment variables notes -The fi_info utility would give the up-to-date information on environment variables: -fi_info -p netdir -e - -# SEE ALSO - -[`fabric`(7)](fabric.7.html), -[`fi_open_ops`(3)](fi_open_ops.3.html), -[`fi_provider`(7)](fi_provider.7.html), -[`fi_getinfo`(3)](fi_getinfo.3.html) -[`fi_atomic`(3)](fi_atomic.3.html) diff --git a/man/fi_provider.7.md b/man/fi_provider.7.md index 46e20cbf431..ba820906a12 100644 --- a/man/fi_provider.7.md +++ b/man/fi_provider.7.md @@ -67,6 +67,10 @@ The following core providers are built into libfabric by default, assuming all build pre-requisites are met. That is, necessary libraries are installed, operating system support is available, etc. This list is not exhaustive. +*CXI* +: Provider for Cray's Slingshot network. 
See + [`fi_cxi`(7)](fi_cxi.7.html) for more information. + *EFA* : A provider for the [Amazon EC2 Elastic Fabric Adapter (EFA)](https://aws.amazon.com/hpc/efa/), a custom-built OS bypass diff --git a/man/man7/fi_mlx.7 b/man/man7/fi_mlx.7 deleted file mode 100644 index b87ae42baec..00000000000 --- a/man/man7/fi_mlx.7 +++ /dev/null @@ -1,16 +0,0 @@ -.\" Automatically generated by Pandoc 2.9.2.1 -.\" -.TH "fi_mlx" "7" "2022\-12\-09" "Libfabric Programmer\[cq]s Manual" "#VERSION#" -.hy -.SH NAME -.PP -fi_mlx - The MLX Fabric Provider -.SH OVERVIEW -.PP -The mlx provider was deprecated and removed in libfabric 1.9 due to a -lack of a maintainer. -.SH SEE ALSO -.PP -\f[C]fabric\f[R](7), \f[C]fi_provider\f[R](7), -.SH AUTHORS -OpenFabrics. diff --git a/man/man7/fi_netdir.7 b/man/man7/fi_netdir.7 deleted file mode 100644 index d442436c2af..00000000000 --- a/man/man7/fi_netdir.7 +++ /dev/null @@ -1,112 +0,0 @@ -.\" Automatically generated by Pandoc 2.9.2.1 -.\" -.TH "fi_netdir" "7" "2022\-12\-09" "Libfabric Programmer\[cq]s Manual" "#VERSION#" -.hy -.SH NAME -.PP -fi_netdir - The Network Direct Fabric Provider -.SH OVERVIEW -.PP -The Network Direct provider enables applications using OFI to be run -over any verbs hardware (Infiniband, iWarp and etc). -It uses the Microsoft Network Direct SPI for network transport and -provides a translation of OFI calls to appropriate Network Direct API -calls. -The Network Direct providers allows to OFI-based applications utilize -zero-copy data transfers between applications, kernel-bypass I/O -generation and one-sided data transfer operations on Microsoft Windows -OS. -An application is able to use OFI with Network Direct provider enabled -on Windows OS to expose the capabilities of the networking devices if -the hardware vendors of the devices implemented the Network Direct -service provider interface (SPI) for their hardware. 
-.SH SUPPORTED FEATURES -.PP -The Network Direct provider support the following features defined for -the libfabric API: -.TP -\f[I]Endpoint types\f[R] -The provider support the FI_EP_MSG endpoint types. -.TP -\f[I]Memory registration modes\f[R] -The provider implements the \f[I]FI_MR_BASIC\f[R] memory registration -mode. -.TP -\f[I]Data transfer operations\f[R] -The following data transfer interfaces are supported for the following -endpoint types: \f[I]FI_MSG\f[R], \f[I]FI_RMA\f[R]. -See DATA TRANSFER OPERATIONS below for more details. -.TP -\f[I]Modes\f[R] -The Network Direct provider requires applications to support the -following modes: * FI_LOCAL_MR for all applications. -.TP -\f[I]Addressing Formats\f[R] -Supported addressing formats include FI_SOCKADDR, FI_SOCKADDR_IN, -FI_SOCKADDR_IN6 -.TP -\f[I]Progress\f[R] -The Network Direct provider supports FI_PROGRESS_AUTO: Asynchronous -operations make forward progress automatically. -.TP -\f[I]Operation flags\f[R] -The provider supports FI_INJECT, FI_COMPLETION, FI_TRANSMIT_COMPLETE, -FI_INJECT_COMPLETE, FI_DELIVERY_COMPLETE, FI_SELECTIVE_COMPLETION -.TP -\f[I]Completion ordering\f[R] -RX/TX contexts: FI_ORDER_STRICT -.TP -\f[I]Other supported features\f[R] -Multiple input/output vector (IOV) is supported for FI_RMA read/write -and FI_MSG receive/transmit operations. -.SH LIMITATIONS -.TP -\f[I]Memory Regions\f[R] -Only FI_MR_BASIC mode is supported. -Adding regions via s/g list is supported only up to a s/g list size of -1. -No support for binding memory regions to a counter. -.TP -\f[I]Wait objects\f[R] -Wait object and wait sets are not supported. -.TP -\f[I]Resource Management\f[R] -Application has to make sure CQs are not overrun as this cannot be -detected by the provider. 
-.TP -\f[I]Unsupported Endpoint types\f[R] -FI_EP_DGRAM, FI_EP_RDM -.TP -\f[I]Other unsupported features\f[R] -Scalable endpoints, FABRIC_DIRECT -.TP -\f[I]Unsupported features specific to MSG endpoints\f[R] -FI_SOURCE, FI_TAGGED, FI_CLAIM, fi_ep_alias, shared TX context, -operations. -.SH RUNTIME PARAMETERS -.PP -The Network Direct provider checks for the following environment -variables. -.SS Variables specific to RDM endpoints -.TP -\f[I]FI_NETDIR_INLINETHR\f[R] -The size of the (default: 8 Kbyte): * Transmitted data that can be -inlined * Preposted data for the unexpected receive queue -.TP -\f[I]FI_NETDIR_PREPOSTCNT\f[R] -The number of pre-registered buffers between the endpoints that are not -require internal ACK messages, must be a power of 2 (default: 8). -.TP -\f[I]FI_NETDIR_PREPOSTBUFCNT\f[R] -The number of preposted arrays of buffers, must be a power of 2 -(default: 1). -.SS Environment variables notes -.PP -The fi_info utility would give the up-to-date information on environment -variables: fi_info -p netdir -e -.SH SEE ALSO -.PP -\f[C]fabric\f[R](7), \f[C]fi_open_ops\f[R](3), \f[C]fi_provider\f[R](7), -\f[C]fi_getinfo\f[R](3) \f[C]fi_atomic\f[R](3) -.SH AUTHORS -OpenFabrics.