LibGPM contains the source of our CUDA library which provides a user-friendly interface for GPU-accelerated recoverable applications.
- (host) void *gpm_map_file(const char *path, size_t &len, bool create)
- If create is true, creates memory-mapped file at path, with of len bytes size.
- If create is false, opens memory-mapped file at path, storing size of file in len. Input len should be 0.
- Returns pointer to persistent region in GPU of len size
- (host) cudaError_t gpm_unmap(void *addr, size_t len)
- Unmaps memory-mapped variable with starting pointer addr and size len, persisting all data.
- Returns cudaSuccess on success or error code otherwise
- (host) void gpm_persist_begin(void)
- Turns DDIO off, allowing in-kernel persistence for further GPU kernels accessing PM.
- (host) void gpm_persist_end(void)
- Turns DDIO on, in-kernel persistence is no longer guaranteed.
- (device) void gpm_persist()
- Guarantees all prior writes to PM by the calling thread are persisted.
- Implemented using
__threadfence_system()
.
- (device, host) cudaError_t gpm_memcpy_nodrain(void *gpmdest, const void *src, size_t len, cudaMemcpyKind kind)
- Memcpys data from src to gpmdest of size len, without persisting the region. The variable kind indicates the type of memcpy (cudaMemcpyDeviceToDevice, cudaMemcpyHostToDevice, etc.).
- (device, host) cudaError_t gpm_memset_nodrain(void *gpmdest, unsigned char value, size_t len)
- Assigns value byte-by-byte to the region starting from pointer gpmdest of size len, without persisting the region.
- (device, host) cudaError_t gpm_memcpy(void *pmemdest, const void *src, size_t len, cudaMemcpyKind kind)
- Memcpys data from src to gpmdest of size len, guaranteeing persistance on completion. The variable kind indicates the type of memcpy (cudaMemcpyDeviceToDevice, cudaMemcpyHostToDevice, etc.).
- Failure-atomicity is not guaranteed.
- (device, host) cudaError_t pmem_memset(void *pmemdest, int c, size_t len)
- Assigns value byte-by-byte to the region starting from pointer gpmdest of size len, guaranteeing persistance on completion.
- Failure-atomicity is not guaranteed.
- TODO
- TODO
This folder contains 6 files, which we explain below:
- libgpm.cuh - Contains the main implementation details of GPM, which allow for allocation/deallocation on PMEM.
- libgpmlog.cuh - Contains the relevant implementation of logging (HCL and conventional) in GPM.
- libgpmcp.cuh - Contains the relevant implementation of checkpointing in GPM.
- bandwidth_analysis.cuh - Contains helper definitions used for bandwidth measurement.
- change-ddio.h - Contains functions to turn DDIO off and on. GPM wrappers for these functions are found in libgpm. [1]
- gpm-helper.cuh - Contains common helper functions needed across other files.
References: [1] Characterizing and Optimizing Remote Persistent Memory with RDMA and NVM. Authors: Xingda Wei and Xiating Xie and Rong Chen and Haibo Chen and Binyu Zang. Published in: Usenix ATC'2021