OAP is a project that optimizes Spark by providing optimized implementations of packages for areas such as caching and shuffle. This version includes optimized implementations of SQL Index and Data Source Cache (supporting DRAM and PMem), RDD Cache PMem Extension, Shuffle Remote PMem Extension, and Remote Shuffle.
Please follow the link below for the guide to compiling and installing OAP on your system.
Please refer to the corresponding documents below for an introduction to each feature and instructions on how to use it.
- SQL Index and Data Source Cache
- RDD Cache PMem Extension
- Shuffle Remote PMem Extension
- Remote Shuffle
- Intel MLlib
- Native SQL Engine
Please follow the link below for the developer guide.