forked from apache/systemds
[SYSTEMDS-3606] Performance shuffle-based spark quaternary operations
This patch significantly improves the performance of shuffle-based Spark quaternary operations, where more than one input is an RDD (too large to broadcast). Instead of replicating the factor blocks, we now use custom join keys, enabling Spark to perform more efficient 1:M joins. With appropriate function abstractions, the implementation also became simpler and thus easier to maintain.

On the scenario mentioned in the JIRA task, the original implementation did not finish any task of the first shuffle phase after more than 9000s, while with the new implementation the entire script (with two shuffle-based quaternary operators) finishes in 1276s. Here are the stats:

SystemDS Statistics:
Total elapsed time: 1276.917 sec.
Total compilation time: 2.338 sec.
Total execution time: 1274.578 sec.
Number of compiled Spark inst: 4.
Number of executed Spark inst: 4.
Cache hits (Mem/Li/WB/FS/HDFS): 13/2/0/1/0.
Cache writes (Li/WB/FS/HDFS): 4/6/4/1.
Cache times (ACQr/m, RLS, EXP): 1209.517/0.001/10.926/8.589 sec.
HOP DAGs recompiled (PRED, SB): 0/1.
HOP DAGs recompile time: 0.006 sec.
Functions recompiled: 1.
Functions recompile time: 0.011 sec.
Spark ctx create time (lazy): 19.302 sec.
Spark trans counts (par,bc,col): 0/3/1.
Spark trans times (par,bc,col): 0.000/13.671/644.719 secs.
Spark async. count (pf,bc,op): 0/0/0.
Total JIT compile time: 73.677 sec.
Total JVM GC count: 188.
Total JVM GC time: 23.182 sec.
Heavy hitter instructions:
  #  Instruction    Time(s)  Count
  1  m_pnmf         714.304      1
  2  r'             653.012      5
  3  uak+           560.027      2
  4  sp_redwdivmm    42.446      2
  5  rand             9.414      4
  6  *                3.544      1
  7  /                3.491      1
  8  uack+            3.466      1
  9  uark+            2.146      1
 10  rmvar            0.246     15
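The difference between replicating factor blocks and joining on a custom key can be illustrated with a minimal sketch in plain Python. This is not the SystemDS or Spark code: the function names, the dictionary-based "blocks", and the copy counter are all hypothetical, and the count only approximates how many factor-block copies would have to be shuffled under each strategy.

```python
from collections import defaultdict


def replicate_join(x_blocks, u_blocks, num_col_blocks):
    """Replication style: copy each factor block once per column-block index,
    so that it can be joined 1:1 with the (row, col)-keyed matrix blocks."""
    replicated = {
        (i, j): u
        for i, u in u_blocks.items()
        for j in range(num_col_blocks)
    }
    joined = {key: (x, replicated[key]) for key, x in x_blocks.items()}
    return joined, len(replicated)  # number of factor-block copies created


def keyed_join(x_blocks, u_blocks):
    """Custom-key style: re-key the matrix blocks by row-block index only,
    so that each factor block matches many matrix blocks (a 1:M join)."""
    by_row = defaultdict(list)
    for (i, j), x in x_blocks.items():
        by_row[i].append((j, x))
    joined = {
        (i, j): (x, u_blocks[i])
        for i, entries in by_row.items()
        for j, x in entries
    }
    return joined, len(u_blocks)  # each factor block is used exactly once


# Hypothetical 2 x 3 grid of matrix blocks and two row-factor blocks.
x_blocks = {(i, j): f"X[{i},{j}]" for i in range(2) for j in range(3)}
u_blocks = {0: "U[0]", 1: "U[1]"}

old_result, old_copies = replicate_join(x_blocks, u_blocks, num_col_blocks=3)
new_result, new_copies = keyed_join(x_blocks, u_blocks)
```

Both functions pair every matrix block with the same factor block, but in this sketch the replication variant creates one factor copy per column block, while the keyed variant moves each factor block once. For matrices with many column blocks, that gap is a plausible reason the replicated shuffle becomes so expensive.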
Showing 1 changed file with 59 additions and 115 deletions.