Goal: analyze the performance and resource usage of multidimensional array accesses under different partitioning strategies.
Questions:
For 2D arrays with various partitioning configurations and no unrolling, how do the resource count and runtime change?
Run the Gemm ncubed kernel with 2D memories partitioned [1, 2, 4] * [1, 2, 4]. 9 configurations.
Run the Gemm ncubed kernel with 1D memories partitioned [1, 4, 8, 16]. Only the 16 case needs to be run; reuse the numbers from the previous experiment for the others. 1 configuration.
See if there is a difference between the (1, 2) and (2, 1) cases for the 2D array.
See how the (1, 2) and (2, 1) cases compare with the 2 case for the 1D memory.
For 2D arrays, does unrolling the accessor for the partitioned dimension help more than unrolling the other one?
Run 2D Gemm with memories partitioned [1, 2, 4] * [1, 2, 4] and loop unrolling factors [1, 2, 4] * [1, 2, 4]. Ignore configurations where both memory dimensions are partitioned by the same factor, where both unrolling factors are the same, or where part1 * part2 != unroll1 * unroll2. 12 configurations.
Compare performance to the 1D array with partition factors [1, 2, 4, 16] and unrolling factors [1, 2, 4, 16] under the same restriction (unroll factor = partition factor). Use numbers from the previous experiment. 0 configurations.
See if partitioning the right dimension (the one being unrolled) improves performance.
Compare the configurations (1, 2, 1, 2) and (1, 2, 2, 1) against (2, 2), and see whether the former are better or worse.
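
As a sanity check on the configuration counts above, here is a minimal Python sketch that enumerates both sweeps. The function names and the tuple order `(part1, part2, unroll1, unroll2)` are assumptions made for illustration; they are not taken from any existing harness.

```python
from itertools import product

# Partitioning / unrolling factors used in both 2D experiments.
FACTORS = (1, 2, 4)


def experiment1_configs(factors=FACTORS):
    """2D partitioning sweep with no unrolling: every (part1, part2) pair."""
    return list(product(factors, repeat=2))


def experiment2_configs(factors=FACTORS):
    """2D partitioning + unrolling sweep.

    Assumed tuple order: (part1, part2, unroll1, unroll2).
    A configuration is skipped when:
      - both dimensions use the same partition factor, or
      - both loops use the same unroll factor, or
      - part1 * part2 != unroll1 * unroll2.
    """
    configs = []
    for part1, part2, unroll1, unroll2 in product(factors, repeat=4):
        if part1 == part2:
            continue
        if unroll1 == unroll2:
            continue
        if part1 * part2 != unroll1 * unroll2:
            continue
        configs.append((part1, part2, unroll1, unroll2))
    return configs


if __name__ == "__main__":
    print(len(experiment1_configs()), "configurations for experiment 1")
    print(len(experiment2_configs()), "configurations for experiment 2")
    for cfg in experiment2_configs():
        print(cfg)
```

Under these filters the second sweep keeps only the products 2, 4, and 8, each with two partition orderings and two unroll orderings, which is where the 12 comes from. Both (1, 2, 1, 2) and (1, 2, 2, 1) are in the resulting set.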