Heat 1.5 Release Notes
- Overview
- Highlights
- Performance Improvements
- Sparse
- Signal Processing
- RNG
- Statistics
- Manipulations
- I/O
- Machine Learning
- Deep Learning
- Other Updates
- Contributors
Overview
With Heat 1.5 we release the first set of features developed within the ESAPCA project funded by the European Space Agency (ESA).
The main focus of this release is on distributed linear algebra operations, such as tall-skinny SVD, batch matrix multiplication, and a triangular solver. We also introduce vectorization via vmap across MPI processes, and batch-parallel random number generation as the default for distributed operations.
This release also includes a new class for distributed Compressed Sparse Column matrices, paving the way for future implementation of distributed sparse matrix multiplication.
On the performance side, our new array redistribution via MPI Custom Datatypes provides significant speed-up in operations that require it, such as FFTs (see Dalcin et al., 2018).
We are grateful to our community of users, students, open-source contributors, the European Space Agency and the Helmholtz Association for their support and feedback.
Highlights
- [ESAPCA] Distributed tall-skinny SVD: ht.linalg.svd (by @mrfh92)
- Distributed batch matrix multiplication: ht.linalg.matmul (by @FOsterfeld)
- Distributed solver for triangular systems: ht.linalg.solve_triangular (by @FOsterfeld)
- Vectorization across MPI processes: ht.vmap (by @mrfh92)
Other Changes
Performance Improvements
- #1493 Redistribution speed-up via MPI Custom Datatypes available by default in ht.resplit (by @JuanPedroGHM)
Sparse
- #1377 New class: Distributed Compressed Sparse Column Matrix ht.sparse.DCSC_matrix() (by @Mystic-Slice)
Signal Processing
- #1515 Support batch 1-d convolution in ht.signal.convolve (by @ClaudiaComito)
RNG
Statistics
- #1420 Support sketched percentile/median for large datasets with ht.percentile(sketched=True) (and ht.median) (by @mrfh92)
- #1510 Support multiple axes for distributed ht.percentile and ht.median (by @ClaudiaComito)
Manipulations
- #1419 Implement distributed unfold operation (by @FOsterfeld)
I/O
- #1602 Improve load balancing when loading .npy files from path (by @Reisii)
- #1551 Improve load balancing when loading .csv files from path (by @Reisii)
Machine Learning
- #1593 Improved batch-parallel clustering ht.cluster.BatchParallelKMeans and ht.cluster.BatchParallelKMedians (by @mrfh92)
Deep Learning
Other Updates
- #1618 Support mpi4py 4.x.x (by @JuanPedroGHM)
Contributors
@mrfh92, @FOsterfeld, @JuanPedroGHM, @Mystic-Slice, @ClaudiaComito, @Reisii, @mtar and @krajsek