pyDNTNK: Python Distributed Non Negative Tensor Networks
pyDNTNK is a software package for applying non-negative hierarchical tensor decompositions, such as Tensor Train and Hierarchical Tucker decompositions, to large datasets in a distributed fashion.
It is built on top of pyDNMFk. Tensor Train (TT) and Hierarchical Tucker (HT) are state-of-the-art tensor networks introduced for the factorization of high-dimensional tensors. These methods transform the initial high-dimensional tensor into a network of low-dimensional tensors whose storage grows only linearly with the number of dimensions. Many real-world data, such as density, temperature, population, and probability, are non-negative, and algorithms that preserve non-negativity are preferred because their factors are easier to interpret. Here, we introduce distributed non-negative hierarchical tensor decomposition tools and demonstrate their scalability and compression on synthetic and real-world big datasets.
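To make the linear-storage claim concrete, the following is a minimal single-node sketch of a Tensor Train decomposition in plain NumPy. It is not the pyDNTNK API: the function names `tt_svd` and `tt_reconstruct` and the `max_rank` parameter are hypothetical, and the sketch uses ordinary truncated SVD, so the resulting cores are not guaranteed to be non-negative.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Plain TT-SVD: sequential truncated SVDs of reshaped unfoldings.

    Hypothetical helper for illustration only; this is the standard
    (not the non-negative, not the distributed) variant.
    """
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor
    for n in dims[:-1]:
        # Merge the incoming rank with the current mode into the row index.
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        # Each core is a 3-way tensor of shape (rank_in, mode_size, rank_out).
        cores.append(u[:, :new_rank].reshape(rank, n, new_rank))
        # Carry the remainder forward to the next mode.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the chain of cores back into a full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    # Drop the boundary ranks, which are always 1.
    return out.squeeze(axis=(0, -1))


# Build a 4-way non-negative tensor whose TT ranks are at most 2.
rng = np.random.default_rng(0)
shapes = [(1, 4, 2), (2, 5, 2), (2, 6, 2), (2, 7, 1)]
full = tt_reconstruct([rng.random(shape) for shape in shapes])

cores = tt_svd(full, max_rank=2)
stored = sum(core.size for core in cores)
print(f"full tensor entries: {full.size}, TT entries: {stored}")
```

For a tensor with d modes of size n and TT ranks bounded by r, the cores hold at most d * n * r^2 entries, whereas the full tensor holds n^d entries.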
Features:
- Utilization of mpi4py for distributed operation.
- Distributed Reshaping and Unfolding operations with Zarr and Dask.
- Distributed hierarchical tensor decompositions such as Tensor Train and Hierarchical Tucker.
- Ability to perform both standard SVD-based and NMF-based decompositions.
- Scalability to tensors of very high dimensions.
- Automated rank estimation with SVD for each stage of tensor decomposition.
- Distributed pruning of zero rows and zero columns of the data.
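The pruning and rank-estimation features listed above can be sketched on a single node with NumPy. This is an illustration under stated assumptions, not pyDNTNK's implementation: the helper names `prune_zeros` and `estimate_rank` are hypothetical, and the rank criterion used here (the smallest rank whose discarded singular values fall below a relative error threshold `rel_error`) is an assumed choice, since the exact rule the package applies is not described here.

```python
import numpy as np


def prune_zeros(mat):
    """Drop all-zero rows and columns; return the masks so they can be restored.

    Hypothetical helper, single-node only.
    """
    row_keep = np.any(mat != 0, axis=1)
    col_keep = np.any(mat != 0, axis=0)
    return mat[row_keep][:, col_keep], row_keep, col_keep


def estimate_rank(mat, rel_error=1e-8):
    """Smallest rank k whose truncation error is at most rel_error.

    Assumed criterion: relative Frobenius norm of the discarded
    singular values.
    """
    s = np.linalg.svd(mat, compute_uv=False)
    energy = s ** 2
    total = energy.sum()
    if total == 0:
        return 0
    # tail[k] is the relative error left after keeping k singular values.
    tail = np.concatenate([np.cumsum(energy[::-1])[::-1], [0.0]])
    tail = np.sqrt(tail / total)
    # argmax on a boolean array returns the first index that is True.
    return int(np.argmax(tail <= rel_error))


# A non-negative rank-3 matrix with one zero row and one zero column.
rng = np.random.default_rng(0)
data = rng.random((6, 3)) @ rng.random((3, 5))
data[2, :] = 0.0
data[:, 4] = 0.0

pruned, row_keep, col_keep = prune_zeros(data)
print("pruned shape:", pruned.shape)
print("estimated rank:", estimate_rank(pruned))
```

Returning the boolean masks alongside the pruned matrix lets the zero rows and columns be reinserted into the factors after decomposition.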