-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse Kronecker sums #49
Comments
have you tried to use |
To get some feelings about which part to optimize julia> a = sprand(500, 500, 0.02);
julia> m = kron(a, IMatrix{500}());
julia> n = kron(IMatrix{500}(), a);
julia> @benchmark $m + $n
BenchmarkTools.Trial:
memory estimate: 78.57 MiB
allocs estimate: 8
--------------
minimum time: 43.870 ms (0.69% GC)
median time: 69.541 ms (7.15% GC)
mean time: 72.327 ms (10.12% GC)
maximum time: 160.216 ms (61.05% GC)
--------------
samples: 70
evals/sample: 1
julia> @benchmark kron($a, IMatrix{500}())
BenchmarkTools.Trial:
memory estimate: 40.24 MiB
allocs estimate: 8
--------------
minimum time: 7.091 ms (0.00% GC)
median time: 8.242 ms (0.00% GC)
mean time: 8.540 ms (2.21% GC)
maximum time: 14.998 ms (4.26% GC)
--------------
samples: 585
evals/sample: 1
julia> @benchmark kron(IMatrix{500}(), $a)
BenchmarkTools.Trial:
memory estimate: 59.40 MiB
allocs estimate: 10
--------------
minimum time: 7.096 ms (0.00% GC)
median time: 9.283 ms (4.30% GC)
mean time: 11.296 ms (6.95% GC)
maximum time: 25.206 ms (3.04% GC)
--------------
samples: 443
evals/sample: 1
julia> @benchmark copy($res)
BenchmarkTools.Trial:
memory estimate: 77.62 MiB
allocs estimate: 8
--------------
minimum time: 37.340 ms (0.00% GC)
median time: 55.627 ms (0.00% GC)
mean time: 58.335 ms (7.91% GC)
maximum time: 72.644 ms (20.63% GC)
--------------
samples: 86
evals/sample: 1 So, the question is whether it is possible that |
In Kronecker.jl, the (out-of-place) equivalent of """
collect(K::AbstractKroneckerSum)
Collects a lazy instance of the `AbstractKroneckerSum` type into a full,
native matrix. Returns the result as a sparse matrix.
"""
function Base.collect(K::AbstractKroneckerSum)
A, B = getmatrices(sparse(K))
IA, IB = oneunit(A), oneunit(B)
return kron(A, IB) + kron(IA, B)
end
"""
SparseArrays.sparse(K::KroneckerSum)
Creates a lazy instance of a `KroneckerSum` type with sparse
matrices. If the matrices are already sparse, `K` is returned.
"""
function SparseArrays.sparse(K::KroneckerSum{T, TA, TB}) where {T, TA <: AbstractSparseMatrix, TB <: AbstractSparseMatrix}
return K
end
function SparseArrays.sparse(K::KroneckerSum{T, TA, TB}) where {T, TA <: AbstractMatrix, TB <: AbstractMatrix}
return sparse(K.A) ⊕ sparse(K.B)
end For my usecase, I would preallocate the output matrix julia> A, B = (sprand(32, 32, 0.1) for _ in 1:2);
julia> IA, IB = (IMatrix(X) for X in (A,B));
julia> C = kron(A, IB) + kron(IA, B); C.nzval .= 0;
... When the sparsity structure is fixed, a struct BatchSparseMatrixCSC{Tv,Ti<:Integer} <: AbstractSparseArray{Tv,Ti,3} end |
julia> A, B = (sprand(100, 100, 0.1) for _ in 1:2);
julia> IA, IB = (IMatrix(X) for X in (A,B));
julia> C = kron(A, IB) + kron(IA, B); C.nzval .= 0;
julia> @benchmark kronsum!($C, $A, $B)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 65.690 ms (0.00% GC)
median time: 66.072 ms (0.00% GC)
mean time: 66.199 ms (0.00% GC)
maximum time: 67.885 ms (0.00% GC)
--------------
samples: 76
evals/sample: 1
julia> @benchmark kron($A, $IB) + kron($IA, $B)
BenchmarkTools.Trial:
memory estimate: 6.88 MiB
allocs estimate: 26
--------------
minimum time: 1.381 ms (0.00% GC)
median time: 1.427 ms (0.00% GC)
mean time: 1.479 ms (2.43% GC)
maximum time: 2.693 ms (31.06% GC)
--------------
samples: 3364
evals/sample: 1
julia> C == kron(A,IB) + kron(IA, B)
true
julia> unA, unB = (oneunit(X) for X in (A,B));
julia> @benchmark kron($A, $unB) + kron($unA, $B)
BenchmarkTools.Trial:
memory estimate: 6.15 MiB
allocs estimate: 24
--------------
minimum time: 1.922 ms (0.00% GC)
median time: 1.968 ms (0.00% GC)
mean time: 2.012 ms (1.59% GC)
maximum time: 2.960 ms (26.67% GC)
--------------
samples: 2475
evals/sample: 1 |
If the matrices are dense: using julia> A, B = (rand(40, 40) for _ in 1:2);
julia> IA, IB = (IMatrix(X) for X in (A,B));
julia> C = kron(A, IB) + kron(IA, B); C.nzval .= 0;
julia> @benchmark kronsum!($C, $A, $B)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 3.664 ms (0.00% GC)
median time: 3.728 ms (0.00% GC)
mean time: 3.746 ms (0.00% GC)
maximum time: 5.585 ms (0.00% GC)
--------------
samples: 1334
evals/sample: 1
julia> @benchmark kron($A, $IB) + kron($IA, $B)
BenchmarkTools.Trial:
memory estimate: 3.94 MiB
allocs estimate: 21
--------------
minimum time: 710.607 μs (0.00% GC)
median time: 733.885 μs (0.00% GC)
mean time: 768.422 μs (2.74% GC)
maximum time: 1.837 ms (44.32% GC)
--------------
samples: 6452
evals/sample: 1
julia> unA, unB = (oneunit(X) for X in (A,B));
julia> @benchmark kron($A, $unB) + kron($unA, $B)
BenchmarkTools.Trial:
memory estimate: 58.59 MiB
allocs estimate: 6
--------------
minimum time: 9.340 ms (0.00% GC)
median time: 9.791 ms (0.00% GC)
mean time: 9.924 ms (2.67% GC)
maximum time: 10.843 ms (7.24% GC)
--------------
samples: 504
evals/sample: 1 |
NOTE:
no, |
Right now, I'm working on in-place, CPU versions of
kronsum(A,B) = kron(A, oneunit(B)) + kron(oneunit(A), B)
So far, this works for dense arrays:
I'm having trouble doing something similar with
SparseMatrixCSC
, since parts ofA
andB
only overlap on the diagonal ofC
.Using Diagonal Format (from DIA.jl) could work, but I think converting to CSC would take too much time (especially if either
A
orB
are sparse) and there's the risk of storing too many zeroes for the diagonal storageVector
s.Currently, I'm inspired by the sparse
kron!
in JuliaLang/julia#31069. It works for eitherkron(A,oneunit(B))
orkron(oneunit(A), B)
, but not the sum.A
, the other forB
)C
whereC isa SparseMatrixCSC
?The text was updated successfully, but these errors were encountered: