Add sandbox for heap based sparse linear algebra #3969
Conversation
I pushed a first naive implementation. This code can be used for testing:
The lazy evaluation allows for quite fast in-place operations. However, many other operations are, of course, expensive. @flenzen: Let me know what you think. Is this going in the right direction in any reasonable sense?

In invariant theory, one needs to compute kernels of huge, but very sparse, matrices over fields. I assume that this is in the scope of the "Gauss-like" things you are trying to improve here? If so, I would be very interested in trying out your approach!

Yes, people doing cohomology computations for clique complexes have had good experiences with this way of implementing sparse matrices for kernel computations. See, for example, https://doi.org/10.1016/j.jsc.2016.03.008 for an overview of this application and for run times in a persistent homology context.

Thanks, I will have a look at that!
```julia
# following subtree shall be multiplied
# with `cofactor` to get the actual values.
left::Union{HeapNode{T}, Nothing}
right::Union{HeapNode{T}, Nothing}
```
Just to say, storing a heap as an actual tree like this is extremely inefficient. A classic inline binary heap implemented as a `Vector` will beat this easily (my guess would be: by at least an order of magnitude in terms of performance, possibly more in terms of allocations). However, it won't be able to accommodate your `cofactor`.
Are you by chance trying to use a heap to speed up repeated pivot reduction (as is common in e.g. GB computations)? Then have you considered using a heap (storing the terms) plus a dictionary (storing the coefficients)? This technique is described, e.g., in section 4 of https://arxiv.org/abs/1206.6940. A rough sketch of the idea follows below.
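Not code from the paper, just a minimal sketch of the heap-plus-dictionary idea as I understand it (all names here are made up for illustration; `BigInt` stands in for an arbitrary coefficient type): the heap orders column indices, the dictionary holds the coefficients, and cancelled entries are discarded lazily.

```julia
using DataStructures  # provides BinaryMinHeap

# Merge c*row into the (heap, dict) pair. Coefficients are combined in the
# dictionary; the heap only ever receives indices that are new to the row.
function axpy_sketch!(heap::BinaryMinHeap{Int}, coeffs::Dict{Int,BigInt},
                      c::BigInt, row::Vector{Tuple{Int,BigInt}})
  for (j, a) in row
    if haskey(coeffs, j)
      coeffs[j] += c * a                       # no heap operation needed
      iszero(coeffs[j]) && delete!(coeffs, j)  # leaves a stale index in the heap
    else
      coeffs[j] = c * a
      push!(heap, j)
    end
  end
  return heap
end

# Lazy deletion: stale indices (cancelled coefficients) are only dropped
# once the leading index is requested.
function leading_index(heap::BinaryMinHeap{Int}, coeffs::Dict{Int,BigInt})
  while !isempty(heap)
    haskey(coeffs, first(heap)) && return first(heap)
    pop!(heap)
  end
  return nothing
end
```

The point is that repeatedly reducing against pivot rows costs heap operations only for genuinely new indices, while all arithmetic happens in the dictionary.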
Thank you for the idea. Can't we store a row index-cofactor pair in a vector, or can't pairs be stored contiguously? I'm not very experienced with memory layout in Julia.
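For what it's worth, the layout question has a plain-Julia answer (nothing PR-specific in this sketch): a `Vector` of an immutable struct whose fields are `isbits` is stored contiguously; with boxed entries like `BigInt` it degenerates into a vector of pointers.

```julia
# An immutable index-cofactor pair:
struct Entry{T}
  index::Int
  cofactor::T
end

# For isbits element types, the pairs are laid out inline, one after another:
@assert isbitstype(Entry{Int64})

# BigInt is a mutable, heap-allocated type, so a Vector{Entry{BigInt}}
# stores pointers to the pairs rather than the pairs themselves:
@assert !isbitstype(Entry{BigInt})

entries = Entry{Int64}[]            # contiguous storage
push!(entries, Entry(3, Int64(7)))
```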
I am happy to play, but a few questions:
- what operations are supported?
- should this not be done in Hecke, where the sparse stuff is, or in AA where it should be?
- how does it compare with Magma? In my experience that is the fastest sparse linear algebra available in general, and it does not use plain Gauss
- is this any good at finding kernel vectors?

I am in favour of speeding up linear algebra, as this is the bottleneck for cohomology... not over F_2 though.
> what operations are supported?

Probably the usual, and `pivot`, `pop!`, and friends on the rows? I think it might also be nice to have support for lazy evaluation of linear combinations (see the sketch after this reply).

> should this not be done in Hecke, where the sparse stuff is, or in AA where it should be?

If we get somewhere, then probably yes. We can move it later, right?

> is this any good at finding kernel vectors?

No idea. Actually, I don't even recall how kernels are computed over the integers.
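To make the lazy-evaluation point concrete, here is a minimal sketch (illustrative names, not this PR's types) of how a row-level cofactor, like the one in the heap nodes above, defers scalar multiplication until entries are actually read:

```julia
# A sparse row with a global cofactor: scaling is O(1); the actual
# multiplications happen only when entries are accessed.
mutable struct LazyRow{T}
  cofactor::T
  entries::Vector{Tuple{Int,T}}  # (column index, raw value)
end

scale!(v::LazyRow, c) = (v.cofactor *= c; v)  # constant time

function getentry(v::LazyRow, k::Int)
  j, a = v.entries[k]
  return j, v.cofactor * a  # multiplication deferred until here
end
```

With this, forming `v + c*w` only ever multiplies the entries that actually survive the merge.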
```julia
function add!(v::HeapSRow{T}, w::HeapSRow{T}) where {T}
  @assert base_ring(v) === base_ring(w)
  R = base_ring(v)
  isempty(v) && return w
```
This is not an in-place addition; the return value should, if at all possible, be in `v`.

Looking at the benchmark: this is not fair. It should use the `add!`-type functions for the old `SRow` as well.
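To illustrate the contract being asked for, a tiny self-contained toy (`ToyRow` is made up; it is not this PR's type): the result always lives in `v`, even in the empty case, so callers can rely on aliasing.

```julia
# Toy stand-in for a sparse row, only to show the in-place contract.
struct ToyRow
  coeffs::Dict{Int,BigInt}
end

Base.isempty(v::ToyRow) = isempty(v.coeffs)

function add!(v::ToyRow, w::ToyRow)
  if isempty(v)
    merge!(v.coeffs, w.coeffs)  # copy w's data into v instead of returning w
    return v
  end
  for (j, a) in w.coeffs
    s = get(v.coeffs, j, big(0)) + a
    iszero(s) ? delete!(v.coeffs, j) : (v.coeffs[j] = s)
  end
  return v  # the result always aliases v
end
```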
In this paper, they just work over F_2.
Completely agreed. Yet, the use of the old `SRow`'s in-place operations still needs to make it into the benchmark.
Thanks for all the contributions to the discussion! From my point of view I would try the following:
```julia
@testset "heap-dict vectors" begin
  k = 100000
  i = unique!([rand(1:10*k) for i in 1:k])
  v = [ZZ(rand(1:10))^10000 for i in 1:length(i)]
  a = Oscar.HeapDictSRow(ZZ, i, v)
  aa = sparse_row(a)
  i = unique!([rand(1:10*k) for i in 1:k])
  v = [ZZ(rand(1:10))^10000 for i in 1:length(i)]
  b = Oscar.HeapDictSRow(ZZ, i, v)
  bb = sparse_row(b)

  # For comparison of timings do
  # julia> c = copy(a); @time w = add!(c, b);
  #   0.042619 seconds (27.40 k allocations: 41.765 MiB, 17.17% gc time)
  #
  # julia> cc = copy(aa); @time cc += bb;
  #   0.226098 seconds (192.12 k allocations: 510.190 MiB, 5.40% gc time)
```
Looks like we are slowly getting somewhere. But this only works for `RingElem`s which are big in the sense that copying them around in memory is expensive. For small elements it seems that the original `SRow` is still winning. In that case the rehashing while merging the dictionaries is the expensive part.
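One standard mitigation for the rehashing cost (my suggestion, not something this PR currently does) is to reserve the table capacity once, before merging:

```julia
# Merge the coefficients of b into a with at most one rehash, up front.
function merge_coeffs!(a::Dict{Int,T}, b::Dict{Int,T}) where {T}
  sizehint!(a, length(a) + length(b))
  for (j, c) in b
    s = get(a, j, zero(T)) + c
    iszero(s) ? delete!(a, j) : (a[j] = s)
  end
  return a
end
```

For small elements this won't eliminate the per-entry hashing itself, but it avoids growing the table repeatedly during a merge.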
@joschmitt There are other implementations that use other finite fields with the same approach, e.g. https://link.springer.com/article/10.1007/s41468-021-00071-5. I have no idea how the heap based approach behaves in char zero. But I'd try that. Also, the cohomology computation where this approach is beneficial is quite specific (clique complexes), so it may not generalize. @HechtiDerLachs I'll look at your code soon and will play around with the underlying heap.
Codecov Report

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #3969      +/-   ##
==========================================
- Coverage   84.30%   84.30%   -0.01%
==========================================
  Files         596      599       +3
  Lines       81894    82781     +887
==========================================
+ Hits        69041    69788     +747
- Misses      12853    12993     +140
```
It is now my impression that the new data structure performs slightly better, but not by orders of magnitude. Still: I managed to get something almost as performant as stuff done by the experts.
The error in the 1.6-short-ubuntu CI job seems to be due to some Julia feature not being available in this version. But I hope that this has a trivial fix.
The advantage of the heap actually does not make too much sense: the sorting is only called on creation of a row (if asked for). Once the rows are there, addition just iterates through both, keeping the order...
I think this might be beneficial only over division rings. In this case, you can merge a multiple of one row into the other without touching the elements of the latter. If you repeatedly add to one row, it may get denser and denser, which makes not iterating over its entries appealing. For sorted lists, you have to iterate over both lists to produce a new, sorted one.
That's true. But it is my impression that even the in-place operations for addition of the classical `SRow`s do not close this gap. With the dictionary approach, merging seems to come at the cost of eventually doing a rehashing. When the ring elements are big (polynomials over big coefficient rings, say), copying them around is the expensive part, and the heap-dict approach pays off.
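To spell out the "without touching the elements of the latter" point, a small self-contained sketch (illustrative only, over the rationals): adding `c*w` just pushes scaled entries of `w`; nothing already in the heap is rewritten, and equal indices are combined lazily when the leading term is requested.

```julia
using DataStructures

const RowEntry = Tuple{Int,Rational{BigInt}}  # (column index, value)

# Add c*w into the heap v: O(length(w) * log n); entries of v are untouched.
function add_multiple!(v::BinaryMinHeap{RowEntry}, c::Rational{BigInt},
                       w::Vector{RowEntry})
  for (j, a) in w
    push!(v, (j, c * a))
  end
  return v
end

# Combine duplicate indices only when the leading term is needed.
function leading!(v::BinaryMinHeap{RowEntry})
  while !isempty(v)
    j, a = pop!(v)
    while !isempty(v) && first(v)[1] == j
      a += pop!(v)[2]  # same column: accumulate
    end
    iszero(a) || return (j, a)
  end
  return nothing
end
```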
That should be a great opportunity! Looking forward to this.
What is this about?
Over a coffee break at the ICMS, @flenzen and I discussed a bit about sparse linear algebra. He mentioned that in some applications in TDA (topological data analysis) there is a data type for sparse vectors based on heaps which has proven to have advantages in various settings where Gauss-like algorithms are applied to simplify complexes (see e.g. this paper). I became interested because of my implementation of `simplify` for complexes of modules: in the computation of derived pushforwards for coherent sheaves, the Gauss algorithm is the bottleneck at the moment. I suggested to give it a try together in OSCAR. This PR is our sandbox for collaboration.
What's the state here?
I couldn't hold myself back and started implementing a naive version of what I understood the architecture to be. As a test case I am using matrices over the integers and implemented a simple algorithm to determine an upper-triangular form (`upper_triangular_form!`). When running the tests it becomes clear that my naive implementation of heap-based data structures for sparse linear algebra clearly loses against the existing implementation via `SRow` and `SMat`.

However, I think this PR can still be useful, since it provides a blueprint for how to contribute a new experimental section to OSCAR, and @flenzen might also have ideas and try to play with and improve the code. In fact, he already suggested to me that realizing heaps via linked binary trees is probably a dumb idea. Seems like he was right. But hey: if you know better, let's try it out! We have this code for comparison for the moment to build benchmarks.
Update: We now have an implementation of heap-dict based sparse rows, as was pointed out by @fingolfin. I also made the `HeapSMat` type generic so that it can be used with arbitrary concrete types of sparse rows (see the sketch below). With these new data types I now get slightly better performance in many cases when compared with the classical `SMat` and `SRow`. It might still be the case that I'm not using enough in-place operations for the latter, though. Feel free to take a stand for the classical data types and improve their benchmark!
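A sketch of what I mean by making the matrix type generic (schematic, not the PR's literal definition): the row type is a parameter, so the classical rows and the new heap-based rows run through the same code paths.

```julia
# A matrix is just a vector of rows; algorithms written against this
# interface work for any row type supporting add!, scale!, pivots, etc.
struct GenericSMat{RowType}
  rows::Vector{RowType}
end

number_of_rows(A::GenericSMat) = length(A.rows)
Base.getindex(A::GenericSMat, i::Int) = A.rows[i]

# For instance, with the heap-dict rows from this PR:
# A = GenericSMat([Oscar.HeapDictSRow(ZZ, [1, 3], [ZZ(2), ZZ(5)])])
```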
Should I care?
If you're interested in, or an expert on, sparse linear algebra over rings, feel free to chime in. Comments are welcome! Otherwise, we will notify the relevant people once we feel we have produced something useful, or just close this PR.