Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BlockSparseArrays] Simplify and test adjoint(::BlockSparseMatrix) #1573

Closed
wants to merge 1 commit into from

Conversation

lkdvos
Copy link

@lkdvos lkdvos commented Nov 7, 2024

This PR simplifies both adjoint(::BlockSparseMatrix) and transpose(::BlockSparseMatrix).

This was discussed in #1572, but was already previously mentioned by @ogauthe as well:
#1336 (comment)

Regarding your comment there about the vector types, keep in mind that I (for now) only intercepted the matrix case. I can investigate if we can make this work for Vector as well, keeping in mind that the Adjoint wrapper information is still there, it is just not the outer wrapper.

@mtfishman
Copy link
Member

Regarding your comment there about the vector types, keep in mind that I (for now) only intercepted the matrix case. I can investigate if we can make this work for Vector as well, keeping in mind that the Adjoint wrapper information is still there, it is just not the outer wrapper.

I forgot that was a reason why I was concerned about pushing the wrapper further down but I agree the current design still preserves that information. Definitely something to think carefully about.

@mtfishman
Copy link
Member

I think it would be strange for adjoint(::BlockSparseMatrix) to return BlockSparseMatrix{_,_,<:Adjoint} while adjoint(::BlockSparseVector) returns Adjoint{_,<:BlockSparseVector}. However, returning BlockSparseVector{_,_,<:Adjoint} would mean we need to special case operations for BlockSparseVector{_,_,<:Adjoint}, which is slightly annoying and repeating Base logic. So I'm a bit torn, maybe it would be good to think through how annoying it is to deal with BlockSparseVector{_,_,<:Adjoint} and get things to match Base behavior.

@lkdvos
Copy link
Author

lkdvos commented Nov 8, 2024

I think the type of what Base.adjoint returns should be considered an implementation detail, as already there is actually nothing that forces you to return an Adjoint type (and there really shouldn't). In that sense, I don't think it really matters that these two are different.

Alternatively, I think these lines are not correct, I think if you want to have the Adjoint as a wrapper, the MemoryLayout should be DualLayout{BlockLayout}.

@ogauthe
Copy link
Contributor

ogauthe commented Nov 8, 2024

Indeed I previously proposed a similar change. I understand that extra logic would be needed for the BlockSparseVector case, but I believe the change is worth it. It would automatically fix slicing issues in 1336. It would also simplify dispatch in my use case in FusionTensors. Currently I have to use Union{BlockSparseMatrix,LinearAlgebra.Adjoint{<:Number,<:BlockSparseMatrix}}, to match data_matrix.

@mtfishman
Copy link
Member

mtfishman commented Nov 8, 2024

I agree that to some extent it is an implementation detail, though I'm sure you know these kind of abstractions are "leaky", i.e. inevitably someone will write code like f(::Adjoint{<:Any,<:BlockSparseVector}), or the other way around. For consistency I would prefer that adjoint(a::BlockSparseMatrix) and adjoint(v::BlockSparseVector) return the same type of object, I think it would be surprising otherwise.

Another hesitation I have about doing this right now is that I do want Adjoint(m::BlockSparseMatrix) and Adjoint(v::BlockSparseVector) to return objects (of type Adjoint) that work properly and efficiently as block sparse-like objects, similar to how I want PermutedDimsArray(a::BlockSparseArray, perm) and SubArray(a::BlockSparseArray, inds) to work properly and efficiently as block sparse-like objects. EDIT: I'm worried if we make this change now, we are all just pushing off making fixes to Adjoint(::AbstractBlockSparseVectorOrMatrix) which we should fix anyway (and I don't think will be that hard to fix, we just need to take the time to fix them).

So for now I think we should hold off on this and just work through the current and future issues that come up when using Adjoint{T,<:AbstractBlockSparse[Matrix|Vector]{T}}, and reconsider this later. Maybe we can come up with a design where in the end both formats share a lot of internal code.

Probably you have seen, but the design I have been working with is that when you call blocks(Adjoint(m::BlockSparseMatrix)), it returns a SparseAdjointBlocks object that reinterprets the blocks as the adjoint of the parent blocks (and there are corresponding types like that for other wrappers). Maybe that can be simplified so that blocks(adjoint(m)) directly returns adjoint(blocks(parent(m))) rather than SparseAdjointBlocks, so from a certain perspective it is very similar to the design of this PR, it just happens that the outer type is different (most BlockSparseArray functions call blocks anyway to work with the data). I can't remember why I ended up defining SparseAdjointBlocks rather than directly using Adjoint(blocks(parent(m))), I think I may have tried that at first and faced issues using the Adjoint wrapper so felt like it was cleaner to define my own type (and also the other wrappers have corresponding types so maybe I was just following symmetries with that design). But maybe we should try that out again to see if we can get rid of SparseAdjointBlocks.

Alternatively, I think these lines are not correct, I think if you want to have the Adjoint as a wrapper, the MemoryLayout should be DualLayout{BlockLayout}.

I wasn't aware of DualLayout, thanks for pointing that out. Probably we should use that for the MemoryLayout of adjoint(::AbstractBlockSparseVector), though I think for adjoint(::AbstractBlockSparseMatrix) we can still use BlockLayout, i.e.:

julia> using ArrayLayouts

julia> MemoryLayout(randn(4, 4))
DenseColumnMajor()

julia> MemoryLayout(randn(4, 4)')
ArrayLayouts.DenseRowMajor()

julia> MemoryLayout(randn(4))
DenseColumnMajor()

julia> MemoryLayout(randn(4)')
DualLayout{ArrayLayouts.DenseRowMajor}()

(I really had not thought about adjoint(::AbstractBlockSparseVector) much when designing the current code so I'm sure there are various issues with that.)

@mtfishman
Copy link
Member

mtfishman commented Nov 8, 2024

Indeed I previously proposed a similar change. I understand that extra logic would be needed for the BlockSparseVector case, but I believe the change is worth it. It would automatically fix slicing issues in 1336. It would also simplify dispatch in my use case in FusionTensors. Currently I have to use Union{BlockSparseMatrix,LinearAlgebra.Adjoint{<:Number,<:BlockSparseMatrix}}, to match data_matrix.

I noticed you use that in certain places, but I think you could just use AbstractMatrix instead (similar to how we made code based on GradedAxes more generic by changing more definitions to use AbstractUnitRange rather than hardcoding with concrete types too much). So I think you are conflating issues to some extent, and we should think about the design of FusionTensors separately and design FusionTensors such that it is agnostic about this design choice. As Lukas said, for most code this should just be an internal implementation detail if the code is written generically.

EDIT: We just need to make sure that whatever gets output by adjoint(::AbstractBlockSparseMatrix), it has the functionality that is needed to act as the data matrix of the FusionTensor, which isn't up to debate here, it is just a matter of whether we keep the current design and work through the current (very solvable) bugs or change the design since we think it will make the internal code of BlockSparseArrays simpler/easier to maintain. I'm not convinced of the latter, primarily because I want to be able to support Adjoint(::BlockSparseVectorOrMatrix) wrappers anyway, independent of what we decide will be the output of adjoint(::BlockSparseVectorOrMatrix), and I outlined a design of blocks(::Adjoint{T,BlockSparseMatrix{T}}) that would give many of the same benefits of this PR anyway (in short, blocks(a::Adjoint{T,BlockSparseMatrix{T}}) = blocks(parent(a))', which is the same thing that would get output with this PR).

@mtfishman mtfishman changed the title Simplify and test adjoint(::BlockSparseMatrix) [BlockSparseArrays] Simplify and test adjoint(::BlockSparseMatrix) Nov 8, 2024
@lkdvos
Copy link
Author

lkdvos commented Nov 11, 2024

After discussion, let me close this PR as "not planned", with the goal of ensuring that the wrapper types work.

@lkdvos lkdvos closed this Nov 11, 2024
@mtfishman
Copy link
Member

Sounds good. We can keep this design in mind if the wrapper type becomes too complicated to work with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants