Support for LinearAlgebra.pinv #2070

Closed
lgravina1997 opened this issue Sep 1, 2023 · 11 comments
Labels
- cuda array: Stuff about CuArray.
- enhancement: New feature or request
- good first issue: Good for newcomers
- upstream: Somebody else's problem.

Comments

@lgravina1997 commented Sep 1, 2023

Using pinv() to compute the pseudoinverse of a matrix on the GPU fails. A minimal example is as follows:


using CUDA, LinearAlgebra

A = Matrix(Diagonal([2,0,0,0,0]))
pinv(A)

correctly gives the equivalent of Matrix(Diagonal([0.5,0,0,0,0])). If, however, we take

A_gpu = cu(A)
pinv(A_gpu)

the following error is raised:

GPU compilation of MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::CUDA.CuKernelContext, ::SubArray{Float64, 1, CuDeviceVector{Float64, 1}, Tuple{StepRange{Int64, Int64}}, true}, ::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Tuple{Base.OneTo{Int64}}, LinearAlgebra.var"#34#35", Tuple{Base.Broadcast.Extruded{SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{StepRange{Int64, Int64}}, true}, Tuple{Bool}, Tuple{Int64}}}}, ::Int64) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Tuple{Base.OneTo{Int64}}, LinearAlgebra.var"#34#35", Tuple{Base.Broadcast.Extruded{SubArray{Int64, 1, CuDeviceVector{Int64, 1}, Tuple{StepRange{Int64, Int64}}, true}, Tuple{Bool}, Tuple{Int64}}}}, which is not isbits:
.f is of type LinearAlgebra.var"#34#35" which is not isbits.
.tol is of type Core.Box which is not isbits.
.contents is of type Any which is not isbits.

Details on Julia:
Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 20 × 12th Gen Intel(R) Core(TM) i7-12700K
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, alderlake)
  Threads: 21 on 20 virtual cores
Environment:
  JULIA_NUM_THREADS = auto
Details on CUDA:
CUDA runtime 12.1, artifact installation
CUDA driver 12.0
NVIDIA driver 525.125.6

CUDA libraries: 
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 18.0.0
- NVML: 12.0.0+525.125.6

Julia packages: 
- CUDA: 4.4.1
- CUDA_Driver_jll: 0.5.0+1
- CUDA_Runtime_jll: 0.6.0+0

Toolchain:
- Julia: 1.9.2
- LLVM: 14.0.6
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2, 7.3, 7.4, 7.5
- Device capability support: sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86

1 device:
  0: NVIDIA GeForce RTX 3070 (sm_86, 5.829 GiB / 8.000 GiB available)
@lgravina1997 lgravina1997 added the bug label Sep 1, 2023
@maleadt maleadt added the enhancement and cuda array labels and removed the bug label Sep 1, 2023
@maleadt maleadt changed the title from "pinv not working in simplest cases" to "Support for LinearAlgebra.pinv" Sep 1, 2023
@maleadt (Member) commented Sep 1, 2023

CUDA.jl doesn't promise compatibility with, say, all of LinearAlgebra.jl. As there's no functionality, or tests, for pinv in CUDA.jl or in GPUArrays.jl, that functionality just hasn't been ported (and it doesn't happen to be GPU compatible).

Specifically, here the pinv implementation from LinearAlgebra.jl contains a box, which is GPU incompatible. Getting rid of that box may make the current implementation work on the GPU, and probably would also be good for performance on the CPU, so if you're interested in this operation I would suggest taking a look at that.
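
To illustrate the problem (a hypothetical sketch, not the actual LinearAlgebra source): a closure that captures a variable assigned more than once gets wrapped in a Core.Box, which is not isbits and therefore can't be passed to a GPU kernel. The function names below are made up for the example.

using CUDA

# Hypothetical illustration: `tol` is reassigned inside the function, so the
# closure passed to `map` captures it through a Core.Box (not isbits), which
# fails GPU compilation with a KernelError like the one reported above.
function boxed_reciprocal(xs, tol)
    if tol < 0
        tol = zero(tol)   # second assignment forces the Core.Box
    end
    return map(x -> x > tol ? inv(x) : zero(x), xs)
end

# Binding the final value to a fresh, single-assignment local keeps the
# captured value isbits, so the same map compiles for CuArrays.
function unboxed_reciprocal(xs, tol)
    t = max(tol, zero(tol))
    return map(x -> x > t ? inv(x) : zero(x), xs)
end

# boxed_reciprocal(CUDA.rand(5), 0.1f0)    # KernelError: non-bitstype argument
# unboxed_reciprocal(CUDA.rand(5), 0.1f0)  # works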

@maleadt maleadt added the good first issue label Sep 1, 2023
@Zentrik (Contributor) commented Sep 16, 2023

Possible fix here https://discourse.julialang.org/t/pinv-not-type-stable/103885/13

@Zentrik (Contributor) commented Sep 18, 2023

After JuliaLang/julia#51351, pinv works on CuArrays for me. Presumably you'll have to wait for Julia 1.11 then.

@maleadt (Member) commented Sep 19, 2023

Great, thanks for the update! If anybody cares about this functionality on current versions of Julia, please open a PR on e.g. GPUArrays back-porting this definition (but using ::AbstractGPUArray for dispatch).
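
As a rough, untested sketch of what such a back-port could look like (hypothetical code, not an existing GPUArrays.jl method), dispatching on AbstractGPUArray as suggested: the actual body should be adapted from the type-stable implementation in JuliaLang/julia#51351, the default tolerances below only approximate the Base ones, empty matrices aren't handled, and svd needs to be available for the array type (as it is for CuArray via CUSOLVER).

using GPUArrays, LinearAlgebra

# Hypothetical back-port sketch: pseudoinverse via the SVD, keeping `tol` a
# plain single-assignment local so the closure passed to `map` stays isbits.
function LinearAlgebra.pinv(A::AbstractGPUArray{T,2};
                            atol::Real = 0.0,
                            rtol::Real = eps(real(float(oneunit(T)))) * min(size(A)...) * iszero(atol)) where {T}
    F = svd(A)                                        # CUSOLVER-backed for CuArray
    tol = max(atol, rtol * maximum(F.S))              # no Core.Box here
    Sinv = map(s -> s > tol ? inv(s) : zero(s), F.S)  # invert only the significant singular values
    return F.Vt' * (Sinv .* F.U')                     # V * Σ⁺ * Uᴴ
end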

@maleadt maleadt closed this as completed Sep 19, 2023
@maleadt maleadt added the upstream label Sep 19, 2023
@behinger

For me, pinv doesn't work on the GPU:

pinv(cu(rand(10,10)))

results in ERROR: Scalar indexing is disallowed. Maybe there is something obvious I'm missing.

[052768ef] CUDA v5.5.2

Julia Version 1.11.0
Commit 501a4f25c2b (2024-10-07 11:40 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × AMD EPYC 7452 32-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver2)
Threads: 20 default, 0 interactive, 10 GC (on 128 virtual cores)
Environment:
  LD_LIBRARY_PATH = :/opt/julia-1.10.0/lib/julia/
  JULIA_DEPOT_PATH = ~/.julia
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 20
  JULIA_DEBUG = Main

@maleadt (Member) commented Oct 22, 2024

Works here

julia> CUDA.allowscalar(false)

julia> pinv(cu(rand(10,10)))
10×10 CuArray{Float32, 2, CUDA.DeviceMemory}:
 -18.7796    0.192063   14.539     -7.29527   12.6741   …   2.7728    -0.92594  -34.9468      8.98255
  35.6819   -2.92856   -27.3291    12.2244   -26.8333      -8.74039    7.76778   63.6059    -14.8943
  -2.44133   2.86932    -1.20003    1.08595    3.9517       0.934879  -1.18788    0.477386   -0.278232
 -24.4774    1.57473    17.685     -8.31187   19.321        7.05574   -5.72314  -41.1695     10.0091
  23.554     1.54656   -19.0748     9.83059  -15.4369      -5.54631    2.04141   48.1308    -11.8832
   4.1851   -2.73147    -5.20277    3.73084   -4.98535  …  -0.774467   3.27064    7.51045    -1.7731
 -34.7699   -1.1468     27.7664   -13.1913    22.9859       8.49761   -3.8388   -68.323      17.4873
  21.5302   -2.89991   -14.2583     6.05705  -17.9435      -4.51505    4.37752   34.6052     -8.49921
   7.74482   1.66042    -5.73368    2.92117   -4.26454     -1.98245   -1.15659   15.2358     -3.64701
  -1.65787   2.4189      5.64062   -3.53541    3.54468     -0.596906  -3.18818   -5.78155     0.146442

@behinger commented Oct 22, 2024

thanks for checking - now the obvious question: why?

CUDA.runtime_version()
v"12.6.0"

running on a A6000

edit: (I'm asking because I have no idea how to even approach debugging this - but it is not highest priority - if it works for others, fine with me)

@maleadt (Member) commented Oct 22, 2024

I'm using devved packages, so this is probably a recent fix on CUDA.jl or GPUArrays.jl. Not sure which one though.

@behinger

I tried dev'ing both, but that didn't help here. Thanks for the comment.

@maleadt (Member) commented Oct 24, 2024

Yeah, I'm not sure what to say here...

❯ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.1 (2024-10-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(@v1.11) pkg> activate --temp
  Activating new project at `/tmp/jl_lUSpdY`

(jl_lUSpdY) pkg> add CUDA#master
    Updating git-repo `https://github.com/JuliaGPU/CUDA.jl.git`
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed GPUCompiler ─ v1.0.1
   Installed LLVM ──────── v9.1.3
    Updating `/tmp/jl_lUSpdY/Project.toml`
  [052768ef] + CUDA v5.5.2 `https://github.com/JuliaGPU/CUDA.jl.git#master`

julia> using CUDA, LinearAlgebra

julia> CUDA.allowscalar(false)

julia> pinv(cu(rand(10,10)))
10×10 CuArray{Float32, 2, CUDA.DeviceMemory}:
   0.79093    -3.27        1.61866    0.332101    1.51121   -0.141722  -0.789732    2.41107   -0.787362  -0.748518
   3.02281    -4.83408     2.19427    0.673688    2.17965   -1.5945    -2.05704     3.44536    0.593075  -1.61866
  -4.06717     5.31124    -3.32017   -0.383606   -0.741184   1.35816    0.646221   -3.79391   -0.792143   3.22927
   4.30408    -5.67278     3.60602   -0.373965    2.28992   -2.24799   -2.40252     5.19473    0.52331   -1.9391
   3.73021    -8.38128     4.81611    1.22624     3.23734   -2.7525    -4.09566     7.25994   -0.109592  -1.62358
   0.32791    -0.108507   -0.248915   0.156819   -0.352967   0.598659  -0.900688    0.613525  -0.323573   0.516044
  13.3623    -19.4055     13.5906     1.24436     6.29929   -6.5096    -9.11556    16.5429     0.802335  -6.99507
 -11.327      19.4026    -12.4005    -1.12684    -7.04506    5.5385    10.12      -16.9265    -0.316425   5.52855
   0.606266    2.28777    -0.466847  -1.12594    -2.0018     0.646723   1.4594     -2.69254    1.71311    0.109465
  -7.48365     9.57157    -6.09877    0.0717196  -2.85697    3.2981     4.4907     -7.35427   -1.07708    2.55932

@behinger

Most likely some dependency in my project led to the issue; I'm not exactly sure why it didn't show up when adding the dev version with the compat check. Super sorry for not testing in a clean environment. Thanks for your help, it works for me now too.
