Hello,
I cannot run the code from KernelAbstractions.jl on CUDA with an H100. The same code ran on an A100 without any problem. I already tried updating the packages, but they are held back by compatibility constraints. I also tried changing the CUDA runtime version several times, without success. Any advice would help a lot. Thank you!
I am compiling and running the code as follows:
julia --project=KernelAbstractions -e 'import Pkg; Pkg.instantiate()'
julia --project=KernelAbstractions src/KernelAbstractions.jl
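For anyone trying to reproduce this, the version and package details listed next can be collected with commands along these lines. This is only a sketch: it assumes `julia` is on the PATH and that the project directory is named `KernelAbstractions`, as in the commands above. The `PROJECT` variable is just a placeholder introduced for this example.

```shell
#!/bin/sh
# Assumption: same project directory name as in the commands above.
PROJECT=KernelAbstractions

if command -v julia >/dev/null 2>&1; then
    # Print the CUDA runtime, driver, and device summary.
    julia --project="$PROJECT" -e 'using CUDA; CUDA.versioninfo()'
    # Print the resolved package versions for the project.
    julia --project="$PROJECT" -e 'import Pkg; Pkg.status()'
else
    echo "julia not found on PATH" >&2
fi
```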
CUDA.versioninfo():
CUDA runtime 11.8, artifact installation
CUDA driver 12.4
NVIDIA driver 550.90.7
Libraries:
Toolchain:
2 devices:
0: NVIDIA H100 NVL (sm_90, 93.000 GiB / 93.584 GiB available)
1: NVIDIA H100 NVL (sm_90, 93.000 GiB / 93.584 GiB available)
Pkg status:
⌅ [21141c5a] AMDGPU v0.4.8
[c7e460c6] ArgParse v1.2.0
⌅ [052768ef] CUDA v4.0.1
[72cfdca4] CUDAKernels v0.4.7
⌅ [63c18a36] KernelAbstractions v0.8.6
[d96e819e] Parameters v0.12.3
[7eb9e9f0] ROCKernels v0.3.5
[90137ffa] StaticArrays v1.9.7
This is the error I am getting:
ERROR: LoadError: CUDA error: device kernel image is invalid (code 300, ERROR_INVALID_SOURCE)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ./strings/substring.jl:222
[2] CuModule(data::Vector{UInt8}, options::Dict{CUDA.CUjit_option_enum, Any})
@ CUDA ~/.julia/packages/CUDA/ZdCxS/lib/cudadrv/module.jl:60
[3] CuModule
@ ~/.julia/packages/CUDA/ZdCxS/lib/cudadrv/module.jl:23 [inlined]
[4] cufunction_link(job::GPUCompiler.CompilerJob, compiled::NamedTuple{(:image, :entry, :external_gvars), Tuple{Vector{UInt8}, String, Vector{String}}})
@ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:488
[5] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
@ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/cache.jl:95
[6] cufunction(f::typeof(gpu_fasten_main), tt::Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.StaticSize{(16384,)}, KernelAbstractions.NDIteration.DynamicCheck, Nothing, Nothing, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.StaticSize{(256,)}, KernelAbstractions.NDIteration.StaticSize{(64,)}, Nothing, Nothing}}, Int64, Int64, CuDeviceVector{Atom, 1}, CuDeviceVector{Atom, 1}, CuDeviceVector{FFParams, 1}, CuDeviceMatrix{Float32, 1}, CuDeviceVector{Float32, 1}, Val{4}}}; name::Nothing, always_inline::Bool, kwargs::Base.Pairs{Symbol, Int64, Tuple{Symbol}, NamedTuple{(:maxthreads,), Tuple{Int64}}})
@ CUDA ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:306
[7] macro expansion
@ ~/.julia/packages/CUDA/ZdCxS/src/compiler/execution.jl:102 [inlined]
[8] (::KernelAbstractions.Kernel{CUDADevice{false, false}, KernelAbstractions.NDIteration.StaticSize{(64,)}, KernelAbstractions.NDIteration.StaticSize{(16384,)}, typeof(gpu_fasten_main)})(::Int64, ::Vararg{Any}; ndrange::Int64, dependencies::CUDAKernels.CudaEvent, workgroupsize::Nothing, progress::Function)
@ CUDAKernels ~/.julia/packages/CUDAKernels/3IKLV/src/CUDAKernels.jl:283
[9] run(params::Params, deck::Deck, device::Tuple{CuDevice, String, Backend})
@ Main ~/Fall2024/miniBUDE/src/julia/miniBUDE.jl/src/KernelAbstractions.jl:146
[10] main()
@ Main ~/Fall2024/miniBUDE/src/julia/miniBUDE.jl/src/BUDE.jl:222
[11] top-level scope
@ ~/Fall2024/miniBUDE/src/julia/miniBUDE.jl/src/KernelAbstractions.jl:313
in expression starting at /home/xa2/Fall2024/miniBUDE/src/julia/miniBUDE.jl/src/KernelAbstractions.jl:313