Julia 1.11 Unsupported #1358

Open
gdalle opened this issue Mar 25, 2024 · 21 comments

@gdalle
Contributor

gdalle commented Mar 25, 2024

I wanted to use split reverse mode in order to compute pullbacks with array outputs. The only doc I found is

https://enzymead.github.io/Enzyme.jl/stable/api/#EnzymeCore.autodiff_thunk-Union{Tuple{RABI},%20Tuple{ModifiedBetweenT},%20Tuple{Width},%20Tuple{ReturnShadow},%20Tuple{ReturnPrimal},%20Tuple{A},%20Tuple{FA},%20Tuple{EnzymeCore.ReverseModeSplit{ReturnPrimal,%20ReturnShadow,%20Width,%20ModifiedBetweenT,%20RABI},%20Type{FA},%20Type{A},%20Vararg{Any}}}%20where%20{FA%3C:Annotation,%20A%3C:Annotation,%20ReturnPrimal,%20ReturnShadow,%20Width,%20ModifiedBetweenT,%20RABI%3C:EnzymeCore.ABI}

so I tried it but it fails on Julia 1.11:

julia> using Enzyme

julia> A = [2.2]; ∂A = zero(A)
1-element Vector{Float64}:
 0.0

julia> v = 3.3
3.3

julia> function f(A, v)
           res = A[1] * v
           A[1] = 0
           res
       end
f (generic function with 1 method)

julia> forward, reverse = autodiff_thunk(ReverseSplitWithPrimal, Const{typeof(f)}, Active, Duplicated{typeof(A)}, Active{typeof(v)})
ERROR: MethodError: no method matching get_inference_world(::Enzyme.Compiler.Interpreter.EnzymeInterpreter)
The function `get_inference_world` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  get_inference_world(::REPL.REPLCompletions.REPLInterpreter)
   @ REPL ~/.julia/juliaup/julia-1.11.0-alpha2+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/REPLCompletions.jl:550
  get_inference_world(::Core.Compiler.NativeInterpreter)
   @ Core compiler/types.jl:402

Stacktrace:
       internal @ Core.Compiler, Enzyme.Compiler, GPUCompiler, Core, Unknown
 [11] autodiff_thunk(::EnzymeCore.ReverseModeSplit{true, true, 0, true, FFIABI}, ::Type{Const{typeof(f)}}, ::Type{Active}, ::Type, ::Vararg{Type})
    @ Enzyme ~/.julia/packages/Enzyme/l4FS0/src/Enzyme.jl:524
Use `err` to retrieve the full stack trace.

julia> err
1-element ExceptionStack:
MethodError: no method matching get_inference_world(::Enzyme.Compiler.Interpreter.EnzymeInterpreter)
The function `get_inference_world` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  get_inference_world(::REPL.REPLCompletions.REPLInterpreter)
   @ REPL ~/.julia/juliaup/julia-1.11.0-alpha2+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/REPLCompletions.jl:550
  get_inference_world(::Core.Compiler.NativeInterpreter)
   @ Core compiler/types.jl:402

Stacktrace:
  [1] Core.Compiler.InferenceState(result::Core.Compiler.InferenceResult, cache_mode::UInt8, interp::Enzyme.Compiler.Interpreter.EnzymeInterpreter)
    @ Core.Compiler ./compiler/inferencestate.jl:493
  [2] Core.Compiler.InferenceState(result::Core.Compiler.InferenceResult, cache_mode::Symbol, interp::Enzyme.Compiler.Interpreter.EnzymeInterpreter)
    @ Core.Compiler ./compiler/inferencestate.jl:499
  [3] typeinf(interp::Enzyme.Compiler.Interpreter.EnzymeInterpreter, result::Core.Compiler.InferenceResult, cache_mode::Symbol)
    @ Core.Compiler ./compiler/typeinfer.jl:9
  [4] typeinf_type(interp::Enzyme.Compiler.Interpreter.EnzymeInterpreter, mi::Core.MethodInstance)
    @ Core.Compiler ./compiler/typeinfer.jl:1072
  [5] typeinf_type(interp::Enzyme.Compiler.Interpreter.EnzymeInterpreter, method::Method, atype::Any, sparams::Core.SimpleVector)
    @ Core.Compiler ./compiler/typeinfer.jl:1059
  [6] (::Enzyme.Compiler.var"#532#533"{})(ctx::LLVM.Context)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:5495
  [7] JuliaContext(f::Enzyme.Compiler.var"#532#533"{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/U36Ed/src/driver.jl:47
  [8] #s1883#531
    @ ~/.julia/packages/Enzyme/l4FS0/src/compiler.jl:5482 [inlined]
  [9] 
    @ Enzyme.Compiler ./none:0
 [10] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:705
 [11] autodiff_thunk(::EnzymeCore.ReverseModeSplit{true, true, 0, true, FFIABI}, ::Type{Const{typeof(f)}}, ::Type{Active}, ::Type, ::Vararg{Type})
    @ Enzyme ~/.julia/packages/Enzyme/l4FS0/src/Enzyme.jl:524
 [12] top-level scope
    @ REPL[23]:1
Some type information was truncated. Use `show(err)` to see complete types.

(docs) pkg> st
Status `~/Work/GitHub/Julia/DifferentiationInterface.jl/docs/Project.toml`
  [47edcb42] ADTypes v0.2.7
  [d360d2e6] ChainRulesCore v1.23.0
  [0ca39b1e] Chairmarks v1.2.0
  [a93c6f00] DataFrames v1.6.1
  [163ba53b] DiffResults v1.1.0
  [a0c0ee7d] DifferentiationInterface v0.1.0 `..`
  [9f5e2b26] Diffractor v0.2.5
  [e30172f5] Documenter v1.3.0
  [a078cd44] DocumenterMermaid v0.1.1
  [7da242da] Enzyme v0.11.20
  [eb9bf01b] FastDifferentiation v0.3.6
  [1a297f60] FillArrays v1.9.3
  [6a86dc24] FiniteDiff v2.23.0
  [26cc04aa] FiniteDifferences v0.12.31
  [f6369f11] ForwardDiff v0.10.36
  [c3a54625] JET v0.8.29
  [98d1487c] PolyesterForwardDiff v0.1.1
  [37e2e3b7] ReverseDiff v1.15.1
  [9f7883ad] Tracker v0.2.33
  [e88e6eb3] Zygote v0.6.69
  [d6f4376e] Markdown v1.11.0
  [9a3f8284] Random v1.11.0
  [8dfed614] Test v1.11.0

julia> versioninfo()
Julia Version 1.11.0-alpha2
Commit 9dfd28ab751 (2024-03-18 20:35 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 12 virtual cores)
@vchuravy
Member

We need an update to the new GPUCompiler version

@wsmoses changed the title from "Split reverse mode example fails on Julia 1.11 but works on 1.10" to "Julia 1.11 Unsupported" on Mar 25, 2024
@mofeing
Contributor

mofeing commented Jul 15, 2024

I'm getting the following error here

ERROR: LoadError: UndefVarError: `PassBuilder` not defined in `Enzyme.Compiler`
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/LLVM/5DlHM/src/base.jl:96 [inlined]
  [2] (::Enzyme.Compiler.var"#prop_julia_addr#28416"{LLVM.TargetMachine})(f::LLVM.Function)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:75
  [3] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/pass.jl:49
  [4] LLVMRunPassManager
    @ ~/.julia/packages/LLVM/5DlHM/lib/16/libLLVM.jl:3351 [inlined]
  [5] run!
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:39 [inlined]
  [6] (::Enzyme.Compiler.var"#28512#28513"{LLVM.Module, LLVM.TargetMachine})(pm::LLVM.ModulePassManager)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:2029
  [7] LLVM.ModulePassManager(::Enzyme.Compiler.var"#28512#28513"{LLVM.Module, LLVM.TargetMachine}; kwargs::@Kwargs{})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:33
  [8] ModulePassManager
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:30 [inlined]
  [9] optimize!
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:1951 [inlined]
 [10] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:5787
 [11] codegen
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:5194 [inlined]
 [12] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6682
 [13] _thunk
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6682 [inlined]
 [14] cached_compilation
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6720 [inlined]
 [15] (::Enzyme.Compiler.var"#28633#28634"{Active, FFIABI, Const{typeof(loss_function)}, Enzyme.API.DEM_ReverseModeCombined, (false, false, false, false, false, false), true, false, Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}, 0x00000000000068d4, 1, Core.MethodInstance})(ctx::LLVM.Context)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6795
 [16] JuliaContext(f::Enzyme.Compiler.var"#28633#28634"{Active, FFIABI, Const{typeof(loss_function)}, Enzyme.API.DEM_ReverseModeCombined, (false, false, false, false, false, false), true, false, Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}, 0x00000000000068d4, 1, Core.MethodInstance}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:52
 [17] JuliaContext
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:42 [inlined]
 [18] thunkbase(mi::Core.MethodInstance, ::Val{0x00000000000068d4}, ::Type{Const{typeof(loss_function)}}, ::Type{Active}, tt::Type{Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}}, ::Val{Enzyme.API.DEM_ReverseModeCombined}, ::Val{1}, ::Val{(false, false, false, false, false, false)}, ::Val{true}, ::Val{false}, ::Type{FFIABI})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6740
 [19] #s2021#28635
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6826 [inlined]
 [20] var"#s2021#28635"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ::Any, ::Any, ::Any, ::Any, tt::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
    @ Enzyme.Compiler ./none:0
 [21] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:709
 [22] autodiff
    @ ~/.julia/packages/Enzyme/Pljwm/src/Enzyme.jl:309 [inlined]
 [23] autodiff
    @ ~/.julia/packages/Enzyme/Pljwm/src/Enzyme.jl:326 [inlined]
 [24] gradient_loss_function(model::Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}, x::Matrix{Float32}, y::OneHotMatrix{UInt32, Vector{UInt32}}, ps::@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}, st::@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}})
    @ Main ~/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:65
 [25] top-level scope
    @ ~/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:78
 [26] include(fname::String)
    @ Main ./sysimg.jl:38
 [27] top-level scope
    @ ~/work/Reactant.jl/Reactant.jl/test/runtests.jl:49
 [28] include(fname::String)
    @ Main ./sysimg.jl:38
 [29] top-level scope
    @ none:6
in expression starting at /home/runner/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:78
in expression starting at /home/runner/work/Reactant.jl/Reactant.jl/test/runtests.jl:49
Package Reactant errored during testing

@vchuravy
Member

Is this on main?

@avik-pal
Contributor

> Is this on main?

I am getting this on main.

@vchuravy
Member

What is your Manifest? Did you re-resolve? This most likely means you ended up with an old version of GPUCompiler.
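
One way to check which versions were actually resolved is to ask Pkg about the active environment. The following is a minimal sketch using `Pkg.dependencies()` from the Pkg standard library; `resolved_version` is a hypothetical helper name introduced here for illustration, not part of Pkg or Enzyme.

```julia
using Pkg

# Hypothetical helper: return the version resolved for package `name` in the
# active environment, or `nothing` if the package is absent from the resolved
# dependency graph (or has no recorded version).
function resolved_version(name::AbstractString)
    for (_, info) in Pkg.dependencies()
        info.name == name && return info.version
    end
    return nothing
end

# Report the packages that matter for this issue.
for pkg in ("Enzyme", "GPUCompiler", "LLVM")
    v = resolved_version(pkg)
    println(rpad(pkg, 12), v === nothing ? "not found" : string("v", v))
end
```

If GPUCompiler shows up older than expected, running `Pkg.update()` or deleting the Manifest and re-resolving is the usual next step.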

@wsmoses
Member

wsmoses commented Jul 17, 2024

I think Enzyme has an old GPUCompiler in its compat, which we should likely remove

@avik-pal
Contributor

avik-pal commented Jul 21, 2024

With Enzyme#main, Enzyme_jll#main, and GPUCompiler#master, I am still getting a failure on 1.11:

using Enzyme

x = rand(Float32, 32);

Enzyme.gradient(Reverse, sum, x)
Error
Closest candidates are:
  cpu_features!(::LLVM.Module)
   @ GPUCompiler ~/.julia/packages/GPUCompiler/89rev/src/optim.jl:290

Stacktrace:
  [1] (::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine})(pm::LLVM.ModulePassManager)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler/optimize.jl:1965
  [2] LLVM.ModulePassManager(::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine}; kwargs::@Kwargs{})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:33
  [3] ModulePassManager
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:30
  [4] optimize!(mod::LLVM.Module, tm::LLVM.TargetMachine)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler/optimize.jl:1951
  [5] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:5807
  [6] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6710
  [7] cached_compilation
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6748 [inlined]
  [8] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{0x0000000000006819}, ::Type{Const{…}}, ::Type{Active}, tt::Type{Tuple{…}}, ::Val{Enzyme.API.DEM_ReverseModeCombined}, ::Val{1}, ::Val{(false, false)}, ::Val{false}, ::Val{false}, ::Type{FFIABI})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6821
  [9] #s2021#28415
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6873 [inlined]
 [10] var"#s2021#28415"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ::Any, ::Any, ::Any, ::Any, tt::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
    @ Enzyme.Compiler ./none:0
 [11] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:709
 [12] autodiff
    @ ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:309 [inlined]
 [13] autodiff
    @ ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:326 [inlined]
 [14] gradient(rm::ReverseMode{false, FFIABI, false}, f::typeof(sum), x::Vector{Float32})
    @ Enzyme ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:1027
 [15] top-level scope
    @ REPL[3]:1
 [16] top-level scope
    @ none:1
Some type information was truncated. Use `show(err)` to see complete types.
Manifest.toml
# This file is machine-generated - editing it directly is not advised

julia_version = "1.11.0-rc1"
manifest_format = "2.0"
project_hash = "9d9865fdd982a60cc61fa85e3720d1f44ecb386c"

[[deps.ArgTools]]
uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f"
version = "1.1.2"

[[deps.Artifacts]]
uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33"
version = "1.11.0"

[[deps.Base64]]
uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f"
version = "1.11.0"

[[deps.CEnum]]
git-tree-sha1 = "389ad5c84de1ae7cf0e28e381131c98ea87d54fc"
uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82"
version = "0.5.0"

[[deps.CompilerSupportLibraries_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae"
version = "1.1.1+0"

[[deps.Dates]]
deps = ["Printf"]
uuid = "ade2ca70-3891-5945-98fb-dc099432e06a"
version = "1.11.0"

[[deps.Downloads]]
deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"]
uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6"
version = "1.6.0"

[[deps.Enzyme]]
deps = ["CEnum", "EnzymeCore", "Enzyme_jll", "GPUCompiler", "LLVM", "Libdl", "LinearAlgebra", "ObjectFile", "Preferences", "Printf", "Random"]
git-tree-sha1 = "b11d3c1aa7166ef05331ee762e7e1108722af436"
repo-rev = "main"
repo-url = "https://github.com/EnzymeAD/Enzyme.jl.git"
uuid = "7da242da-08ed-463a-9acd-ee780be4f1d9"
version = "0.12.24"

    [deps.Enzyme.extensions]
    EnzymeChainRulesCoreExt = "ChainRulesCore"
    EnzymeLogExpFunctionsExt = "LogExpFunctions"
    EnzymeSpecialFunctionsExt = "SpecialFunctions"
    EnzymeStaticArraysExt = "StaticArrays"

    [deps.Enzyme.weakdeps]
    ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
    LogExpFunctions = "2ab3a3ac-af41-5b50-aa03-7779005ae688"
    SpecialFunctions = "276daf66-3868-5448-9aa4-cd146d93841b"
    StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"

[[deps.EnzymeCore]]
git-tree-sha1 = "d445df66dd8761a4c27df950db89c6a3a0629fe7"
uuid = "f151be2c-9106-41f4-ab19-57ee4f262869"
version = "0.7.7"

    [deps.EnzymeCore.extensions]
    AdaptExt = "Adapt"

    [deps.EnzymeCore.weakdeps]
    Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"

[[deps.Enzyme_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "a509ac21f4f44df224bf7cb6901fa4236d015b5e"
repo-rev = "main"
repo-url = "https://github.com/JuliaBinaryWrappers/Enzyme_jll.jl.git"
uuid = "7cc45869-7501-5eee-bdea-0790c847d4ef"
version = "0.0.136+0"

[[deps.ExprTools]]
git-tree-sha1 = "27415f162e6028e81c72b82ef756bf321213b6ec"
uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04"
version = "0.1.10"

[[deps.FileWatching]]
uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee"
version = "1.11.0"

[[deps.GPUCompiler]]
deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "PrecompileTools", "Preferences", "Scratch", "Serialization", "TOML", "TimerOutputs", "UUIDs"]
git-tree-sha1 = "36e1cbe62869fe2a95958d0d3fcd5dad47cac6fd"
repo-rev = "master"
repo-url = "https://github.com/JuliaGPU/GPUCompiler.jl.git"
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.26.7"

[[deps.InteractiveUtils]]
deps = ["Markdown"]
uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
version = "1.11.0"

[[deps.JLLWrappers]]
deps = ["Artifacts", "Preferences"]
git-tree-sha1 = "7e5d6779a1e09a36db2a7b6cff50942a0a7d0fca"
uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210"
version = "1.5.0"

[[deps.LLVM]]
deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Preferences", "Printf", "Requires", "Unicode"]
git-tree-sha1 = "020abd49586480c1be84f57da0017b5d3db73f7c"
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "8.0.0"

    [deps.LLVM.extensions]
    BFloat16sExt = "BFloat16s"

    [deps.LLVM.weakdeps]
    BFloat16s = "ab4f0b2a-ad5b-11e8-123f-65d77653426b"

[[deps.LLVMExtra_jll]]
deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"]
git-tree-sha1 = "c2636c264861edc6d305e6b4d528f09566d24c5e"
uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab"
version = "0.0.30+0"

[[deps.LazyArtifacts]]
deps = ["Artifacts", "Pkg"]
uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
version = "1.11.0"

[[deps.LibCURL]]
deps = ["LibCURL_jll", "MozillaCACerts_jll"]
uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21"
version = "0.6.4"

[[deps.LibCURL_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"]
uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0"
version = "8.6.0+0"

[[deps.LibGit2]]
deps = ["Base64", "LibGit2_jll", "NetworkOptions", "Printf", "SHA"]
uuid = "76f85450-5226-5b5a-8eaa-529ad045b433"
version = "1.11.0"

[[deps.LibGit2_jll]]
deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll"]
uuid = "e37daf67-58a4-590a-8e99-b0245dd2ffc5"
version = "1.7.2+0"

[[deps.LibSSH2_jll]]
deps = ["Artifacts", "Libdl", "MbedTLS_jll"]
uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8"
version = "1.11.0+1"

[[deps.Libdl]]
uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
version = "1.11.0"

[[deps.LinearAlgebra]]
deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"]
uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
version = "1.11.0"

[[deps.Logging]]
uuid = "56ddb016-857b-54e1-b83d-db4d58db5568"
version = "1.11.0"

[[deps.Markdown]]
deps = ["Base64"]
uuid = "d6f4376e-aef5-505a-96c1-9c027394607a"
version = "1.11.0"

[[deps.MbedTLS_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1"
version = "2.28.6+0"

[[deps.MozillaCACerts_jll]]
uuid = "14a3606d-f60d-562e-9121-12d972cd8159"
version = "2023.12.12"

[[deps.NetworkOptions]]
uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908"
version = "1.2.0"

[[deps.ObjectFile]]
deps = ["Reexport", "StructIO"]
git-tree-sha1 = "195e0a19842f678dd3473ceafbe9d82dfacc583c"
uuid = "d8793406-e978-5875-9003-1fc021f44a92"
version = "0.4.1"

[[deps.OpenBLAS_jll]]
deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"]
uuid = "4536629a-c528-5b80-bd46-f80d51c5b363"
version = "0.3.27+1"

[[deps.Pkg]]
deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "Random", "SHA", "TOML", "Tar", "UUIDs", "p7zip_jll"]
uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
version = "1.11.0"

    [deps.Pkg.extensions]
    REPLExt = "REPL"

    [deps.Pkg.weakdeps]
    REPL = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb"

[[deps.PrecompileTools]]
deps = ["Preferences"]
git-tree-sha1 = "5aa36f7049a63a1528fe8f7c3f2113413ffd4e1f"
uuid = "aea7be01-6a6a-4083-8856-8a6e6704d82a"
version = "1.2.1"

[[deps.Preferences]]
deps = ["TOML"]
git-tree-sha1 = "9306f6085165d270f7e3db02af26a400d580f5c6"
uuid = "21216c6a-2e73-6563-6e65-726566657250"
version = "1.4.3"

[[deps.Printf]]
deps = ["Unicode"]
uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7"
version = "1.11.0"

[[deps.Random]]
deps = ["SHA"]
uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
version = "1.11.0"

[[deps.Reexport]]
git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b"
uuid = "189a3867-3050-52da-a836-e630ba90ab69"
version = "1.2.2"

[[deps.Requires]]
deps = ["UUIDs"]
git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7"
uuid = "ae029012-a4dd-5104-9daa-d747884805df"
version = "1.3.0"

[[deps.SHA]]
uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce"
version = "0.7.0"

[[deps.Scratch]]
deps = ["Dates"]
git-tree-sha1 = "3bac05bc7e74a75fd9cba4295cde4045d9fe2386"
uuid = "6c6a2e73-6563-6170-7368-637461726353"
version = "1.2.1"

[[deps.Serialization]]
uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
version = "1.11.0"

[[deps.StructIO]]
deps = ["Test"]
git-tree-sha1 = "010dc73c7146869c042b49adcdb6bf528c12e859"
uuid = "53d494c1-5632-5724-8f4c-31dff12d585f"
version = "0.3.0"

[[deps.TOML]]
deps = ["Dates"]
uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
version = "1.0.3"

[[deps.Tar]]
deps = ["ArgTools", "SHA"]
uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e"
version = "1.10.0"

[[deps.Test]]
deps = ["InteractiveUtils", "Logging", "Random", "Serialization"]
uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
version = "1.11.0"

[[deps.TimerOutputs]]
deps = ["ExprTools", "Printf"]
git-tree-sha1 = "5a13ae8a41237cff5ecf34f73eb1b8f42fff6531"
uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
version = "0.5.24"

[[deps.UUIDs]]
deps = ["Random", "SHA"]
uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4"
version = "1.11.0"

[[deps.Unicode]]
uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5"
version = "1.11.0"

[[deps.Zlib_jll]]
deps = ["Libdl"]
uuid = "83775a58-1f1d-513f-b197-d71354ab007a"
version = "1.2.13+1"

[[deps.libblastrampoline_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850b90-86db-534c-a0d3-1478176c7d93"
version = "5.8.0+1"

[[deps.nghttp2_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d"
version = "1.59.0+0"

[[deps.p7zip_jll]]
deps = ["Artifacts", "Libdl"]
uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0"
version = "17.4.0+2"
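
To pull the relevant versions out of a posted manifest like the one above without scanning it by eye, the file can be parsed with the TOML standard library. This is only a sketch: `manifest_versions` is a hypothetical helper, and `manifest_text` is a trimmed copy holding just the GPUCompiler and LLVM entries from the manifest above.

```julia
using TOML

# Trimmed-down manifest (format 2.0). The two entries are copied from the
# Manifest.toml above; all other packages are omitted for brevity.
manifest_text = """
julia_version = "1.11.0-rc1"
manifest_format = "2.0"

[[deps.GPUCompiler]]
uuid = "61eb1bfa-7361-4325-ad38-22787b887f55"
version = "0.26.7"

[[deps.LLVM]]
uuid = "929cbde3-209d-540e-8aea-75f648917ca0"
version = "8.0.0"
"""

# Hypothetical helper: map each package name to the versions recorded for it.
# In manifest format 2.0 every name maps to an array of entries (usually of
# length one), so one version is collected per entry that has a version key.
function manifest_versions(text::AbstractString)
    deps = get(TOML.parse(text), "deps", Dict{String,Any}())
    versions = Dict{String,Vector{VersionNumber}}()
    for (name, entries) in deps
        versions[name] = VersionNumber[VersionNumber(e["version"])
                                       for e in entries if haskey(e, "version")]
    end
    return versions
end

versions = manifest_versions(manifest_text)
println("GPUCompiler: ", versions["GPUCompiler"])
println("LLVM:        ", versions["LLVM"])
```

For a real file, `read("Manifest.toml", String)` can be passed in place of `manifest_text`.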

After installing the latest release of GPUCompiler, I get a different error:

Error
ERROR: UndefVarError: `PassBuilder` not defined in `Enzyme.Compiler`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/LLVM/5DlHM/src/base.jl:96 [inlined]
  [2] (::Enzyme.Compiler.var"#prop_julia_addr#28202"{LLVM.TargetMachine})(f::LLVM.Function)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler/optimize.jl:75
  [3] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/pass.jl:49
  [4] LLVMRunPassManager
    @ ~/.julia/packages/LLVM/5DlHM/lib/16/libLLVM.jl:3351 [inlined]
  [5] run!
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:39 [inlined]
  [6] (::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine})(pm::LLVM.ModulePassManager)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler/optimize.jl:2029
  [7] LLVM.ModulePassManager(::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine}; kwargs::@Kwargs{})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:33
  [8] ModulePassManager
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:30 [inlined]
  [9] optimize!
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler/optimize.jl:1951 [inlined]
 [10] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:5807
 [11] codegen
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:5208 [inlined]
 [12] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6710
 [13] _thunk
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6710 [inlined]
 [14] cached_compilation
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6748 [inlined]
 [15] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{0x0000000000006819}, ::Type{Const{…}}, ::Type{Active}, tt::Type{Tuple{…}}, ::Val{Enzyme.API.DEM_ReverseModeCombined}, ::Val{1}, ::Val{(false, false)}, ::Val{false}, ::Val{false}, ::Type{FFIABI})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6821
 [16] #s2021#28415
    @ ~/.julia/packages/Enzyme/YDcYf/src/compiler.jl:6873 [inlined]
 [17] var"#s2021#28415"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ::Any, ::Any, ::Any, ::Any, tt::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
    @ Enzyme.Compiler ./none:0
 [18] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:709
 [19] autodiff
    @ ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:309 [inlined]
 [20] autodiff
    @ ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:326 [inlined]
 [21] gradient(rm::ReverseMode{false, FFIABI, false}, f::typeof(sum), x::Vector{Float32})
    @ Enzyme ~/.julia/packages/Enzyme/YDcYf/src/Enzyme.jl:1027
 [22] top-level scope
    @ REPL[3]:1
 [23] top-level scope
    @ none:1
Some type information was truncated. Use `show(err)` to see complete types.

@avik-pal
Contributor

That said, Enzyme definitely has some older GPUCompiler versions in its compat that shouldn't be there. Currently, installing AMDGPU (which installs GPUCompiler 0.26.5) causes Enzyme to fail to precompile: https://buildkite.com/julialang/luxlib-dot-jl/builds/835#0190d6c1-141c-4f6b-ab1d-eec8c2e4f7bc/317-648

@frankwswang

I'm getting the following error here

ERROR: LoadError: UndefVarError: `PassBuilder` not defined in `Enzyme.Compiler`
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/LLVM/5DlHM/src/base.jl:96 [inlined]
  [2] (::Enzyme.Compiler.var"#prop_julia_addr#28416"{LLVM.TargetMachine})(f::LLVM.Function)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:75
  [3] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/pass.jl:49
  [4] LLVMRunPassManager
    @ ~/.julia/packages/LLVM/5DlHM/lib/16/libLLVM.jl:3351 [inlined]
  [5] run!
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:39 [inlined]
  [6] (::Enzyme.Compiler.var"#28512#28513"{LLVM.Module, LLVM.TargetMachine})(pm::LLVM.ModulePassManager)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:2029
  [7] LLVM.ModulePassManager(::Enzyme.Compiler.var"#28512#28513"{LLVM.Module, LLVM.TargetMachine}; kwargs::@Kwargs{})
    @ LLVM ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:33
  [8] ModulePassManager
    @ ~/.julia/packages/LLVM/5DlHM/src/passmanager.jl:30 [inlined]
  [9] optimize!
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler/optimize.jl:1951 [inlined]
 [10] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:5787
 [11] codegen
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:5194 [inlined]
 [12] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6682
 [13] _thunk
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6682 [inlined]
 [14] cached_compilation
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6720 [inlined]
 [15] (::Enzyme.Compiler.var"#28633#28634"{Active, FFIABI, Const{typeof(loss_function)}, Enzyme.API.DEM_ReverseModeCombined, (false, false, false, false, false, false), true, false, Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}, 0x00000000000068d4, 1, Core.MethodInstance})(ctx::LLVM.Context)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6795
 [16] JuliaContext(f::Enzyme.Compiler.var"#28633#28634"{Active, FFIABI, Const{typeof(loss_function)}, Enzyme.API.DEM_ReverseModeCombined, (false, false, false, false, false, false), true, false, Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}, 0x00000000000068d4, 1, Core.MethodInstance}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:52
 [17] JuliaContext
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:42 [inlined]
 [18] thunkbase(mi::Core.MethodInstance, ::Val{0x00000000000068d4}, ::Type{Const{typeof(loss_function)}}, ::Type{Active}, tt::Type{Tuple{Const{Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}}, Const{Matrix{Float32}}, Const{OneHotMatrix{UInt32, Vector{UInt32}}}, Duplicated{@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}}, Const{@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}}}}}, ::Val{Enzyme.API.DEM_ReverseModeCombined}, ::Val{1}, ::Val{(false, false, false, false, false, false)}, ::Val{true}, ::Val{false}, ::Type{FFIABI})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6740
 [19] #s2021#28635
    @ ~/.julia/packages/Enzyme/Pljwm/src/compiler.jl:6826 [inlined]
 [20] var"#s2021#28635"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ::Any, ::Any, ::Any, ::Any, tt::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
    @ Enzyme.Compiler ./none:0
 [21] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:709
 [22] autodiff
    @ ~/.julia/packages/Enzyme/Pljwm/src/Enzyme.jl:309 [inlined]
 [23] autodiff
    @ ~/.julia/packages/Enzyme/Pljwm/src/Enzyme.jl:326 [inlined]
 [24] gradient_loss_function(model::Lux.Chain{@NamedTuple{layer_1::Lux.Dense{true, typeof(tanh_fast), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_2::Lux.Dense{true, typeof(identity), typeof(glorot_uniform), typeof(WeightInitializers.zeros32)}, layer_3::WrappedFunction{:direct_call, typeof(softmax)}}, Nothing}, x::Matrix{Float32}, y::OneHotMatrix{UInt32, Vector{UInt32}}, ps::@NamedTuple{layer_1::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_2::@NamedTuple{weight::Matrix{Float32}, bias::Matrix{Float32}}, layer_3::@NamedTuple{}}, st::@NamedTuple{layer_1::@NamedTuple{}, layer_2::@NamedTuple{}, layer_3::@NamedTuple{}})
    @ Main ~/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:65
 [25] top-level scope
    @ ~/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:78
 [26] include(fname::String)
    @ Main ./sysimg.jl:38
 [27] top-level scope
    @ ~/work/Reactant.jl/Reactant.jl/test/runtests.jl:49
 [28] include(fname::String)
    @ Main ./sysimg.jl:38
 [29] top-level scope
    @ none:6
in expression starting at /home/runner/work/Reactant.jl/Reactant.jl/test/nn_lux.jl:78
in expression starting at /home/runner/work/Reactant.jl/Reactant.jl/test/runtests.jl:49
Package Reactant errored during testing

Same here with the latest release candidate of Julia 1.11:

julia> using Enzyme #v0.12.26

julia> function gradByEnzyme(f, inVal)
           dp = zero(inVal)
           Enzyme.autodiff(Reverse, f, Active, Duplicated(inVal, dp))
           dp
       end
gradByEnzyme (generic function with 1 method)

julia> gradByEnzyme(x->sum(x .^ 2), [1., 2., 3.])
ERROR: UndefVarError: `PassBuilder` not defined in `Enzyme.Compiler`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
  [1] macro expansion
    @ C:\Users\frank\.julia\packages\LLVM\5DlHM\src\base.jl:96 [inlined]
  [2] (::Enzyme.Compiler.var"#prop_julia_addr#28202"{LLVM.TargetMachine})(f::LLVM.Function)
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler\optimize.jl:75
  [3] function_pass_callback(ptr::Ptr{Nothing}, data::Ptr{Nothing})
    @ LLVM C:\Users\frank\.julia\packages\LLVM\5DlHM\src\pass.jl:49
  [4] LLVMRunPassManager
    @ C:\Users\frank\.julia\packages\LLVM\5DlHM\lib\16\libLLVM.jl:3351 [inlined]
  [5] run!
    @ C:\Users\frank\.julia\packages\LLVM\5DlHM\src\passmanager.jl:39 [inlined]
  [6] (::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine})(pm::LLVM.ModulePassManager)
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler\optimize.jl:2033
  [7] LLVM.ModulePassManager(::Enzyme.Compiler.var"#28298#28299"{LLVM.Module, LLVM.TargetMachine}; kwargs::@Kwargs{})
    @ LLVM C:\Users\frank\.julia\packages\LLVM\5DlHM\src\passmanager.jl:33
  [8] ModulePassManager
    @ C:\Users\frank\.julia\packages\LLVM\5DlHM\src\passmanager.jl:30 [inlined]
  [9] optimize!(mod::LLVM.Module, tm::LLVM.TargetMachine)
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler\optimize.jl:1955
 [10] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:5968
 [11] codegen
    @ C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:5371 [inlined]
 [12] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:6871
 [13] _thunk
    @ C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:6871 [inlined]
 [14] cached_compilation
    @ C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:6909 [inlined]
 [15] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{…}, ::Type{…}, ::Type{…}, tt::Type{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Type{…})
    @ Enzyme.Compiler C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:6982
 [16] #s2043#28415
    @ C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\compiler.jl:7034 [inlined]
 [17]
    @ Enzyme.Compiler .\none:0
 [18] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core .\boot.jl:706
 [19] autodiff(::ReverseMode{false, FFIABI, false}, f::Const{var"#1#2"}, ::Type{Active}, args::Duplicated{Vector{Float64}})
    @ Enzyme C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\Enzyme.jl:309
 [20] autodiff
    @ C:\Users\frank\.julia\packages\Enzyme\r8mFE\src\Enzyme.jl:326 [inlined]
 [21] gradByEnzyme(f::Function, inVal::Vector{Float64})
    @ Main .\REPL[2]:3
 [22] top-level scope
    @ REPL[3]:1
Some type information was truncated. Use `show(err)` to see complete types.

System info:

Julia Version 1.11.0-rc2
Commit 34c3a63147 (2024-07-29 06:24 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 18 × 12th Gen Intel(R) Core(TM) i9-12900HK
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
Threads: 1 default, 0 interactive, 1 GC (on 18 virtual cores)

@wsmoses
Member

wsmoses commented Aug 4, 2024

The present issue is that LLVM.jl dropped support for APIs that we need in order to support 1.11.

x/ref maleadt/LLVM.jl#435

cc @frankwswang @avik-pal @mofeing @vchuravy

@gdalle
Contributor Author

gdalle commented Aug 29, 2024

I was wondering if there are any updates on 1.11 support, as the release draws nearer?

@wsmoses
Member

wsmoses commented Aug 29, 2024

Various codes work now, and there are no precompilation failures, but not everything works yet. Specifically, support for the new gc_loaded intrinsic needs to be added, but I don't understand its semantics yet and need help from @gbaraldi and/or @vtjnash to add it.

If you understand the meaning of it well enough to explain it and/or support it, be my guest! But since it's a GC related thing and I don't want to accidentally cause segfaults, it remains as an error atm.

@gdalle
Contributor Author

gdalle commented Aug 29, 2024

I don't know the first thing about that, just wanted to check if I could re-activate DI tests for Enzyme on v1.11. Guess I'll give it a try!

@avik-pal
Contributor

avik-pal commented Oct 8, 2024

With 1.11 released, I am seeing quite a few GC failures: https://buildkite.com/julialang/lux-dot-jl/builds/4429#019269e2-95b3-4504-8f1e-48deeadeeaee/322-701. I am assuming this is the unsupported feature you mentioned?

@wsmoses
Member

wsmoses commented Oct 8, 2024

Yeah, we added some support for the new gc_loaded intrinsic, but we clearly need to continue the conversation with @gbaraldi and @vchuravy about its semantics in order to support it fully.

@gdalle
Contributor Author

gdalle commented Oct 10, 2024

Here's a simple MWE distilled from #1951, in case it helps:

using Enzyme
f(x) = sum(x .* transpose(x))
Enzyme.gradient(Enzyme.Reverse, f, [1.0])
julia> Enzyme.gradient(Enzyme.Reverse, f, [1.0])
ERROR: Enzyme compilation failed.
Current scope: 
; Function Attrs: mustprogress willreturn
define "enzyme_type"="{[-1]:Float@double}" double @preprocess_julia_f_47698({} addrspace(10)* noundef nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" "enzymejl_parmtype"="126279284194864" "enzymejl_parmtype_ref"="2" %0) local_unnamed_addr #22 !dbg !459 {
top:
  %1 = alloca [2 x i64], align 8
  %2 = alloca [1 x {} addrspace(10)*], align 8
  %3 = alloca [1 x i64], align 8
  %4 = alloca [1 x i64], align 8
  %5 = alloca [1 x i64], align 8
  %6 = alloca [1 x i64], align 8
  %7 = alloca [1 x i64], align 8
  %pgcstack = call {}*** @julia.get_pgcstack() #27
  %current_task1227 = getelementptr inbounds {}**, {}*** %pgcstack, i64 -14
  %8 = bitcast {}*** %current_task1227 to {}*
  %ptls_field228 = getelementptr inbounds {}**, {}*** %pgcstack, i64 2
  %9 = bitcast {}*** %ptls_field228 to i64***
  %ptls_load229230 = load i64**, i64*** %9, align 8, !tbaa !18
  %10 = getelementptr inbounds i64*, i64** %ptls_load229230, i64 2
  %safepoint = load i64*, i64** %10, align 8, !tbaa !22
  fence syncscope("singlethread") seq_cst
  call void @julia.safepoint(i64* %safepoint) #27, !dbg !460
  fence syncscope("singlethread") seq_cst
  %11 = addrspacecast {} addrspace(10)* %0 to {} addrspace(11)*, !dbg !461
  %12 = bitcast {} addrspace(10)* %0 to i8 addrspace(10)*, !dbg !461
  %13 = addrspacecast i8 addrspace(10)* %12 to i8 addrspace(11)*, !dbg !461
  %14 = getelementptr inbounds i8, i8 addrspace(11)* %13, i64 16, !dbg !461
  %15 = bitcast i8 addrspace(11)* %14 to i64 addrspace(11)*, !dbg !461
  %16 = load i64, i64 addrspace(11)* %15, align 8, !dbg !461, !tbaa !25, !alias.scope !67, !noalias !68
  %17 = call { i64, i1 } @llvm.smul.with.overflow.i64(i64 %16, i64 %16) #27, !dbg !466
  %18 = extractvalue { i64, i1 } %17, 0, !dbg !466
  %19 = extractvalue { i64, i1 } %17, 1, !dbg !466
  %20 = icmp ugt i64 %16, 9223372036854775806, !dbg !476
  %21 = or i1 %20, %19, !dbg !476
  br i1 %21, label %L49, label %L53, !dbg !477

L49:                                              ; preds = %top
  %22 = call noalias nonnull align 8 dereferenceable(8) "enzyme_inactive" "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 8, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279338028512 to {}*) to {} addrspace(10)*)) #28, !dbg !477
  %23 = bitcast {} addrspace(10)* %22 to [1 x {} addrspace(10)*] addrspace(10)*, !dbg !477, !enzyme_inactive !0
  %24 = getelementptr [1 x {} addrspace(10)*], [1 x {} addrspace(10)*] addrspace(10)* %23, i64 0, i64 0, !dbg !477
  store {} addrspace(10)* addrspacecast ({}* inttoptr (i64 126279346398928 to {}*) to {} addrspace(10)*), {} addrspace(10)* addrspace(10)* %24, align 8, !dbg !477, !tbaa !41, !alias.scope !45, !noalias !478
  %25 = addrspacecast {} addrspace(10)* %22 to {} addrspace(12)*, !dbg !477, !enzyme_inactive !0
  call void @ijl_throw({} addrspace(12)* %25) #29, !dbg !477
  unreachable, !dbg !477

L53:                                              ; preds = %top
  %.not = icmp eq i64 %18, 0, !dbg !481
  br i1 %.not, label %L55, label %L57, !dbg !481

L55:                                              ; preds = %L53
  %26 = load atomic {} addrspace(10)*, {} addrspace(10)** inttoptr (i64 126279284195088 to {} addrspace(10)**) unordered, align 16, !dbg !483, !tbaa !22, !alias.scope !39, !noalias !40
  %.not231 = icmp eq {} addrspace(10)* %26, null, !dbg !483
  br i1 %.not231, label %fail, label %L59, !dbg !483

L57:                                              ; preds = %L53
  %27 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284195056 to {}*) to {} addrspace(10)*), i64 %18) #27, !dbg !484
  br label %L59, !dbg !484

L59:                                              ; preds = %L57, %L55
  %value_phi8 = phi {} addrspace(10)* [ %27, %L57 ], [ %26, %L55 ]
  %28 = bitcast {} addrspace(10)* %value_phi8 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !485
  %29 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %28 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !485
  %30 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %29, i64 0, i32 1, !dbg !485
  %31 = bitcast {} addrspace(10)** addrspace(11)* %30 to i8* addrspace(11)*, !dbg !485
  %32 = load i8*, i8* addrspace(11)* %31, align 8, !dbg !485, !tbaa !25, !alias.scope !96, !noalias !97, !nonnull !0
  %33 = call noalias nonnull align 8 dereferenceable(32) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 32, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279294399424 to {}*) to {} addrspace(10)*)) #28, !dbg !468
  %34 = bitcast {} addrspace(10)* %33 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !468
  %35 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %34 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !468
  %.repack = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %35, i64 0, i32 0, !dbg !468
  store i8* %32, i8* addrspace(11)* %.repack, align 8, !dbg !468, !tbaa !98, !alias.scope !101, !noalias !486
  %.repack232 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %35, i64 0, i32 1, !dbg !468
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(11)* %.repack232, align 8, !dbg !468, !tbaa !98, !alias.scope !101, !noalias !486
  %36 = bitcast {} addrspace(10)* %33 to i8 addrspace(10)*, !dbg !468
  %37 = addrspacecast i8 addrspace(10)* %36 to i8 addrspace(11)*, !dbg !468
  %38 = getelementptr inbounds i8, i8 addrspace(11)* %37, i64 16, !dbg !468
  %.sroa.0167.0..sroa_idx = bitcast i8 addrspace(11)* %38 to i64 addrspace(11)*, !dbg !468
  store i64 %16, i64 addrspace(11)* %.sroa.0167.0..sroa_idx, align 8, !dbg !468, !tbaa !25, !alias.scope !103, !noalias !487
  %.sroa.2168.0..sroa_idx169 = getelementptr inbounds i8, i8 addrspace(11)* %37, i64 24, !dbg !468
  %39 = bitcast i8 addrspace(11)* %.sroa.2168.0..sroa_idx169 to i64 addrspace(11)*, !dbg !468
  store i64 %16, i64 addrspace(11)* %39, align 8, !dbg !468, !tbaa !25, !alias.scope !103, !noalias !487
  %40 = addrspacecast {} addrspace(10)* %value_phi8 to {} addrspace(11)*, !dbg !488
  %41 = bitcast {} addrspace(10)* %value_phi8 to i64 addrspace(10)*, !dbg !498
  %42 = addrspacecast i64 addrspace(10)* %41 to i64 addrspace(11)*, !dbg !498
  %43 = load i64, i64 addrspace(11)* %42, align 8, !dbg !498, !tbaa !126, !alias.scope !101, !noalias !128
  %44 = icmp eq i64 %43, 0, !dbg !498
  %45 = icmp eq i64 %16, 0
  %or.cond = select i1 %44, i1 true, i1 %45, !dbg !490
  %46 = bitcast i8 addrspace(10)* %12 to i64 addrspace(10)*, !dbg !490
  br i1 %or.cond, label %L152, label %L102, !dbg !490

L102:                                             ; preds = %L59
  %47 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %30, align 8, !dbg !499, !tbaa !132, !alias.scope !101, !noalias !128, !nonnull !0
  %48 = bitcast {} addrspace(10)* %value_phi8 to {} addrspace(10)* addrspace(10)*, !dbg !499
  %49 = addrspacecast {} addrspace(10)* addrspace(10)* %48 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %50 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %49, i64 2, !dbg !499
  %51 = addrspacecast {} addrspace(10)** %47 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %.not234 = icmp eq {} addrspace(10)* addrspace(11)* %50, %51, !dbg !499
  br i1 %.not234, label %guard_exit, label %guard_pass, !dbg !499

L138:                                             ; preds = %guard_exit15
  %52 = bitcast {} addrspace(10)* %0 to i8* addrspace(10)*, !dbg !501
  %53 = addrspacecast i8* addrspace(10)* %52 to i8* addrspace(11)*, !dbg !501
  %54 = load i8*, i8* addrspace(11)* %53, align 8, !dbg !501, !tbaa !98, !alias.scope !101, !noalias !128
  %55 = ptrtoint i8* %54 to i64, !dbg !504
  %56 = load atomic {} addrspace(10)* ({} addrspace(10)*, i64, i64)*, {} addrspace(10)* ({} addrspace(10)*, i64, i64)** bitcast (void ()** @jlplt_jl_genericmemory_copy_slice_47719_got to {} addrspace(10)* ({} addrspace(10)*, i64, i64)**) unordered, align 8, !dbg !504
  %57 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* %56({} addrspace(10)* %253, i64 %55, i64 %16) #27, !dbg !504
  %58 = bitcast {} addrspace(10)* %57 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !505
  %59 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %58 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !505
  %60 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %59, i64 0, i32 1, !dbg !505
  %61 = bitcast {} addrspace(10)** addrspace(11)* %60 to i8* addrspace(11)*, !dbg !505
  %62 = load i8*, i8* addrspace(11)* %61, align 8, !dbg !505, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %63 = load i64, i64 addrspace(11)* %15, align 8, !dbg !507, !tbaa !145, !alias.scope !101, !noalias !128
  %64 = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194864 to {}*) to {} addrspace(10)*)) #28, !dbg !506
  %65 = addrspacecast {} addrspace(10)* %64 to {} addrspace(11)*, !dbg !506
  %66 = bitcast {} addrspace(10)* %64 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !506
  %67 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %66 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !506
  %.repack240 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %67, i64 0, i32 0, !dbg !506
  store i8* %62, i8* addrspace(11)* %.repack240, align 8, !dbg !506, !tbaa !98, !alias.scope !101, !noalias !486
  %.repack241 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %67, i64 0, i32 1, !dbg !506
  store {} addrspace(10)* %57, {} addrspace(10)* addrspace(11)* %.repack241, align 8, !dbg !506, !tbaa !98, !alias.scope !101, !noalias !486
  %68 = bitcast {} addrspace(10)* %64 to i8 addrspace(10)*, !dbg !506
  %69 = addrspacecast i8 addrspace(10)* %68 to i8 addrspace(11)*, !dbg !506
  %70 = getelementptr inbounds i8, i8 addrspace(11)* %69, i64 16, !dbg !506
  %71 = bitcast i8 addrspace(11)* %70 to i64 addrspace(11)*, !dbg !506
  store i64 %63, i64 addrspace(11)* %71, align 8, !dbg !506, !tbaa !145, !alias.scope !101, !noalias !486
  %.pre203 = load i64, i64 addrspace(11)* %42, align 8, !dbg !508, !tbaa !126, !alias.scope !101, !noalias !128
  %72 = bitcast i8 addrspace(10)* %68 to i64 addrspace(10)*, !dbg !506
  br label %L152, !dbg !506

L152:                                             ; preds = %guard_exit15, %L138, %L59
  %nodecayed..pre-phi = phi i64 addrspace(10)* [ %277, %guard_exit15 ], [ %46, %L59 ], [ %72, %L138 ], !dbg !515
  %nodecayed..pre-phi214 = phi {} addrspace(10)* [ %0, %guard_exit15 ], [ %0, %L59 ], [ %64, %L138 ], !dbg !515
  %73 = phi i64 [ %43, %guard_exit15 ], [ %43, %L59 ], [ %.pre203, %L138 ], !dbg !508
  %74 = phi i64 [ %16, %guard_exit15 ], [ %16, %L59 ], [ %63, %L138 ], !dbg !515
  %75 = addrspacecast i64 addrspace(10)* %nodecayed..pre-phi to i64 addrspace(11)*, !dbg !519
  %76 = bitcast i64 addrspace(11)* %75 to i8 addrspace(11)*, !dbg !519
  %77 = getelementptr i8, i8 addrspace(11)* %76, i64 16, !dbg !519
  %78 = bitcast i8 addrspace(11)* %77 to i64 addrspace(11)*, !dbg !519
  %79 = addrspacecast {} addrspace(10)* %nodecayed..pre-phi214 to {} addrspace(11)*, !dbg !519
  %.not702 = icmp eq i64 %74, 1, !dbg !519
  %80 = icmp eq i64 %73, 0, !dbg !508
  %81 = icmp eq i64 %74, 0
  %or.cond301 = select i1 %80, i1 true, i1 %81, !dbg !510
  br i1 %or.cond301, label %L227, label %L172, !dbg !510

L172:                                             ; preds = %L152
  %82 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %30, align 8, !dbg !523, !tbaa !132, !alias.scope !101, !noalias !128, !nonnull !0
  %83 = bitcast {} addrspace(10)* %value_phi8 to {} addrspace(10)* addrspace(10)*, !dbg !523
  %84 = addrspacecast {} addrspace(10)* addrspace(10)* %83 to {} addrspace(10)* addrspace(11)*, !dbg !523
  %85 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %84, i64 2, !dbg !523
  %86 = addrspacecast {} addrspace(10)** %82 to {} addrspace(10)* addrspace(11)*, !dbg !523
  %.not243 = icmp eq {} addrspace(10)* addrspace(11)* %85, %86, !dbg !523
  br i1 %.not243, label %guard_exit23, label %guard_pass22, !dbg !523

L209:                                             ; preds = %guard_exit28
  %87 = bitcast {} addrspace(10)* %0 to i8* addrspace(10)*, !dbg !525
  %88 = addrspacecast i8* addrspace(10)* %87 to i8* addrspace(11)*, !dbg !525
  %89 = load i8*, i8* addrspace(11)* %88, align 8, !dbg !525, !tbaa !98, !alias.scope !101, !noalias !128
  %90 = ptrtoint i8* %89 to i64, !dbg !529
  %91 = load atomic {} addrspace(10)* ({} addrspace(10)*, i64, i64)*, {} addrspace(10)* ({} addrspace(10)*, i64, i64)** bitcast (void ()** @jlplt_jl_genericmemory_copy_slice_47719_got to {} addrspace(10)* ({} addrspace(10)*, i64, i64)**) unordered, align 8, !dbg !529
  %92 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* %91({} addrspace(10)* %294, i64 %90, i64 %74) #27, !dbg !529
  %93 = bitcast {} addrspace(10)* %92 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !530
  %94 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %93 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !530
  %95 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %94, i64 0, i32 1, !dbg !530
  %96 = bitcast {} addrspace(10)** addrspace(11)* %95 to i8* addrspace(11)*, !dbg !530
  %97 = load i8*, i8* addrspace(11)* %96, align 8, !dbg !530, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %98 = load i64, i64 addrspace(11)* %15, align 8, !dbg !532, !tbaa !145, !alias.scope !101, !noalias !128
  %99 = call noalias nonnull align 8 dereferenceable(24) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 24, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194864 to {}*) to {} addrspace(10)*)) #28, !dbg !531
  %100 = addrspacecast {} addrspace(10)* %99 to {} addrspace(11)*, !dbg !531
  %101 = bitcast {} addrspace(10)* %99 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !531
  %102 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %101 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !531
  %.repack249 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %102, i64 0, i32 0, !dbg !531
  store i8* %97, i8* addrspace(11)* %.repack249, align 8, !dbg !531, !tbaa !98, !alias.scope !101, !noalias !486
  %.repack250 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %102, i64 0, i32 1, !dbg !531
  store {} addrspace(10)* %92, {} addrspace(10)* addrspace(11)* %.repack250, align 8, !dbg !531, !tbaa !98, !alias.scope !101, !noalias !486
  %103 = bitcast {} addrspace(10)* %99 to i8 addrspace(10)*, !dbg !531
  %104 = addrspacecast i8 addrspace(10)* %103 to i8 addrspace(11)*, !dbg !531
  %105 = getelementptr inbounds i8, i8 addrspace(11)* %104, i64 16, !dbg !531
  %106 = bitcast i8 addrspace(11)* %105 to i64 addrspace(11)*, !dbg !531
  store i64 %98, i64 addrspace(11)* %106, align 8, !dbg !531, !tbaa !145, !alias.scope !101, !noalias !486
  br label %L227, !dbg !533

L227:                                             ; preds = %guard_exit28, %L209, %L152
  %nodecayed..pre-phi217 = phi {} addrspace(10)* [ %99, %L209 ], [ %0, %L152 ], [ %0, %guard_exit28 ], !dbg !534
  %107 = phi i64 [ %98, %L209 ], [ %74, %L152 ], [ %74, %guard_exit28 ], !dbg !534
  %108 = addrspacecast {} addrspace(10)* %nodecayed..pre-phi217 to {} addrspace(11)*, !dbg !539
  %.not703 = icmp eq i64 %107, 1, !dbg !539
  br i1 %45, label %L451, label %L252.preheader, !dbg !544

L252.preheader:                                   ; preds = %L227
  %109 = bitcast {} addrspace(11)* %108 to { i8*, {} addrspace(10)* } addrspace(11)*
  %110 = bitcast {} addrspace(11)* %108 to i8* addrspace(11)*
  %111 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %109, i64 0, i32 1
  %112 = bitcast i8* %32 to double*
  %113 = bitcast i8* %32 to {} addrspace(10)**
  %114 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %value_phi8, {} addrspace(10)** noundef %113) #27, !dbg !546
  %115 = load i64, i64 addrspace(11)* %78, align 8, !tbaa !145, !alias.scope !101, !noalias !128
  %.not257.peel.not = icmp eq i64 %115, 0
  %exitcond629.peel.not = icmp eq i64 %16, 1
  br i1 %.not257.peel.not, label %L316, label %L252.preheader.split, !dbg !547

L252.preheader.split:                             ; preds = %L252.preheader
  %116 = bitcast {} addrspace(11)* %79 to { i8*, {} addrspace(10)* } addrspace(11)*
  %117 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %116, i64 0, i32 1
  %118 = bitcast {} addrspace(11)* %79 to i8* addrspace(11)*
  %.pre913 = load i8*, i8* addrspace(11)* %118, align 8, !dbg !554, !tbaa !98, !alias.scope !101, !noalias !128
  %.pre914 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %117, align 8, !dbg !554, !tbaa !98, !alias.scope !101, !noalias !128
  %.phi.trans.insert = bitcast {} addrspace(10)* %.pre914 to i64 addrspace(10)*
  %.phi.trans.insert915 = addrspacecast i64 addrspace(10)* %.phi.trans.insert to i64 addrspace(11)*
  %.pre916 = load i64, i64 addrspace(11)* %.phi.trans.insert915, align 8, !dbg !554, !tbaa !126, !range !211, !alias.scope !101, !noalias !128
  %.phi.trans.insert917 = bitcast {} addrspace(10)* %.pre914 to { i64, {} addrspace(10)** } addrspace(10)*
  %.phi.trans.insert918 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %.phi.trans.insert917 to { i64, {} addrspace(10)** } addrspace(11)*
  %.phi.trans.insert919 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %.phi.trans.insert918, i64 0, i32 1
  %.phi.trans.insert920 = bitcast {} addrspace(10)** addrspace(11)* %.phi.trans.insert919 to i8* addrspace(11)*
  %.pre921 = load i8*, i8* addrspace(11)* %.phi.trans.insert920, align 8, !dbg !554, !tbaa !22, !alias.scope !39, !noalias !40
  br label %L252, !dbg !555

L252:                                             ; preds = %L424, %L252.preheader.split
  %iv = phi i64 [ %iv.next, %L424 ], [ 0, %L252.preheader.split ]
  %iv.next = add nuw nsw i64 %iv, 1
  %119 = add i64 %iv.next, -1
  %120 = select i1 %.not703, i64 1, i64 %iv.next
  %121 = add i64 %120, -1
  %122 = mul i64 %119, %16
  %123 = shl nuw nsw i64 %.pre916, 1, !dbg !554
  %.not258.peel = icmp ne i64 %.pre916, 0, !dbg !554
  %124 = bitcast i8* %.pre913 to double*, !dbg !554
  %125 = ptrtoint i8* %.pre921 to i64, !dbg !554
  %126 = ptrtoint i8* %.pre913 to i64, !dbg !554
  %127 = sub i64 %126, %125, !dbg !554
  %128 = shl nuw nsw i64 %.pre916, 3, !dbg !554
  %129 = icmp ult i64 %127, %128, !dbg !554
  %130 = and i1 %.not258.peel, %129, !dbg !554
  br i1 %130, label %load.peel, label %oob.loopexit1, !dbg !554

load.peel:                                        ; preds = %L252
  %131 = bitcast i8* %.pre913 to {} addrspace(10)**, !dbg !554
  %132 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %.pre914, {} addrspace(10)** noundef %131) #27, !dbg !554
  %133 = bitcast {} addrspace(10)* addrspace(13)* %132 to double addrspace(13)*, !dbg !554
  %134 = load double, double addrspace(13)* %133, align 8, !dbg !554, !tbaa !213, !alias.scope !45, !noalias !215
  %.not262.peel = icmp ult i64 %121, %107, !dbg !556
  br i1 %.not262.peel, label %L365.peel, label %L343, !dbg !561

L365.peel:                                        ; preds = %load.peel
  %135 = load i8*, i8* addrspace(11)* %110, align 8, !dbg !566, !tbaa !98, !alias.scope !101, !noalias !128
  %136 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %111, align 8, !dbg !566, !tbaa !98, !alias.scope !101, !noalias !128, !dereferenceable_or_null !237, !align !238
  %137 = bitcast {} addrspace(10)* %136 to i64 addrspace(10)*, !dbg !566
  %138 = addrspacecast i64 addrspace(10)* %137 to i64 addrspace(11)*, !dbg !566
  %139 = load i64, i64 addrspace(11)* %138, align 8, !dbg !566, !tbaa !126, !range !211, !alias.scope !101, !noalias !128
  %140 = shl nuw nsw i64 %139, 1, !dbg !566
  %141 = add i64 %139, %121, !dbg !566
  %.not264.peel = icmp ult i64 %141, %140, !dbg !566
  %142 = bitcast i8* %135 to double*, !dbg !566
  %143 = getelementptr inbounds double, double* %142, i64 %121, !dbg !566
  %144 = bitcast {} addrspace(10)* %136 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !566
  %145 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %144 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !566
  %146 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %145, i64 0, i32 1, !dbg !566
  %147 = bitcast {} addrspace(10)** addrspace(11)* %146 to i8* addrspace(11)*, !dbg !566
  %148 = load i8*, i8* addrspace(11)* %147, align 8, !dbg !566, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %149 = ptrtoint i8* %148 to i64, !dbg !566
  %150 = ptrtoint double* %143 to i64, !dbg !566
  %151 = sub i64 %150, %149, !dbg !566
  %152 = shl nuw nsw i64 %139, 3, !dbg !566
  %153 = icmp ult i64 %151, %152, !dbg !566
  %154 = and i1 %.not264.peel, %153, !dbg !566
  br i1 %154, label %idxend50.peel, label %oob48, !dbg !566

idxend50.peel:                                    ; preds = %L365.peel
  %155 = icmp eq i64 %139, 0, !dbg !566
  br i1 %155, label %oob51, label %load52.peel, !dbg !566

load52.peel:                                      ; preds = %idxend50.peel
  %156 = bitcast i8* %135 to {} addrspace(10)**, !dbg !566
  %157 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %136, {} addrspace(10)** noundef %156) #27, !dbg !566
  %158 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %157, i64 %121, !dbg !566
  %159 = bitcast {} addrspace(10)* addrspace(13)* %158 to double addrspace(13)*, !dbg !566
  %160 = load double, double addrspace(13)* %159, align 8, !dbg !566, !tbaa !213, !alias.scope !45, !noalias !215
  %161 = fmul double %134, %160, !dbg !569
  %162 = load i64, i64 addrspace(11)* %42, align 8, !dbg !572, !tbaa !126, !range !211, !alias.scope !101, !noalias !128
  %163 = shl nuw nsw i64 %162, 1, !dbg !572
  %164 = add i64 %162, %122, !dbg !572
  %.not268.peel = icmp ult i64 %164, %163, !dbg !572
  %165 = getelementptr inbounds double, double* %112, i64 %122, !dbg !572
  %166 = load i8*, i8* addrspace(11)* %31, align 8, !dbg !572, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %167 = ptrtoint i8* %166 to i64, !dbg !572
  %168 = ptrtoint double* %165 to i64, !dbg !572
  %169 = sub i64 %168, %167, !dbg !572
  %170 = shl nuw nsw i64 %162, 3, !dbg !572
  %171 = icmp ult i64 %169, %170, !dbg !572
  %172 = and i1 %.not268.peel, %171, !dbg !572
  br i1 %172, label %idxend55.peel, label %oob53.loopexit2, !dbg !572

idxend55.peel:                                    ; preds = %load52.peel
  %173 = icmp eq i64 %162, 0, !dbg !572
  br i1 %173, label %oob56, label %load57.peel, !dbg !572

load57.peel:                                      ; preds = %idxend55.peel
  %174 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %114, i64 %122, !dbg !572
  %175 = bitcast {} addrspace(10)* addrspace(13)* %174 to double addrspace(13)*, !dbg !572
  store double %161, double addrspace(13)* %175, align 8, !dbg !572, !tbaa !213, !alias.scope !45, !noalias !478
  br i1 %exitcond629.peel.not, label %L424, label %L304.preheader, !dbg !574

L304.preheader:                                   ; preds = %load57.peel
  br label %L304, !dbg !547

L304:                                             ; preds = %L304.preheader, %load57
  %iv3 = phi i64 [ 0, %L304.preheader ], [ %iv.next4, %load57 ]
  %iv.next4 = add nuw nsw i64 %iv3, 1, !dbg !575
  %176 = add nuw nsw i64 %iv.next4, 1, !dbg !575
  %177 = select i1 %.not702, i64 1, i64 %176, !dbg !578
  %178 = add nsw i64 %177, -1, !dbg !547
  %.not257 = icmp ult i64 %178, %115, !dbg !547
  br i1 %.not257, label %L319, label %L316.loopexit, !dbg !547

L316.loopexit:                                    ; preds = %L304
  br label %L316, !dbg !547

L316:                                             ; preds = %L316.loopexit, %L252.preheader
  %.lcssa531 = phi i64 [ 1, %L252.preheader ], [ %177, %L316.loopexit ], !dbg !578
  %179 = getelementptr inbounds [1 x i64], [1 x i64]* %3, i64 0, i64 0, !dbg !547
  store i64 %.lcssa531, i64* %179, align 8, !dbg !547, !tbaa !262, !alias.scope !264, !noalias !581
  %180 = addrspacecast [1 x i64]* %3 to [1 x i64] addrspace(11)*, !dbg !547
  call fastcc void @julia_throw_boundserror_47790({} addrspace(10)* nofree noundef nonnull align 8 dereferenceable(24) %nodecayed..pre-phi214, [1 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %180) #29, !dbg !547
  unreachable, !dbg !547

L319:                                             ; preds = %L304
  %181 = add i64 %178, %.pre916, !dbg !554
  %.not258 = icmp ult i64 %181, %123, !dbg !554
  %182 = getelementptr inbounds double, double* %124, i64 %178, !dbg !554
  %183 = ptrtoint double* %182 to i64, !dbg !554
  %184 = sub i64 %183, %125, !dbg !554
  %185 = icmp ult i64 %184, %128, !dbg !554
  %186 = and i1 %.not258, %185, !dbg !554
  br i1 %186, label %load52, label %oob.loopexit, !dbg !554

L343:                                             ; preds = %load.peel
  %187 = getelementptr inbounds [2 x i64], [2 x i64]* %1, i64 0, i64 0
  %188 = getelementptr inbounds [2 x i64], [2 x i64]* %1, i64 0, i64 1
  store i64 1, i64* %187, align 8, !dbg !582, !tbaa !262, !alias.scope !264, !noalias !581
  store i64 %120, i64* %188, align 8, !dbg !582, !tbaa !262, !alias.scope !264, !noalias !581
  %value_phi37.fca.0.gep = getelementptr inbounds [1 x {} addrspace(10)*], [1 x {} addrspace(10)*]* %2, i64 0, i64 0, !dbg !561
  store {} addrspace(10)* %nodecayed..pre-phi217, {} addrspace(10)** %value_phi37.fca.0.gep, align 8, !dbg !561, !noalias !583
  %189 = addrspacecast [1 x {} addrspace(10)*]* %2 to [1 x {} addrspace(10)*] addrspace(11)*, !dbg !561
  %190 = addrspacecast [2 x i64]* %1 to [2 x i64] addrspace(11)*, !dbg !561
  call fastcc void @julia_throw_boundserror_47785([1 x {} addrspace(10)*] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %189, [2 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(16) %190) #29, !dbg !561
  unreachable, !dbg !561

L424.loopexit:                                    ; preds = %load57
  br label %L424, !dbg !584

L424:                                             ; preds = %L424.loopexit, %load57.peel
  %191 = add i64 %iv.next, 1, !dbg !584
  %192 = icmp eq i64 %iv.next, %16, !dbg !588
  br i1 %192, label %L451.loopexit, label %L252, !dbg !587

L451.loopexit:                                    ; preds = %L424
  br label %L451, !dbg !591

L451:                                             ; preds = %L451.loopexit, %L227
  switch i64 %18, label %L483 [
    i64 0, label %L562
    i64 1, label %L466
  ], !dbg !591

L466:                                             ; preds = %L451
  %193 = load i64, i64 addrspace(11)* %42, align 8, !dbg !601, !tbaa !126, !alias.scope !101, !noalias !128
  %.not274 = icmp eq i64 %193, 0, !dbg !601
  br i1 %.not274, label %L475, label %L478, !dbg !601

L475:                                             ; preds = %L466
  %194 = getelementptr inbounds [1 x i64], [1 x i64]* %4, i64 0, i64 0, !dbg !601
  store i64 1, i64* %194, align 8, !dbg !601, !tbaa !262, !alias.scope !264, !noalias !581
  %195 = addrspacecast [1 x i64]* %4 to [1 x i64] addrspace(11)*, !dbg !601
  call fastcc void @julia_throw_boundserror_47773({} addrspace(10)* nofree noundef nonnull align 8 dereferenceable(32) %33, [1 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %195) #29, !dbg !601
  unreachable, !dbg !601

L478:                                             ; preds = %L466
  %.not275 = icmp sgt i64 %193, 0, !dbg !603
  %196 = load i8*, i8* addrspace(11)* %31, align 8, !dbg !603, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %197 = ptrtoint i8* %196 to i64, !dbg !603
  %198 = ptrtoint i8* %32 to i64, !dbg !603
  %199 = sub i64 %198, %197, !dbg !603
  %200 = shl nuw nsw i64 %193, 3, !dbg !603
  %201 = icmp ult i64 %199, %200, !dbg !603
  %202 = and i1 %.not275, %201, !dbg !603
  br i1 %202, label %load73, label %oob69, !dbg !603

L483:                                             ; preds = %L451
  %203 = icmp sgt i64 %18, 15, !dbg !604
  br i1 %203, label %L549, label %L487, !dbg !605

L487:                                             ; preds = %L483
  %204 = load i64, i64 addrspace(11)* %42, align 8, !dbg !606, !tbaa !126, !alias.scope !101, !noalias !128
  %.not279 = icmp eq i64 %204, 0, !dbg !606
  br i1 %.not279, label %L496, label %L499, !dbg !606

L496:                                             ; preds = %L487
  %205 = getelementptr inbounds [1 x i64], [1 x i64]* %7, i64 0, i64 0, !dbg !606
  store i64 1, i64* %205, align 8, !dbg !606, !tbaa !262, !alias.scope !264, !noalias !581
  %206 = addrspacecast [1 x i64]* %7 to [1 x i64] addrspace(11)*, !dbg !606
  call fastcc void @julia_throw_boundserror_47773({} addrspace(10)* nofree noundef nonnull align 8 dereferenceable(32) %33, [1 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %206) #29, !dbg !606
  unreachable, !dbg !606

L499:                                             ; preds = %L487
  %207 = shl nuw i64 %204, 1, !dbg !608
  %.not280 = icmp sgt i64 %204, 0, !dbg !608
  %208 = bitcast i8* %32 to double*, !dbg !608
  %209 = load i8*, i8* addrspace(11)* %31, align 8, !dbg !608, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %210 = ptrtoint i8* %209 to i64, !dbg !608
  %211 = ptrtoint i8* %32 to i64, !dbg !608
  %212 = sub i64 %211, %210, !dbg !608
  %213 = shl nuw nsw i64 %204, 3, !dbg !608
  %214 = icmp ult i64 %212, %213, !dbg !608
  %215 = and i1 %.not280, %214, !dbg !608
  br i1 %215, label %load79, label %oob75, !dbg !608

L514:                                             ; preds = %load79
  %216 = getelementptr inbounds [1 x i64], [1 x i64]* %6, i64 0, i64 0, !dbg !609
  store i64 2, i64* %216, align 8, !dbg !609, !tbaa !262, !alias.scope !264, !noalias !581
  %217 = addrspacecast [1 x i64]* %6 to [1 x i64] addrspace(11)*, !dbg !609
  call fastcc void @julia_throw_boundserror_47773({} addrspace(10)* nofree noundef nonnull align 8 dereferenceable(32) %33, [1 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %217) #29, !dbg !609
  unreachable, !dbg !609

L517:                                             ; preds = %load79
  %218 = add nuw i64 %204, 1, !dbg !611
  %.not284 = icmp ult i64 %218, %207, !dbg !611
  %219 = getelementptr inbounds i8, i8* %32, i64 8, !dbg !611
  %220 = ptrtoint i8* %219 to i64, !dbg !611
  %221 = sub i64 %220, %210, !dbg !611
  %222 = icmp ult i64 %221, %213, !dbg !611
  %223 = and i1 %.not284, %222, !dbg !611
  br i1 %223, label %load84, label %oob80, !dbg !611

L527:                                             ; preds = %L527.preheader, %load91
  %iv5 = phi i64 [ 0, %L527.preheader ], [ %iv.next6, %load91 ]
  %value_phi85384 = phi double [ %372, %load91 ], [ %365, %L527.preheader ]
  %224 = add nuw nsw i64 %iv5, 2, !dbg !612
  %iv.next6 = add nuw nsw i64 %iv5, 1, !dbg !612
  %225 = add nuw nsw i64 %224, 1, !dbg !612
  %exitcond.not = icmp eq i64 %224, %204, !dbg !614
  br i1 %exitcond.not, label %L539, label %L542, !dbg !614

L539:                                             ; preds = %L527
  %226 = getelementptr inbounds [1 x i64], [1 x i64]* %5, i64 0, i64 0, !dbg !614
  store i64 %225, i64* %226, align 8, !dbg !614, !tbaa !262, !alias.scope !264, !noalias !581
  %227 = addrspacecast [1 x i64]* %5 to [1 x i64] addrspace(11)*, !dbg !614
  call fastcc void @julia_throw_boundserror_47773({} addrspace(10)* nofree noundef nonnull align 8 dereferenceable(32) %33, [1 x i64] addrspace(11)* nocapture nofree noundef nonnull readonly align 8 dereferenceable(8) %227) #29, !dbg !614
  unreachable, !dbg !614

L542:                                             ; preds = %L527
  %228 = add i64 %224, %204, !dbg !615
  %.not290 = icmp ult i64 %228, %207, !dbg !615
  %229 = getelementptr inbounds double, double* %208, i64 %224, !dbg !615
  %230 = ptrtoint double* %229 to i64, !dbg !615
  %231 = sub i64 %230, %210, !dbg !615
  %232 = icmp ult i64 %231, %213, !dbg !615
  %233 = and i1 %.not290, %232, !dbg !615
  br i1 %233, label %load91, label %oob87, !dbg !615

L549:                                             ; preds = %L483
  %234 = call fastcc double @julia_mapreduce_impl_47760({} addrspace(10)* noundef nonnull align 8 dereferenceable(32) %33, i64 noundef signext 1, i64 signext %18) #27, !dbg !616
  br label %L562, !dbg !618

L562.loopexit:                                    ; preds = %load91
  br label %L562, !dbg !600

L562:                                             ; preds = %L562.loopexit, %load84, %load73, %L549, %L451
  %value_phi68 = phi double [ %350, %load73 ], [ %234, %L549 ], [ 0.000000e+00, %L451 ], [ %365, %load84 ], [ %372, %L562.loopexit ]
  ret double %value_phi68, !dbg !600

fail:                                             ; preds = %L55
  %235 = load {}*, {}** @jl_undefref_exception, align 8, !dbg !483, !tbaa !22, !alias.scope !39, !noalias !40, !nonnull !0
  %236 = addrspacecast {}* %235 to {} addrspace(12)*, !dbg !483
  call void @ijl_throw({} addrspace(12)* %236) #29, !dbg !483
  unreachable, !dbg !483

guard_pass:                                       ; preds = %L102
  %237 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %29, i64 1, !dbg !499
  %238 = bitcast { i64, {} addrspace(10)** } addrspace(11)* %237 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %239 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %238, align 8, !dbg !499, !tbaa !22, !alias.scope !39, !noalias !40
  %240 = icmp eq {} addrspace(10)* %239, null, !dbg !499
  %241 = select i1 %240, {} addrspace(10)* %value_phi8, {} addrspace(10)* %239, !dbg !499
  %.pre = addrspacecast {} addrspace(10)* %241 to {} addrspace(11)*
  br label %guard_exit, !dbg !499

guard_exit:                                       ; preds = %guard_pass, %L102
  %nodecayed..pre223.pre-phi = phi {} addrspace(10)* [ %241, %guard_pass ], [ %value_phi8, %L102 ]
  %242 = addrspacecast {} addrspace(10)* %nodecayed..pre223.pre-phi to {} addrspace(11)*, !dbg !499
  %243 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @julia.typeof({} addrspace(10)* nonnull %nodecayed..pre223.pre-phi) #30, !dbg !499
  %244 = addrspacecast {} addrspace(10)* %243 to {} addrspace(11)*, !dbg !499
  %245 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %244) #31, !dbg !499
  %.not235 = icmp eq {}* %245, inttoptr (i64 126279284195056 to {}*), !dbg !499
  %spec.select = select i1 %.not235, {} addrspace(11)* %242, {} addrspace(11)* %40, !dbg !499
  %246 = bitcast {} addrspace(11)* %spec.select to i8 addrspace(11)*, !dbg !619
  %247 = getelementptr inbounds i8, i8 addrspace(11)* %246, i64 8, !dbg !619
  %248 = bitcast i8 addrspace(11)* %247 to i64 addrspace(11)*, !dbg !621
  %249 = load i64, i64 addrspace(11)* %248, align 8, !dbg !621, !tbaa !324, !alias.scope !101, !noalias !128
  %250 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !622
  %251 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %250 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !622
  %252 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %251, i64 0, i32 1, !dbg !622
  %253 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %252, align 8, !dbg !622, !tbaa !98, !alias.scope !101, !noalias !128, !dereferenceable_or_null !237, !align !238
  %254 = bitcast {} addrspace(10)* %253 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !499
  %255 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %254 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !499
  %256 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %255, i64 0, i32 1, !dbg !499
  %257 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %256, align 8, !dbg !499, !tbaa !132, !alias.scope !101, !noalias !128, !nonnull !0
  %258 = bitcast {} addrspace(10)* %253 to {} addrspace(10)* addrspace(10)*, !dbg !499
  %259 = addrspacecast {} addrspace(10)* addrspace(10)* %258 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %260 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %259, i64 2, !dbg !499
  %261 = addrspacecast {} addrspace(10)** %257 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %.not237 = icmp eq {} addrspace(10)* addrspace(11)* %260, %261, !dbg !499
  br i1 %.not237, label %guard_exit15, label %guard_pass14, !dbg !499

guard_pass14:                                     ; preds = %guard_exit
  %262 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %255, i64 1, !dbg !499
  %263 = bitcast { i64, {} addrspace(10)** } addrspace(11)* %262 to {} addrspace(10)* addrspace(11)*, !dbg !499
  %264 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %263, align 8, !dbg !499, !tbaa !22, !alias.scope !39, !noalias !40
  %265 = icmp eq {} addrspace(10)* %264, null, !dbg !499
  %266 = select i1 %265, {} addrspace(10)* %253, {} addrspace(10)* %264, !dbg !499
  br label %guard_exit15, !dbg !499

guard_exit15:                                     ; preds = %guard_pass14, %guard_exit
  %267 = phi {} addrspace(10)* [ %253, %guard_exit ], [ %266, %guard_pass14 ], !dbg !499
  %268 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @julia.typeof({} addrspace(10)* %267) #30, !dbg !499
  %269 = addrspacecast {} addrspace(10)* %268 to {} addrspace(11)*, !dbg !499
  %270 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %269) #31, !dbg !499
  %.not238 = icmp eq {}* %270, inttoptr (i64 126279284195056 to {}*), !dbg !499
  %spec.select302 = select i1 %.not238, {} addrspace(10)* %267, {} addrspace(10)* %253, !dbg !499
  %271 = bitcast {} addrspace(10)* %spec.select302 to i8 addrspace(10)*, !dbg !619
  %272 = addrspacecast i8 addrspace(10)* %271 to i8 addrspace(11)*, !dbg !619
  %273 = getelementptr inbounds i8, i8 addrspace(11)* %272, i64 8, !dbg !619
  %274 = bitcast i8 addrspace(11)* %273 to i64 addrspace(11)*, !dbg !621
  %275 = load i64, i64 addrspace(11)* %274, align 8, !dbg !621, !tbaa !324, !alias.scope !101, !noalias !128
  %276 = icmp eq i64 %249, %275, !dbg !623
  %277 = bitcast i8 addrspace(10)* %12 to i64 addrspace(10)*, !dbg !491
  br i1 %276, label %L138, label %L152, !dbg !491

guard_pass22:                                     ; preds = %L172
  %278 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %29, i64 1, !dbg !523
  %279 = bitcast { i64, {} addrspace(10)** } addrspace(11)* %278 to {} addrspace(10)* addrspace(11)*, !dbg !523
  %280 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %279, align 8, !dbg !523, !tbaa !22, !alias.scope !39, !noalias !40
  %281 = icmp eq {} addrspace(10)* %280, null, !dbg !523
  %282 = select i1 %281, {} addrspace(10)* %value_phi8, {} addrspace(10)* %280, !dbg !523
  %.pre701 = addrspacecast {} addrspace(10)* %282 to {} addrspace(11)*
  br label %guard_exit23, !dbg !523

guard_exit23:                                     ; preds = %guard_pass22, %L172
  %nodecayed..pre.pre-phi = phi {} addrspace(10)* [ %282, %guard_pass22 ], [ %value_phi8, %L172 ]
  %283 = addrspacecast {} addrspace(10)* %nodecayed..pre.pre-phi to {} addrspace(11)*, !dbg !523
  %284 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @julia.typeof({} addrspace(10)* nonnull %nodecayed..pre.pre-phi) #30, !dbg !523
  %285 = addrspacecast {} addrspace(10)* %284 to {} addrspace(11)*, !dbg !523
  %286 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %285) #31, !dbg !523
  %.not244 = icmp eq {}* %286, inttoptr (i64 126279284195056 to {}*), !dbg !523
  %spec.select303 = select i1 %.not244, {} addrspace(11)* %283, {} addrspace(11)* %40, !dbg !523
  %287 = bitcast {} addrspace(11)* %spec.select303 to i8 addrspace(11)*, !dbg !626
  %288 = getelementptr inbounds i8, i8 addrspace(11)* %287, i64 8, !dbg !626
  %289 = bitcast i8 addrspace(11)* %288 to i64 addrspace(11)*, !dbg !628
  %290 = load i64, i64 addrspace(11)* %289, align 8, !dbg !628, !tbaa !324, !alias.scope !101, !noalias !128
  %291 = bitcast {} addrspace(10)* %0 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !629
  %292 = addrspacecast { i8*, {} addrspace(10)* } addrspace(10)* %291 to { i8*, {} addrspace(10)* } addrspace(11)*, !dbg !629
  %293 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(11)* %292, i64 0, i32 1, !dbg !629
  %294 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %293, align 8, !dbg !629, !tbaa !98, !alias.scope !101, !noalias !128, !dereferenceable_or_null !237, !align !238
  %295 = bitcast {} addrspace(10)* %294 to { i64, {} addrspace(10)** } addrspace(10)*, !dbg !632
  %296 = addrspacecast { i64, {} addrspace(10)** } addrspace(10)* %295 to { i64, {} addrspace(10)** } addrspace(11)*, !dbg !632
  %297 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %296, i64 0, i32 1, !dbg !632
  %298 = load {} addrspace(10)**, {} addrspace(10)** addrspace(11)* %297, align 8, !dbg !632, !tbaa !132, !alias.scope !101, !noalias !128, !nonnull !0
  %299 = bitcast {} addrspace(10)* %294 to {} addrspace(10)* addrspace(10)*, !dbg !632
  %300 = addrspacecast {} addrspace(10)* addrspace(10)* %299 to {} addrspace(10)* addrspace(11)*, !dbg !632
  %301 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %300, i64 2, !dbg !632
  %302 = addrspacecast {} addrspace(10)** %298 to {} addrspace(10)* addrspace(11)*, !dbg !632
  %.not246 = icmp eq {} addrspace(10)* addrspace(11)* %301, %302, !dbg !632
  br i1 %.not246, label %guard_exit28, label %guard_pass27, !dbg !632

guard_pass27:                                     ; preds = %guard_exit23
  %303 = getelementptr inbounds { i64, {} addrspace(10)** }, { i64, {} addrspace(10)** } addrspace(11)* %296, i64 1, !dbg !632
  %304 = bitcast { i64, {} addrspace(10)** } addrspace(11)* %303 to {} addrspace(10)* addrspace(11)*, !dbg !632
  %305 = load {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %304, align 8, !dbg !632, !tbaa !22, !alias.scope !39, !noalias !40
  %306 = icmp eq {} addrspace(10)* %305, null, !dbg !632
  %307 = select i1 %306, {} addrspace(10)* %294, {} addrspace(10)* %305, !dbg !632
  br label %guard_exit28, !dbg !632

guard_exit28:                                     ; preds = %guard_pass27, %guard_exit23
  %308 = phi {} addrspace(10)* [ %294, %guard_exit23 ], [ %307, %guard_pass27 ], !dbg !632
  %309 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @julia.typeof({} addrspace(10)* %308) #30, !dbg !632
  %310 = addrspacecast {} addrspace(10)* %309 to {} addrspace(11)*, !dbg !632
  %311 = call nonnull {}* @julia.pointer_from_objref({} addrspace(11)* %310) #31, !dbg !632
  %.not247 = icmp eq {}* %311, inttoptr (i64 126279284195056 to {}*), !dbg !632
  %spec.select304 = select i1 %.not247, {} addrspace(10)* %308, {} addrspace(10)* %294, !dbg !632
  %312 = bitcast {} addrspace(10)* %spec.select304 to i8 addrspace(10)*, !dbg !633
  %313 = addrspacecast i8 addrspace(10)* %312 to i8 addrspace(11)*, !dbg !633
  %314 = getelementptr inbounds i8, i8 addrspace(11)* %313, i64 8, !dbg !633
  %315 = bitcast i8 addrspace(11)* %314 to i64 addrspace(11)*, !dbg !635
  %316 = load i64, i64 addrspace(11)* %315, align 8, !dbg !635, !tbaa !324, !alias.scope !101, !noalias !128
  %317 = icmp eq i64 %290, %316, !dbg !636
  br i1 %317, label %L209, label %L227, !dbg !511

oob.loopexit:                                     ; preds = %L319
  br label %oob, !dbg !554

oob.loopexit1:                                    ; preds = %L252
  br label %oob, !dbg !554

oob:                                              ; preds = %oob.loopexit1, %oob.loopexit
  %.lcssa532 = phi i64 [ %177, %oob.loopexit ], [ 1, %oob.loopexit1 ], !dbg !578
  %318 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !554
  %319 = bitcast {} addrspace(10)* %318 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !554
  %.repack259 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %319, i64 0, i32 0, !dbg !554
  store i8* %.pre913, i8* addrspace(10)* %.repack259, align 8, !dbg !554, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack260 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %319, i64 0, i32 1, !dbg !554
  store {} addrspace(10)* %.pre914, {} addrspace(10)* addrspace(10)* %.repack260, align 8, !dbg !554, !tbaa !41, !alias.scope !45, !noalias !478
  %320 = addrspacecast {} addrspace(10)* %318 to {} addrspace(12)*, !dbg !554
  call void @ijl_bounds_error_int({} addrspace(12)* %320, i64 %.lcssa532) #29, !dbg !554
  unreachable, !dbg !554

oob48:                                            ; preds = %L365.peel
  %321 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !566
  %322 = bitcast {} addrspace(10)* %321 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !566
  %.repack265 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %322, i64 0, i32 0, !dbg !566
  store i8* %135, i8* addrspace(10)* %.repack265, align 8, !dbg !566, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack266 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %322, i64 0, i32 1, !dbg !566
  store {} addrspace(10)* %136, {} addrspace(10)* addrspace(10)* %.repack266, align 8, !dbg !566, !tbaa !41, !alias.scope !45, !noalias !478
  %323 = addrspacecast {} addrspace(10)* %321 to {} addrspace(12)*, !dbg !566
  call void @ijl_bounds_error_int({} addrspace(12)* %323, i64 %120) #29, !dbg !566
  unreachable, !dbg !566

oob51:                                            ; preds = %idxend50.peel
  %324 = addrspacecast {} addrspace(10)* %136 to {} addrspace(12)*, !dbg !566
  call void @ijl_bounds_error_int({} addrspace(12)* noundef %324, i64 noundef 1) #29, !dbg !566
  unreachable, !dbg !566

load52:                                           ; preds = %L319
  %325 = add i64 %iv.next4, %122, !dbg !572
  %326 = add i64 %325, %162, !dbg !572
  %.not268 = icmp ult i64 %326, %163, !dbg !572
  %327 = getelementptr inbounds double, double* %112, i64 %325, !dbg !572
  %328 = ptrtoint double* %327 to i64, !dbg !572
  %329 = sub i64 %328, %167, !dbg !572
  %330 = icmp ult i64 %329, %170, !dbg !572
  %331 = and i1 %.not268, %330, !dbg !572
  br i1 %331, label %load57, label %oob53.loopexit, !dbg !572

oob53.loopexit:                                   ; preds = %load52
  br label %oob53, !dbg !639

oob53.loopexit2:                                  ; preds = %load52.peel
  br label %oob53, !dbg !639

oob53:                                            ; preds = %oob53.loopexit2, %oob53.loopexit
  %.lcssa504 = phi i64 [ %176, %oob53.loopexit ], [ 1, %oob53.loopexit2 ], !dbg !575
  %332 = add i64 %.lcssa504, %122, !dbg !639
  %333 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !572
  %334 = bitcast {} addrspace(10)* %333 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !572
  %.repack269 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %334, i64 0, i32 0, !dbg !572
  store i8* %32, i8* addrspace(10)* %.repack269, align 8, !dbg !572, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack270 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %334, i64 0, i32 1, !dbg !572
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(10)* %.repack270, align 8, !dbg !572, !tbaa !41, !alias.scope !45, !noalias !478
  %335 = addrspacecast {} addrspace(10)* %333 to {} addrspace(12)*, !dbg !572
  call void @ijl_bounds_error_int({} addrspace(12)* %335, i64 %332) #29, !dbg !572
  unreachable, !dbg !572

oob56:                                            ; preds = %idxend55.peel
  %336 = addrspacecast {} addrspace(10)* %value_phi8 to {} addrspace(12)*, !dbg !572
  call void @ijl_bounds_error_int({} addrspace(12)* %336, i64 noundef 1) #29, !dbg !572
  unreachable, !dbg !572

load57:                                           ; preds = %load52
  %337 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %132, i64 %178, !dbg !554
  %338 = bitcast {} addrspace(10)* addrspace(13)* %337 to double addrspace(13)*, !dbg !554
  %339 = load double, double addrspace(13)* %338, align 8, !dbg !554, !tbaa !213, !alias.scope !45, !noalias !215
  %340 = load double, double addrspace(13)* %159, align 8, !dbg !566, !tbaa !213, !alias.scope !45, !noalias !215
  %341 = fmul double %339, %340, !dbg !569
  %342 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %114, i64 %325, !dbg !572
  %343 = bitcast {} addrspace(10)* addrspace(13)* %342 to double addrspace(13)*, !dbg !572
  store double %341, double addrspace(13)* %343, align 8, !dbg !572, !tbaa !213, !alias.scope !45, !noalias !478
  %exitcond629.not = icmp eq i64 %176, %16, !dbg !645
  br i1 %exitcond629.not, label %L424.loopexit, label %L304, !dbg !574, !llvm.loop !646

oob69:                                            ; preds = %L478
  %344 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !603
  %345 = bitcast {} addrspace(10)* %344 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !603
  %.repack276 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %345, i64 0, i32 0, !dbg !603
  store i8* %32, i8* addrspace(10)* %.repack276, align 8, !dbg !603, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack277 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %345, i64 0, i32 1, !dbg !603
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(10)* %.repack277, align 8, !dbg !603, !tbaa !41, !alias.scope !45, !noalias !478
  %346 = addrspacecast {} addrspace(10)* %344 to {} addrspace(12)*, !dbg !603
  call void @ijl_bounds_error_int({} addrspace(12)* %346, i64 noundef 1) #29, !dbg !603
  unreachable, !dbg !603

load73:                                           ; preds = %L478
  %347 = bitcast i8* %32 to {} addrspace(10)**, !dbg !603
  %348 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %value_phi8, {} addrspace(10)** noundef %347) #27, !dbg !603
  %349 = bitcast {} addrspace(10)* addrspace(13)* %348 to double addrspace(13)*, !dbg !603
  %350 = load double, double addrspace(13)* %349, align 8, !dbg !603, !tbaa !213, !alias.scope !45, !noalias !215
  br label %L562, !dbg !618

oob75:                                            ; preds = %L499
  %351 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !608
  %352 = bitcast {} addrspace(10)* %351 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !608
  %.repack281 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %352, i64 0, i32 0, !dbg !608
  store i8* %32, i8* addrspace(10)* %.repack281, align 8, !dbg !608, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack282 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %352, i64 0, i32 1, !dbg !608
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(10)* %.repack282, align 8, !dbg !608, !tbaa !41, !alias.scope !45, !noalias !478
  %353 = addrspacecast {} addrspace(10)* %351 to {} addrspace(12)*, !dbg !608
  call void @ijl_bounds_error_int({} addrspace(12)* %353, i64 noundef 1) #29, !dbg !608
  unreachable, !dbg !608

load79:                                           ; preds = %L499
  %354 = bitcast i8* %32 to {} addrspace(10)**, !dbg !608
  %355 = call {} addrspace(10)* addrspace(13)* @julia.gc_loaded({} addrspace(10)* noundef %value_phi8, {} addrspace(10)** noundef %354) #27, !dbg !608
  %356 = bitcast {} addrspace(10)* addrspace(13)* %355 to double addrspace(13)*, !dbg !608
  %357 = load double, double addrspace(13)* %356, align 8, !dbg !608, !tbaa !213, !alias.scope !45, !noalias !215
  %358 = icmp ult i64 %204, 2, !dbg !609
  br i1 %358, label %L514, label %L517, !dbg !609

oob80:                                            ; preds = %L517
  %359 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !611
  %360 = bitcast {} addrspace(10)* %359 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !611
  %.repack285 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %360, i64 0, i32 0, !dbg !611
  store i8* %32, i8* addrspace(10)* %.repack285, align 8, !dbg !611, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack286 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %360, i64 0, i32 1, !dbg !611
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(10)* %.repack286, align 8, !dbg !611, !tbaa !41, !alias.scope !45, !noalias !478
  %361 = addrspacecast {} addrspace(10)* %359 to {} addrspace(12)*, !dbg !611
  call void @ijl_bounds_error_int({} addrspace(12)* %361, i64 noundef 2) #29, !dbg !611
  unreachable, !dbg !611

load84:                                           ; preds = %L517
  %362 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %355, i64 1, !dbg !611
  %363 = bitcast {} addrspace(10)* addrspace(13)* %362 to double addrspace(13)*, !dbg !611
  %364 = load double, double addrspace(13)* %363, align 8, !dbg !611, !tbaa !213, !alias.scope !45, !noalias !215
  %365 = fadd double %357, %364, !dbg !647
  %.not288383 = icmp sgt i64 %18, 2, !dbg !650
  br i1 %.not288383, label %L527.preheader, label %L562, !dbg !651

L527.preheader:                                   ; preds = %load84
  br label %L527, !dbg !614

oob87:                                            ; preds = %L542
  %366 = call noalias nonnull align 8 dereferenceable(16) "enzyme_type"="{[-1]:Pointer, [-1,-1]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double}" {} addrspace(10)* @julia.gc_alloc_obj({}* nonnull %8, i64 noundef 16, {} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 126279284194960 to {}*) to {} addrspace(10)*)) #28, !dbg !615
  %367 = bitcast {} addrspace(10)* %366 to { i8*, {} addrspace(10)* } addrspace(10)*, !dbg !615
  %.repack291 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %367, i64 0, i32 0, !dbg !615
  store i8* %32, i8* addrspace(10)* %.repack291, align 8, !dbg !615, !tbaa !41, !alias.scope !45, !noalias !478
  %.repack292 = getelementptr inbounds { i8*, {} addrspace(10)* }, { i8*, {} addrspace(10)* } addrspace(10)* %367, i64 0, i32 1, !dbg !615
  store {} addrspace(10)* %value_phi8, {} addrspace(10)* addrspace(10)* %.repack292, align 8, !dbg !615, !tbaa !41, !alias.scope !45, !noalias !478
  %368 = addrspacecast {} addrspace(10)* %366 to {} addrspace(12)*, !dbg !615
  call void @ijl_bounds_error_int({} addrspace(12)* %368, i64 %225) #29, !dbg !615
  unreachable, !dbg !615

load91:                                           ; preds = %L542
  %369 = getelementptr inbounds {} addrspace(10)*, {} addrspace(10)* addrspace(13)* %355, i64 %224, !dbg !615
  %370 = bitcast {} addrspace(10)* addrspace(13)* %369 to double addrspace(13)*, !dbg !615
  %371 = load double, double addrspace(13)* %370, align 8, !dbg !615, !tbaa !213, !alias.scope !45, !noalias !215
  %372 = fadd double %value_phi85384, %371, !dbg !652
  %exitcond628.not = icmp eq i64 %225, %18, !dbg !650
  br i1 %exitcond628.not, label %L562.loopexit, label %L527, !dbg !651
}

Did not have return index set when differentiating function
 call  %57 = call nonnull "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* %56({} addrspace(10)* %253, i64 %55, i64 %16) #27, !dbg !130
 augmentcall  %_augmented63 = call { i8*, {} addrspace(10)* } %80({} addrspace(10)* %279, {} addrspace(10)* %"'il_phi28", i64 %75, i64 %"'ipc62", i64 %18, i64 %18), !dbg !164


Stacktrace:
  [1] copy
    @ ./array.jl:350
  [2] unaliascopy
    @ ./abstractarray.jl:1516
  [3] unalias
    @ ./abstractarray.jl:1500
  [4] broadcast_unalias
    @ ./broadcast.jl:941
  [5] preprocess
    @ ./broadcast.jl:948
  [6] preprocess_args
    @ ./broadcast.jl:950
  [7] preprocess
    @ ./broadcast.jl:947
  [8] copyto!
    @ ./broadcast.jl:964
  [9] copyto!
    @ ./broadcast.jl:920
 [10] copy
    @ ./broadcast.jl:892
 [11] materialize
    @ ./broadcast.jl:867
 [12] f
    @ ~/Work/GitHub/Julia/DifferentiationInterface.jl/DifferentiationInterface/test/playground.jl:16

Stacktrace:
  [1] julia_error(cstr::Cstring, val::Ptr{…}, errtype::Enzyme.API.ErrorType, data::Ptr{…}, data2::Ptr{…}, B::Ptr{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:2294
  [2] EnzymeCreatePrimalAndGradient(logic::Enzyme.Logic, todiff::LLVM.Function, retType::Enzyme.API.CDIFFE_TYPE, constant_args::Vector{…}, TA::Enzyme.TypeAnalysis, returnValue::Bool, dretUsed::Bool, mode::Enzyme.API.CDerivativeMode, runtimeActivity::Bool, width::Int64, additionalArg::Ptr{…}, forceAnonymousTape::Bool, typeInfo::Enzyme.FnTypeInfo, uncacheable_args::Vector{…}, augmented::Ptr{…}, atomicAdd::Bool)
    @ Enzyme.API ~/.julia/packages/Enzyme/Vjlrr/src/api.jl:253
  [3] enzyme!(job::GPUCompiler.CompilerJob{…}, mod::LLVM.Module, primalf::LLVM.Function, TT::Type, mode::Enzyme.API.CDerivativeMode, width::Int64, parallel::Bool, actualRetType::Type, wrap::Bool, modifiedBetween::Tuple{…}, returnPrimal::Bool, expectedTapeType::Type, loweredArgs::Set{…}, boxedArgs::Set{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:4706
  [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob{…}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:7801
  [5] codegen
    @ ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:6638 [inlined]
  [6] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:8909
  [7] _thunk
    @ ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:8909 [inlined]
  [8] cached_compilation
    @ ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:8950 [inlined]
  [9] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{…}, ::Type{…}, ::Type{…}, tt::Type{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Val{…}, ::Type{…}, ::Val{…}, ::Val{…})
    @ Enzyme.Compiler ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:9082
 [10] #s2067#19118
    @ ~/.julia/packages/Enzyme/Vjlrr/src/compiler.jl:9219 [inlined]
 [11] 
    @ Enzyme.Compiler ./none:0
 [12] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
    @ Core ./boot.jl:707
 [13] autodiff
    @ ~/.julia/packages/Enzyme/Vjlrr/src/Enzyme.jl:473 [inlined]
 [14] autodiff
    @ ~/.julia/packages/Enzyme/Vjlrr/src/Enzyme.jl:512 [inlined]
 [15] macro expansion
    @ ~/.julia/packages/Enzyme/Vjlrr/src/Enzyme.jl:1705 [inlined]
 [16] gradient(::EnzymeCore.ReverseMode{false, false, EnzymeCore.FFIABI, false, false}, ::typeof(f), ::Vector{Float64})
    @ Enzyme ~/.julia/packages/Enzyme/Vjlrr/src/Enzyme.jl:1646
 [17] top-level scope
    @ ~/Work/GitHub/Julia/DifferentiationInterface.jl/DifferentiationInterface/test/playground.jl:18
Some type information was truncated. Use `show(err)` to see complete types.

@gdalle
Contributor Author

gdalle commented Oct 16, 2024

In case it helps, on Enzyme v0.13.10 the error has changed for the MWE above:

julia> Enzyme.gradient(Enzyme.Reverse, f, [1.0])

ERROR: 
No augmented forward pass found for jl_alloc_genericmemory
 at context:   %20 = call "enzyme_type"="{[-1]:Pointer}" {} addrspace(10)* @jl_alloc_genericmemory({} addrspace(10)* noundef addrspacecast ({}* inttoptr (i64 124954544733936 to {}*) to {} addrspace(10)*), i64 %11) #14, !dbg !81

Stacktrace:
  [1] GenericMemory
    @ ./boot.jl:516
  [2] new_as_memoryref
    @ ./boot.jl:535
  [3] Array
    @ ./boot.jl:582
  [4] Array
    @ ./boot.jl:592
  [5] Array
    @ ./boot.jl:599
  [6] similar
    @ ./abstractarray.jl:868
  [7] similar
    @ ./abstractarray.jl:867
  [8] similar
    @ ./broadcast.jl:224
  [9] similar
    @ ./broadcast.jl:223
 [10] copy
    @ ./broadcast.jl:892
 [11] materialize
    @ ./broadcast.jl:867
 [12] f
    @ ./REPL[4]:1


Stacktrace:
  [1] GenericMemory
    @ ./boot.jl:516 [inlined]
  [2] new_as_memoryref
    @ ./boot.jl:535 [inlined]
  [3] Array
    @ ./boot.jl:582 [inlined]
  [4] Array
    @ ./boot.jl:592 [inlined]
  [5] Array
    @ ./boot.jl:599 [inlined]
  [6] similar
    @ ./abstractarray.jl:868 [inlined]
  [7] similar
    @ ./abstractarray.jl:867 [inlined]
  [8] similar
    @ ./broadcast.jl:224 [inlined]
  [9] similar
    @ ./broadcast.jl:223 [inlined]
 [10] copy
    @ ./broadcast.jl:892 [inlined]
 [11] materialize
    @ ./broadcast.jl:867 [inlined]
 [12] f
    @ ./REPL[4]:1 [inlined]
 [13] diffejulia_f_14701wrap
    @ ./REPL[4]:0
 [14] macro expansion
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8572 [inlined]
 [15] enzyme_call
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:8138 [inlined]
 [16] CombinedAdjointThunk
    @ ~/.julia/packages/Enzyme/RmraO/src/compiler.jl:7911 [inlined]
 [17] autodiff
    @ ~/.julia/packages/Enzyme/RmraO/src/Enzyme.jl:491 [inlined]
 [18] autodiff
    @ ~/.julia/packages/Enzyme/RmraO/src/Enzyme.jl:512 [inlined]
 [19] macro expansion
    @ ~/.julia/packages/Enzyme/RmraO/src/Enzyme.jl:1719 [inlined]
 [20] gradient(::ReverseMode{false, false, FFIABI, false, false}, ::typeof(f), ::Vector{Float64})
    @ Enzyme ~/.julia/packages/Enzyme/RmraO/src/Enzyme.jl:1660
 [21] top-level scope
    @ REPL[5]:1

@gdalle
Contributor Author

gdalle commented Oct 17, 2024

On Enzyme v0.13.11 this MWE works!!! Congrats!

julia> Enzyme.gradient(Enzyme.Reverse, f, [1.0])
([2.0],)

@gdalle
Contributor Author

gdalle commented Oct 18, 2024

I'm slowly tracking progress on Julia 1.11 with my DI test suite, and I'm now running into the bug below. Do tell me if my posting MWEs here is annoying, and I'll stop ^^

julia> using Enzyme

julia> f(x) = vcat(x, x)
f (generic function with 1 method)

julia> autodiff(Forward, f, Duplicated(rand(2), zeros(2)))
ERROR: Enzyme execution failed.
Enzyme: The original primal code hits this error condition, thus differentiating it does not make sense

Stacktrace:
  [1] unsafe_copyto!
    @ ./genericmemory.jl:117 [inlined]
  [2] unsafe_copyto!
    @ ./array.jl:284 [inlined]
  [3] vcat
    @ ./array.jl:2222
  [4] f
    @ ./REPL[8]:1 [inlined]
  [5] fwddiffejulia_f_27066wrap
    @ ./REPL[8]:0
  [6] macro expansion
    @ ~/.julia/packages/Enzyme/vgArw/src/compiler.jl:8136 [inlined]
  [7] enzyme_call
    @ ~/.julia/packages/Enzyme/vgArw/src/compiler.jl:7702 [inlined]
  [8] ForwardModeThunk
    @ ~/.julia/packages/Enzyme/vgArw/src/compiler.jl:7491 [inlined]
  [9] autodiff
    @ ~/.julia/packages/Enzyme/vgArw/src/Enzyme.jl:647 [inlined]
 [10] autodiff
    @ ~/.julia/packages/Enzyme/vgArw/src/Enzyme.jl:537 [inlined]
 [11] autodiff(mode::ForwardMode{false, FFIABI, false, false}, f::typeof(f), args::Duplicated{Vector{Float64}})
    @ Enzyme ~/.julia/packages/Enzyme/vgArw/src/Enzyme.jl:504
 [12] top-level scope
    @ REPL[9]:1

@wsmoses
Member

wsmoses commented Oct 23, 2024

With a new patch release, I think your most recent error should be fixed.

@gdalle
Contributor Author

gdalle commented Oct 23, 2024

Thanks! The DI test suite was able to progress a little further with Enzyme v0.13.12. Here is the first error message I get now (see CI log):

Enzyme compilation failed due to illegal type analysis.
  Current scope: 
  ; Function Attrs: mustprogress noinline willreturn
define private fastcc nonnull align 8 dereferenceable(80) {} addrspace(10)* @preprocess_augmented_julia_mapreduce_impl_698449({} addrspace(10)* noundef nonnull align 8 dereferenceable(32) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer}" "enzymejl_parmtype"="139756481903696" "enzymejl_parmtype_ref"="2" %0, {} addrspace(10)* nocapture nonnull readonly align 8 dereferenceable(32) "enzyme_type"="{[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Pointer, [-1,8,0]:Integer, [-1,8,1]:Integer, [-1,8,2]:Integer, [-1,8,3]:Integer, [-1,8,4]:Integer, [-1,8,5]:Integer, [-1,8,6]:Integer, [-1,8,7]:Integer, [-1,8,8]:Pointer, [-1,8,8,-1]:Float@double, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer}" "enzymejl_parmtype"="139756481903696" "enzymejl_parmtype_ref"="2" %"'", i64 signext "enzyme_inactive" "enzyme_type"="{[-1]:Integer}" "enzymejl_parmtype"="139756526830384" "enzymejl_parmtype_ref"="0" %1, i64 signext "enzyme_inactive" "enzyme_type"="{[-1]:Integer}" "enzymejl_parmtype"="139756526830384" "enzymejl_parmtype_ref"="0" %2) unnamed_addr #28 !dbg !3327 {

...

Illegal updateAnalysis prev:{[-1]:Pointer, [-1,-1]:Pointer, [-1,0,-1]:Integer} new: {[-1]:Pointer, [-1,0]:Pointer, [-1,0,0]:Float@double}
  val:   %27 = getelementptr inbounds { {} addrspace(10)*, i8*, {} addrspace(10)*, i8*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)**, {} addrspace(10)** }, { {} addrspace(10)*, i8*, {} addrspace(10)*, i8*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)*, {} addrspace(10)**, {} addrspace(10)** } addrspace(10)* %7, i64 0, i32 1, !dbg !59 origin=  store i8* %"'ipl10", i8* addrspace(10)* %27, align 8, !dbg !59, !noalias !29
  MethodInstance for Base.mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Matrix{Float64}, ::Int64, ::Int64, ::Int64)
  
  
  Caused by:
  Stacktrace:
   [1] length
     @ ./essentials.jl:12
   [2] getindex
     @ ./essentials.jl:916
   [3] mapreduce_impl
     @ ./reduce.jl:256
Stacktrace:
    [1] julia_error(cstr::Cstring, val::Ptr{LLVM.API.LLVMOpaqueValue}, errtype::Enzyme.API.ErrorType, data::Ptr{Nothing}, data2::Ptr{LLVM.API.LLVMOpaqueValue}, B::Ptr{LLVM.API.LLVMOpaqueBuilder})
      @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:1508
    [2] EnzymeCreateForwardDiff(logic::Enzyme.Logic, todiff::LLVM.Function, retType::Enzyme.API.CDIFFE_TYPE, constant_args::Vector{Enzyme.API.CDIFFE_TYPE}, TA::Enzyme.TypeAnalysis, returnValue::Bool, mode::Enzyme.API.CDerivativeMode, runtimeActivity::Bool, width::Int64, additionalArg::Ptr{Nothing}, typeInfo::Enzyme.FnTypeInfo, uncacheable_args::Vector{Bool})
      @ Enzyme.API ~/.julia/packages/Enzyme/BRtTP/src/api.jl:319
    [3] enzyme!(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, mod::LLVM.Module, primalf::LLVM.Function, TT::Type, mode::Enzyme.API.CDerivativeMode, width::Int64, parallel::Bool, actualRetType::Type, wrap::Bool, modifiedBetween::NTuple{6, Bool}, returnPrimal::Bool, expectedTapeType::Type, loweredArgs::Set{Int64}, boxedArgs::Set{Int64})
      @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:4039
    [4] codegen(output::Symbol, job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, toplevel::Bool, strip::Bool, validate::Bool, only_entry::Bool, parent_job::Nothing)
      @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:7099
    [5] codegen
      @ ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:5932 [inlined]
    [6] _thunk(job::GPUCompiler.CompilerJob{Enzyme.Compiler.EnzymeTarget, Enzyme.Compiler.EnzymeCompilerParams}, postopt::Bool)
      @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:8207
    [7] _thunk
      @ ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:8207 [inlined]
    [8] cached_compilation
      @ ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:8248 [inlined]
    [9] thunkbase(ctx::LLVM.Context, mi::Core.MethodInstance, ::Val{0x000000000000685f}, ::Type{EnzymeCore.Const{typeof(DifferentiationInterface.shuffled_gradient)}}, ::Type{EnzymeCore.Duplicated{Vector{Float64}}}, tt::Type{Tuple{EnzymeCore.Duplicated{Vector{Float64}}, EnzymeCore.Const{DifferentiationInterfaceTest.MultiplyByConstant{:out, typeof(DifferentiationInterfaceTest.arr_to_num_linalg)}}, EnzymeCore.Const{AutoEnzyme{Nothing, Nothing}}, EnzymeCore.Const{DifferentiationInterface.Rewrap{1, Tuple{typeof(DifferentiationInterface.constant_maker)}}}, EnzymeCore.Const{Float64}}}, ::Val{Enzyme.API.DEM_ForwardMode}, ::Val{1}, ::Val{(false, false, false, false, false, false)}, ::Val{false}, ::Val{false}, ::Type{EnzymeCore.FFIABI}, ::Val{true}, ::Val{false})
      @ Enzyme.Compiler ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:8380
   [10] #s2074#19132
      @ ~/.julia/packages/Enzyme/BRtTP/src/compiler.jl:8517 [inlined]
   [11] var"#s2074#19132"(FA::Any, A::Any, TT::Any, Mode::Any, ModifiedBetween::Any, width::Any, ReturnPrimal::Any, ShadowInit::Any, World::Any, ABI::Any, ErrIfFuncWritten::Any, RuntimeActivity::Any, ::Any, ::Any, ::Any, ::Any, tt::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any, ::Any)
      @ Enzyme.Compiler ./none:0
   [12] (::Core.GeneratedFunctionStub)(::UInt64, ::LineNumberNode, ::Any, ::Vararg{Any})
      @ Core ./boot.jl:707
   [13] autodiff
      @ ~/.julia/packages/Enzyme/BRtTP/src/Enzyme.jl:633 [inlined]
   [14] autodiff
      @ ~/.julia/packages/Enzyme/BRtTP/src/Enzyme.jl:537 [inlined]
   [15] autodiff
      @ ~/.julia/packages/Enzyme/BRtTP/src/Enzyme.jl:504 [inlined]
   [16] pushforward(::typeof(DifferentiationInterface.shuffled_gradient), ::DifferentiationInterface.NoPushforwardPrep, ::AutoEnzyme{Nothing, Nothing}, ::Vector{Float64}, ::Tuple{Vector{Float64}}, ::Constant{DifferentiationInterfaceTest.MultiplyByConstant{:out, typeof(DifferentiationInterfaceTest.arr_to_num_linalg)}}, ::Constant{AutoEnzyme{Nothing, Nothing}}, ::Constant{DifferentiationInterface.Rewrap{1, Tuple{typeof(DifferentiationInterface.constant_maker)}}}, ::Constant{Float64})
      @ DifferentiationInterfaceEnzymeExt ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/ext/DifferentiationInterfaceEnzymeExt/forward_onearg.jl:58
   [17] hvp
      @ ~/work/DifferentiationInterface.jl/DifferentiationInterface.jl/DifferentiationInterface/src/second_order/hvp.jl:214 [inlined]
   [18] hvp(f::DifferentiationInterfaceTest.MultiplyByConstant{:out, typeof(DifferentiationInterfaceTest.arr_to_num_linalg)}, backend::AutoEnzyme{Nothing, Nothing}, x::Vector{Float64}, seed::Tuple{Vector{Float64}}, contexts::Constant{Float64})

Does it sound familiar enough to be fixed without an MWE?

6 participants