Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use recursion to fix inference failure #78

Merged
merged 4 commits into from
Mar 5, 2024

Conversation

charleskawczynski
Copy link
Contributor

@charleskawczynski charleskawczynski commented Mar 2, 2024

This PR inlines, and uses recursion to fix cases where Adapt.adapt performs runtime dispatch due to either recursion limits, or inability to unroll map on Tuples. I have and will post a reproducer shortly.

Closes #79.

@charleskawczynski
Copy link
Contributor Author

I'm verifying (here) that this branch is solely responsible, but this seems to reduce gpu allocations for us by a factor of 15:
From build https://buildkite.com/clima/climaatmos-ci/builds/17118#018e020d-d9a0-4f26-b8c7-04013be0b169

[ Info: `allocs (flame_gpu_implicit_barowave_moist)`: 3637320

to build https://buildkite.com/clima/climaatmos-ci/builds/17105#018e004b-661b-404c-96e9-fafa28d875ed:

[ Info: `allocs (flame_gpu_implicit_barowave_moist)`: 241464

This basically fixes runtime dispatch issues that we're seeing (mentioned here, but in a different context).

@charleskawczynski charleskawczynski marked this pull request as ready for review March 3, 2024 04:01
@charleskawczynski
Copy link
Contributor Author

charleskawczynski commented Mar 3, 2024

Here is a reproducer:

git clone https://github.com/CliMA/ClimaCore.jl
cd ClimaCore.jl/
julia --project=examples
import ClimaCore;
import ClimaComms;
using Test
import ClimaCore.Fields
import ClimaCore.Operators
import ClimaCore.Spaces
import ClimaCore.Geometry
using BenchmarkTools;
@isdefined(TU) || include(joinpath(pkgdir(ClimaCore), "test", "TestUtilities", "TestUtilities.jl"));
import .TestUtilities as TU;
FT = Float64;
zelem=25
cspace = TU.CenterExtrudedFiniteDifferenceSpace(FT;zelem, helem=10);
fspace = Spaces.FaceExtrudedFiniteDifferenceSpace(cspace);
const CT3 = Geometry.Contravariant3Vector
const C12 = ClimaCore.Geometry.Covariant12Vector
@show ClimaComms.device(cspace)
ᶜx = fill((;uₕ=zero(C12{FT}) ,ρ=FT(0)), cspace);
ᶠx = fill((;ᶠuₕ³=zero(CT3{FT})), fspace);

const ᶠwinterp = Operators.WeightedInterpolateC2F(
    bottom = Operators.Extrapolate(),
    top = Operators.Extrapolate(),
)

function set_ᶠuₕ³!(ᶜx, ᶠx)
    ᶜJ = Fields.local_geometry_field(ᶜx).J
    @. ᶠx.ᶠuₕ³ = ᶠwinterp(ᶜx.ρ * ᶜJ, CT3(ᶜx.uₕ))
    return nothing
end

set_ᶠuₕ³!(ᶜx, ᶠx) # compile
p_allocated = @allocated set_ᶠuₕ³!(ᶜx, ᶠx)
@show p_allocated

using CUDA
using JET
using Adapt
# @test_opt ignored_modules = (CUDA,) set_ᶠuₕ³!(ᶜx, ᶠx)
import Base.Broadcast: broadcasted
ᶜJ = Fields.local_geometry_field(ᶜx).J
bc = broadcasted(ᶠwinterp, broadcasted(*, ᶜx.ρ, ᶜJ), broadcasted(CT3, ᶜx.uₕ));
Adapt.adapt(CUDA.KernelAdaptor(), bc.args);
# Call chain:
@test_opt ignored_modules = (CUDA,) Adapt.adapt(CUDA.KernelAdaptor(), bc.args);
@test_opt ignored_modules = (CUDA,) Adapt.adapt_structure(CUDA.KernelAdaptor(), bc.args);
@test_opt map(Adapt.adapt(CUDA.KernelAdaptor()), bc.args);

. You may need to have JET in your local environment for this reproducer to work. The JET tests fail on main, and pass in this branch.

Any chance we can get this merged? cc @maleadt

@charleskawczynski
Copy link
Contributor Author

Ok, an update on this: this PR is about half responsible for the allocations mentioned above. So, I'd really appreciate if we can merge this.

@maleadt
Copy link
Member

maleadt commented Mar 5, 2024

Bluntly applying @inline everywhere is not particularly satisfying; most of the time the Julia's inliner is smart enough to do the right thing. Do you have an idea which methods specifically are problematic?

@charleskawczynski
Copy link
Contributor Author

Bluntly applying @inline everywhere is not particularly satisfying; most of the time the Julia's inliner is smart enough to do the right thing. Do you have an idea which methods specifically are problematic?

It’s the call to map that is problematic. Should I try to only apply it to that method?

@maleadt
Copy link
Member

maleadt commented Mar 5, 2024

Should I try to only apply it to that method?

If that resolves the problem, I think that would be a cleaner solution. The other methods should normally all be inlined already.

@charleskawczynski charleskawczynski changed the title Inline and use recursion Use recursion to fix inference failure Mar 5, 2024
@charleskawczynski
Copy link
Contributor Author

Ok, I've updated the PR. It turns out that @inline is not even necessary. So, I've only replaced map with recursion, which still fixes the issue. I've also updated the reproducer, which I think should "work" better (I tried it just now and ran into version incompatibilities).

src/base.jl Outdated Show resolved Hide resolved
@charleskawczynski
Copy link
Contributor Author

charleskawczynski commented Mar 5, 2024

I don't understand why, but it seems that the compiler is suddenly having issues with adapt_structure(to, xs::NamedTuple) = map(x->adapt(to,x), xs) once adapt_structure(to, xs::Tuple) = map(x->adapt(to,x), xs) is replaced with recursion. Changing adapt_structure(to, xs::NamedTuple) = map(x->adapt(to,x), xs) to

adapt_structure(to, xs::NamedTuple{names}) where {names} =
    NamedTuple{names}(adapt_structure(to, Tuple(xs)))

fixes things (and it still infers in my original example). I've pushed this fix. Is this alright @maleadt?

Copy link

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.02%. Comparing base (83fc6c6) to head (9b91675).

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #78      +/-   ##
==========================================
+ Coverage   93.65%   94.02%   +0.37%     
==========================================
  Files           6        6              
  Lines          63       67       +4     
==========================================
+ Hits           59       63       +4     
  Misses          4        4              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@maleadt
Copy link
Member

maleadt commented Mar 5, 2024

You used map(x->adapt(to,x), xs) while there used to be adapt(to) aka. Base.Fix1(adapt, to). Not sure why that makes a difference, but it fixes tests here.

@charleskawczynski
Copy link
Contributor Author

charleskawczynski commented Mar 5, 2024

Interesting! This begs the question: does adapt_structure(to, xs::Union{Tuple,NamedTuple}) = map(x->adapt(to, x), xs) work more generally, and fix my original issue? I'll try that now. I misread that (I had it reversed). Ok, changes look good.

@maleadt maleadt merged commit e1ef21b into JuliaGPU:master Mar 5, 2024
18 checks passed
maleadt added a commit that referenced this pull request Mar 8, 2024
maleadt added a commit that referenced this pull request Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inference failure in adapt
2 participants