
Add an experimental opaque closure type. #1853

Draft · wants to merge 7 commits into master
Conversation

maleadt (Member) commented Apr 4, 2023

Opaque closures don't make much sense in the CUDA environment (where we don't have a linker and need to recompile the kernel whenever the OC changes, and also because we require specsig), but they can be useful for inlining typed IR into a kernel and calling it.
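For context, an opaque closure fixes its argument and return types at construction time, so calls through it avoid dynamic dispatch. A minimal host-side sketch using `Base.Experimental.@opaque` (the language feature this PR builds on; the GPU-side API in this PR may differ):

```julia
# Construct an opaque closure with a fixed (Float32) -> Float32 signature.
# Unlike a regular closure, its call is resolved against typed IR, not
# via dynamic dispatch on the captured function type.
oc = Base.Experimental.@opaque (x::Float32) -> 2f0 * x

oc(3f0)  # returns 6.0f0
```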

Current state of this PR is 1.12+ only (because of JuliaLang/julia#53219). It could be made more general if and when we decide to merge this.

Depends on JuliaGPU/GPUCompiler.jl#572

@codecov codecov bot commented Apr 4, 2023

Codecov Report

Attention: Patch coverage is 0%, with 94 lines in your changes missing coverage. Please review.

Project coverage is 59.97%. Comparing base (a011e73) to head (3f9cde3).
Report is 2 commits behind head on master.

❗ Current head 3f9cde3 differs from the pull request's most recent head 7782216. Consider uploading reports for commit 7782216 to get more accurate results.

Files Patch % Lines
src/compiler/compilation.jl 0.00% 94 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1853       +/-   ##
===========================================
- Coverage   71.83%   59.97%   -11.87%     
===========================================
  Files         155      155               
  Lines       15013    14979       -34     
===========================================
- Hits        10785     8984     -1801     
- Misses       4228     5995     +1767     

☔ View full report in Codecov by Sentry.

@pepijndevos commented:
How much of this is actually CUDA specific?

maleadt (Member, Author) commented Apr 5, 2023

Not much. Only the fact that we need to be able to create a compiler invocation, which depends on backend-specific state (which device is active, etc): https://github.com/JuliaGPU/CUDA.jl/pull/1853/files#diff-ecbfaf5b99ab10dcafff5717c7cc5f856768e4313446fa3fb58b839a25b17cfcR317

maleadt (Member, Author) commented May 24, 2023

Seems to cause some IR verification errors: the closure is compiled expecting its fourth argument as [3 x float]*, while the call site passes { double, float } by value, so the callee and call-site function types disagree:

ERROR: LLVM error: Called function is not the same type as the call!
  call void bitcast (
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            [3 x float]*,
            double)* @julia_opaque_gpu_closure_9593
    to
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { double, float },
            double)*)(
      [1 x i64] %state,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %5,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %6,
      { double, float } %.fca.1.insert,
      double %25), !dbg !300

id = length(GPUCompiler.deferred_codegen_jobs) + 1
GPUCompiler.deferred_codegen_jobs[id] = job
quote
    ptr = ccall("extern deferred_codegen", llvmcall, Ptr{Cvoid}, (Int,), $id)
A Member commented on this diff:

After #582 you should be able to just emit a gpuc.lookup(mi, oc, args...) or maybe gpuc.deferred(oc, args...)?

Labels: cuda kernels (Stuff about writing CUDA kernels.), enhancement (New feature or request), speculative (Not sure about this one yet.)
4 participants