
Add an experimental opaque closure type. #1853

Draft · wants to merge 7 commits into master
Conversation

maleadt (Member) commented Apr 4, 2023

Opaque closures don't make much sense in the CUDA environment (where we don't have a linker and need to recompile the kernel whenever the OC changes, and also because we require specsig), but they can be useful for inlining typed IR into a kernel and calling it.
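For context, an opaque closure fixes its argument and return types at construction time, so calls through it avoid dynamic dispatch. A minimal host-side sketch using `Base.Experimental.@opaque` (the language feature this PR builds on; the GPU-side API in this PR may differ):

```julia
# Construct an opaque closure with a fixed (Float32) -> Float32 signature.
# Unlike a regular closure, its call is resolved against typed IR, not
# via dynamic dispatch on the captured function type.
oc = Base.Experimental.@opaque (x::Float32) -> 2f0 * x

oc(3f0)  # returns 6.0f0
```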

Current state of this PR is 1.12+ only (because of JuliaLang/julia#53219). It could be made more general if and when we decide to merge this.

Depends on JuliaGPU/GPUCompiler.jl#572

@codecov codecov bot commented Apr 4, 2023

Codecov Report

Attention: Patch coverage is 0%, with 94 lines in your changes missing coverage. Please review.

Project coverage is 59.97%. Comparing base (a011e73) to head (3f9cde3).
Report is 2 commits behind head on master.

❗ Current head 3f9cde3 differs from the pull request's most recent head 7782216. Consider uploading reports for commit 7782216 to get more accurate results.

Files Patch % Lines
src/compiler/compilation.jl 0.00% 94 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1853       +/-   ##
===========================================
- Coverage   71.83%   59.97%   -11.87%     
===========================================
  Files         155      155               
  Lines       15013    14979       -34     
===========================================
- Hits        10785     8984     -1801     
- Misses       4228     5995     +1767     

☔ View full report in Codecov by Sentry.

@pepijndevos commented:
How much of this is actually CUDA specific?

maleadt (Member, Author) commented Apr 5, 2023

Not much. Only the fact that we need to be able to create a compiler invocation, which depends on backend-specific state (which device is active, etc): https://github.com/JuliaGPU/CUDA.jl/pull/1853/files#diff-ecbfaf5b99ab10dcafff5717c7cc5f856768e4313446fa3fb58b839a25b17cfcR317

maleadt (Member, Author) commented May 24, 2023

Seems to cause some IR verification errors: the closure is compiled expecting its fourth argument as [3 x float]*, while the call site passes { double, float } by value, so the callee and call-site function types disagree:

ERROR: LLVM error: Called function is not the same type as the call!
  call void bitcast (
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            [3 x float]*,
            double)* @julia_opaque_gpu_closure_9593
    to
      void ([1 x i64],
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }*,
            { double, float },
            double)*)(
      [1 x i64] %state,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %5,
      { { i8 addrspace(1)*, i64, [2 x i64], i64 }, { [1 x [1 x i64]], i64 }, i64, i64 }* nonnull %6,
      { double, float } %.fca.1.insert,
      double %25), !dbg !300

id = length(GPUCompiler.deferred_codegen_jobs) + 1
GPUCompiler.deferred_codegen_jobs[id] = job
quote
    ptr = ccall("extern deferred_codegen", llvmcall, Ptr{Cvoid}, (Int,), $id)
A Member commented on this diff:

After #582 you should be able to just emit a gpuc.lookup(mi, oc, args...) or maybe gpuc.deferred(oc, args...)?

Labels: cuda kernels (Stuff about writing CUDA kernels.), enhancement (New feature or request), speculative (Not sure about this one yet.)
4 participants