Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Important: Recent changes break code generation #124

Open
PearCoding opened this issue May 12, 2022 · 10 comments
Open

Important: Recent changes break code generation #124

PearCoding opened this issue May 12, 2022 · 10 comments

Comments

@PearCoding
Copy link
Contributor

JIT code generation triggers a lot of errors and produces (for some targets) artifacts while rendering, some targets (nvvm) do not even run. The bug appeared after updating AnyDSL (from a version before March 2022), not by modifying Ignis.

AVX2 artifacts

You can repeat the bug with any scene in Ignis when using recent AnyDSL and whatever target you want. The generic, avx2 and nvvm all fail and produce a similar error output. Generic does not produce artifacts and renders totally fine. AVX2 produces artifacts as shown in the image. NVVM produces a CUDA_ERROR_LAUNCH_FAILED and crashes.
The problem has to be in thorin (or artic). Tried a previous version of rv (which definitely worked before) and the bug still exists.

Here are some errors messages triggered by thorin for avx2: errors.txt
and the (large!) dump produced by thorin (or artic): debug.txt
The debug.txt contains multiple jit shaders and some Ignis log output as well, as it was just grabbed from the stdout.

I will have to see how to single out the problem, but that will take some time. I will also look which commit broke the code generation. Nevertheless, I will post the issue now. Maybe you guys already know where the problem is?

The bug happened on Linux, Ubuntu 22.04. Did not try Windows yet.

@Hugobros3
Copy link
Contributor

We recently merged a large refactoring of Thorin, could be related. When is the last time you updated your AnyDSL install before ? I will be away for the weekend sadly, but I will make sure this is solved asap when I am back.

@PearCoding
Copy link
Contributor Author

The only thing I quite remember is that there was around 80 new commits in thorin. That large number was the reason I updated it :D

@PearCoding
Copy link
Contributor Author

So, I further tested standard compilation (no JIT) and no errors seem to be triggered. It very likely is a JIT only problem. Maybe the jit.cpp in AnyDSL/runtime has to be adapted to the recent changes?

@PearCoding
Copy link
Contributor Author

All the errors can be reproduces (for AVX2) with the following code:
v0_missShader_full.txt

The code is artic source code and still quite large. I try to minimize it.

I used the following artic line to compile it:

~/dev/anydsl/artic/build/bin/artic ~/dev/anydsl/runtime/platforms/artic/intrinsics*.impala ~/dev/anydsl/runtime/platforms/artic/runtime.impala v0_missShader_full.art --emit-llvm --log-level info -Wall

Adapt it to your needs.

Knowing it can be produced without JIT means its not a JIT, but a weird code error.

@PearCoding PearCoding changed the title Important: Recent changes break JIT generation Important: Recent changes break code generation May 13, 2022
@PearCoding
Copy link
Contributor Author

PearCoding commented May 13, 2022

More information:

Found a single line which triggers the error messages (the artifacts still remain).

if mat.bsdf.is_specular || num_lights == 0 {
   return(ShadowRay::None)
}

The line is from Ignis https://github.com/PearCoding/Ignis/blob/4a0f08e4bcc62a0a8b6fa2cfda27eb40c847ae37/src/artic/impl/technique/pathtracer.art#L73
If num_lights is removed, the compilation process is fine, only the artifacts remain and nvvm still crashes.
A very important factor is that num_lights is always a compile time constant. The function make_path_renderer is called with a literal value for num_lights, in contrary to max_path_len which is a runtime variable.

Looking at the artifacts and nvvm I fear this is more than a single bug. :/

@PearCoding
Copy link
Contributor Author

These are the last log entries which definitely worked:

Thorin:

commit 719833f675dc98452d432444bba0b0bb6ff3e30e (HEAD -> master, origin/master, origin/HEAD)
Date:   Fri Feb 18 20:22:33 2022 +0100
    Whitespace, formatting.

Artic:

commit 36e8f452583d50a20256ee021282bad6188a5c76 (HEAD -> master, origin/master, origin/HEAD)
Date:   Tue Jan 25 19:29:40 2022 +0100
    fix doc for ExprStmt

Runtime:

Merge: 4822907 9f91939
Date:   Thu Mar 31 00:08:13 2022 +0200
    Merge pull request #30 from AnyDSL/add-cuda-cpp-version-option
    Add AnyDSL_runtime_CUDA_CXX_STANDARD option

@PearCoding
Copy link
Contributor Author

Yet another quick update. The artifacts with AVX2 are an issue with Ignis not thorin (most likely) as they still occur within the very old commits, but the error messages do not.

@PearCoding
Copy link
Contributor Author

The following file contains a mini-dump on Windows, with a crash due to a stack overflow. Starting at the line

nops[i] = import(odef->op(i));

and (after a recursion depth of > 64) crashing inside
auto ndef = odef->rebuild(world(), ntype, nops);
exactly at
const Def* World::arithop(ArithOpTag tag, const Def* a, const Def* b, Debug dbg) {

The information might become handy or helpful for debugging.
igview.zip

PearCoding added a commit to PearCoding/Ignis that referenced this issue May 19, 2022
@PearCoding
Copy link
Contributor Author

Fixed with daa6523

@PearCoding
Copy link
Contributor Author

The following file contains a mini-dump on Windows, with a crash due to a stack overflow. Starting at the line

nops[i] = import(odef->op(i));

and (after a recursion depth of > 64) crashing inside

auto ndef = odef->rebuild(world(), ntype, nops);

exactly at

const Def* World::arithop(ArithOpTag tag, const Def* a, const Def* b, Debug dbg) {

The information might become handy or helpful for debugging.
igview.zip

This bug still exists. Therefore, I am reopening the issue

@PearCoding PearCoding reopened this Jul 14, 2022
SLebailly pushed a commit to SLebailly/Ignis-volumetric that referenced this issue Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants