Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a different IR than Cranelift #19

Open
chc4 opened this issue Sep 20, 2021 · 3 comments
Open

Use a different IR than Cranelift #19

chc4 opened this issue Sep 20, 2021 · 3 comments

Comments

@chc4
Copy link
Owner

chc4 commented Sep 20, 2021

move |a: usize| {
    let val: usize;
    if Wrapping(a) + Wrapping(0x5) < Wrapping(a) {
        val = 1;
    } else {
       val = 2;
    }
    val
}
---- cmp rsi, -0x5
---- mov eax, 0x2
---- adc rax, -0x1
---- ret
function u0:0(i64, i64, i64) -> i64 system_v {
block0(v0: i64, v1: i64, v2: i64):
    jump block1

block1:
    v3 = iconst.i64 -5
    v4, v5 = isub_ifbout.i64 v2, v3
    v6 = iconst.i64 2
    v7 = iconst.i64 -1
    v8, v9 = iadd_ifcarry v6, v7, v5
    return v8
}

(Ignore the weird block0)
This errors when compiling with
thread 'test::test_passthrough_usize' panicked at 'ALU+imm and ALU+carry ops should not appear here!', /home/charlie/.cargo/registry/src/github.com-1ecc6299db9ec823/cranelift-codegen-0.75.0/src/isa/x64/lower.rs:5977:13

Technically for this specific case I could turn the cmp into a Cranelift icmp_imm instruction, but later could could be depending on the rsi (v4) value. I'd have to re-emit an isub instruction right before its use, which I can do in this specific case, but really this is just a sign that Cranelift's IFLAGS mechanism plain doesn't work for me.
The bigger issue is that Cranelift assumes that all math ops clobber all IFLAGS, so you can't have, for example

---- cmp rsi, -0x5
---- lea eax, [rsi+5]
---- mov eax, 0x2
---- adc rax, -0x1
---- ret

since it would say that the flag produced by the cmp would be invalid after the iadd from the LEA. I'd have to do my own code movement and instruction picking to make it happy, and even then Cranelift flag rules are hard enough to use it's probably impossible.

I'm not entirely sure where to go from here: I probably should've just been using LLVM this entire time, since that's what a lot of other x86_64 lifters target for a backend. Alternatively, I could try to use dynasm-rs: it doesn't have any optimizations, but then Cranelift barely does either (and it's codegen is often worse than just re-emitting the same instructions!). By lifting and lowering to the exact same instructions, nothing would clobber flags - at worse I'd have to emit stc; adc if I constant fold the add part of an add; adc, and could probably do something smarter instead.

@chc4 chc4 mentioned this issue Sep 20, 2021
@chc4
Copy link
Owner Author

chc4 commented Sep 20, 2021

This has it's own set of issues - the most important being that I have to do my own register allocation now instead of being able to steal Cranelift's (which was the main motivation of using it!). dynasm-rs does support dynamic registers for operands, but you need to give it numbers for which one to use. This probably means I need my own IR as well, since register allocation is usually done in a bottom up manner to recover live ranges, instead of one-pass top down (you can technically do a greedy register allocation, where everything is in a register by default and spilled in LRU manner when that's full, but it will have horrible performance characteristics on long functions or any loops...). We also have the problem where we can't spill values to the stack naively, since the assembly will be using that too and require contiguous ranges for stack buffers! So we need to know the full spill list when we start code-gen, so we can pass the same stack buffers to functions the same way.

Some people in the langdev discord channel have been talking about RVSDG, which is a hybrid CFG/DFG representation. We care about the DFG side more than the CFG side (since we already have a function layout graph from what we lift, we'd just like to optimize to a better one if it's possible), so it might be a good fit.

@chc4
Copy link
Owner Author

chc4 commented Sep 26, 2021

Started development on the tangle branch.

@chc4
Copy link
Owner Author

chc4 commented Feb 17, 2022

I merged the tangle branch into mainline; it isn't isn't nearly good enough to switch to yet, but it does implement a register allocator (without spilling support) and Technically Works(tm).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant