Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeException with out of bounds memory access. #11208

Closed
VladSerhiienko opened this issue May 21, 2020 · 12 comments
Closed

RuntimeException with out of bounds memory access. #11208

VladSerhiienko opened this issue May 21, 2020 · 12 comments
Labels

Comments

@VladSerhiienko
Copy link

VladSerhiienko commented May 21, 2020

Currently, for getting familiar with emscripten and its capabilities, I'm trying to implement some stuff related to video decoding using FFmpeg. My code crashes under -O2 setup, because of a raised RuntimeException. Interestingly, it works like a charm with no optimizations enabled (-O0).

I've managed to find a workaround, and it appeared that the compiler overthinks. Code:

const uint8_t ff_h264_cabac_tables[...] = {...}; // A large static byte buffer.
const uint8_t* ff_h264_mlps_state = ff_h264_cabac_tables + ...; // A pointer to the middle of the buffer.

int s = ...; // Some int, that can be either positive or negative one, probably within [-128, 127].

// get_cabac_inline()
// Two logically identical lines, note that code is valid:
*state= (ff_h264_mlps_state+128)[s]; // Original code: s = -6, crash, out of bounds access.
*state = (ff_h264_mlps_state)[s+128]; // Fixed code: s = -6, good.
// Note, original line also seems to work properly with -s SAFE_HEAP=1, except for much slower times.

The problem seems to be critical since the behavior is completely valid from the C++ side.

My current setup:

System Version: macOS 10.15.4 (19E287)
Kernel Version: Darwin 19.4.0

(*)    releases-upstream-048cf9424790cc525a7ea6da340820aae226f3b9-64bit     INSTALLED
(*)    releases-upstream-3b8cff670e9233a6623563add831647e8689a86b-64bit     INSTALLED
(*)    node-12.9.1-64bit            INSTALLED

These issues may be related:

Please let me know if there is anything else I can help with.

@bvibber
Copy link
Collaborator

bvibber commented May 21, 2020

I haven't been able to reproduce a minimal test case from your examples. Do you have a cut-down test case that compiles and reproduces the problem? Or alternately can you provide the .js+.wasm output built with -O2 --profiling?

Disassembling the affected function with wasm-dis might give some insight into what's going wrong. My first thought was that something was going on where it's trying to use -6 as the base address with a positive offset on the actual memory opcode, but as far as I know that should still work...

@VladSerhiienko
Copy link
Author

VladSerhiienko commented May 21, 2020

@Brion Yup! I've prepared a branch for you, and that's how to reproduce the bug:

brew install git-lfs
git clone -b vserhiienko/reproduce-error-emscripten-11208 [email protected]:VladSerhiienko/GifTools.git
cd GifTools
npm install -y
git lfs install
git lfs pull
git submodule update --init --recursive
sh emsdk-reinstall.sh
sh giftools-build-for-web-single-file-with-local-emsdk.sh
npm run build
npm run test

The error will look like this:

RuntimeError: memory access out of bounds
    at get_cabac_noinline (wasm-function[384]:237)
    at decode_cabac_mb_chroma_pre_mode (wasm-function[388]:193)
    at ff_h264_decode_mb_cabac (wasm-function[379]:4402)
    at decode_slice (wasm-function[609]:1019)
    at ff_h264_execute_decode_slices (wasm-function[600]:291)
    at decode_nal_units (wasm-function[672]:1385)
    at h264_decode_frame (wasm-function[669]:477)
    at decode_simple_internal (wasm-function[351]:361)
    at decode_simple_receive_frame (wasm-function[320]:62)
    at decode_receive_frame_internal (wasm-function[319]:152)
    at avcodec_send_packet (wasm-function[318]:325)
    at try_decode_frame (wasm-function[2557]:821)
    at avformat_find_stream_info (wasm-function[2552]:5250)
    at giftools::ffmpegVideoStreamOpen(giftools::FFmpegInputStream const*) (wasm-function[80]:726)
    at ffmpegVideoStreamOpen(int) (wasm-function[242]:64)
    at emscripten::internal::Invoker<int, int>::invoke(int (*)(int), int) (wasm-function[255]:5)
    at dynCall_iii (wasm-function[4661]:7)
    at Object.<anonymous>.Module.dynCall_iii (/Users/vserhiienko/Projects/GifTools/bin/web_amalgamated_ffmpeg/GifTools.js:4161:79)
    at dynCall_iii_79 (eval at makeDynCaller (/Users/vserhiienko/Projects/GifTools/bin/web_amalgamated_ffmpeg/GifTools.js:2466:19), <anonymous>:4:12)
    at Object.ffmpegVideoStreamOpen (eval at new_ (/Users/vserhiienko/Projects/GifTools/bin/web_amalgamated_ffmpeg/GifTools.js:2270:27), <anonymous>:9:10)
    at GifTools.Object.<anonymous>.GifTools.videoDecoderOpenVideoStream (/Users/vserhiienko/Projects/GifTools/src/index.ts:175:49)
    at /Users/vserhiienko/Projects/GifTools/tests/composeGif.test.ts:26:37
    at Array.forEach (<anonymous>)
    at /Users/vserhiienko/Projects/GifTools/tests/composeGif.test.ts:21:26

To check the fix you should open FFmpeg/libavcodec/cabac_functions.h file, comment line 124 and uncomment line 127 and re-build/re-test:

sh giftools-build-for-web-single-file-with-local-emsdk.sh
npm run build
npm run test

You should see how decoded video frames are stored as PNGs in tests/bin/results_actual dir.

Also, I've included all the demangling/profiling/tracing into the build, but in case you need something extra, feel free to edit the CMake file.

Please let me know in case of any progress.

@bvibber
Copy link
Collaborator

bvibber commented May 21, 2020

Perfect! I'll take a look and see if I can find anything clear.

@VladSerhiienko
Copy link
Author

@Brion Yeah, I assume it either checks whether the offset is negative and throws or reinterprets the value as unsigned one, goes too far and throws.

@VladSerhiienko
Copy link
Author

@Brion Thanks!

@bvibber
Copy link
Collaborator

bvibber commented May 21, 2020

Ok, here's the failing bit in the disassembly at -O2:

   (i32.load8_u offset=14440
    (i32.load offset=20
     (local.get $2)
    )
   )

The inner i32.load is fetching s from the stack; the outer is using s as a pointer to load from, with the offset on the instruction being the hardcoded memory address plus 128.

Memory loads fail (at least in node) when the base value is less than 0, which can be easily confirmed in a standalone Wasm module looking like this:

(module
    (export "setter" $setter)
    (export "positive_pointer" $positive_pointer)
    (export "negative_pointer" $negative_pointer)

    (memory 1)

    (func $setter (param i32)
        (i32.store8
            (i32.const 1)
            (local.get $0)
        )
    )

    (func $negative_pointer (result i32)
        (return
            (i32.load8_s offset=2
                (i32.const -1)
            )
        )
    )

    (func $positive_pointer (result i32)
        (return
            (i32.load8_s offset=1
                (i32.const 0)
            )
        )
    )
)
let fs = require('fs');

let buf = fs.readFileSync('small.wasm');

WebAssembly.instantiate(buf)
.then(({module, instance}) => {

    instance.exports.setter(42);

    console.log('This will work:');
    console.log(instance.exports.positive_pointer());

    console.log('This will fail:');
    console.log(instance.exports.negative_pointer());
});

I am not certain if this is correct behavior for the Wasm virtual machine; the spec says:

The static address offset is added to the dynamic address operand, yielding a 33 bit effective address that is the zero-based index at which the memory is accessed. All values are read and written in little endian byte order. A trap results if any of the accessed memory bytes lies outside the address range implied by the memory’s current size.

Since all bytes of the effective address are in range, I don't think it should throw, and the code generation seems to assume it's safe to do this technique.

Can anyone more familiar with the spec give some insight as to whether this is expected behavior per the spec (in which case codegen may need to change) or if it's a problem with the V8 implementation? I'm not familiar enough with the VM implementation to know if the way offsets are mapped to low-level instructions is causing this behavior as an unintended side effect or anything...

@kripken
Copy link
Member

kripken commented May 21, 2020

The inner i32.load is fetching s from the stack; the outer is using s as a pointer to load from, with the offset on the instruction being the hardcoded memory address plus 128.

Is s equal to -6 here as mentioned earlier?

Then this should trap, as that -6 as an unsigned value is almost at the 4GB point (and the end of valid 32-bit memory), and wasm semantics say to add the offset without wrapping, in effect. So the final address is over 4GB, and not valid.

Next I'd check if this is a bug in llvm or binaryen. If you compile with -O2 but link with -O0, it won't run the binaryen optimizer.

@bvibber
Copy link
Collaborator

bvibber commented May 21, 2020

@kripken ahhhhhhhhhh, it's interpreted as an unsigned value with no wrapping. Ok that explains why it fails. Yeah definitely a codegen problem, will see if can narrow it down a little.

@bvibber
Copy link
Collaborator

bvibber commented May 21, 2020

@kripken ok with building with -O2 and linking at -O0 I see this:

  (local.set $57
   (i32.load8_u offset=14440
    (local.get $56)
   )
  )

which does the same thing and fails the same way when run. So I suspect something in the LLVM side is assuming that pointers and load/store offsets wrap when they add together, and needs to be fixed so it doesn't make incorrect optimizations.

I can attempt to narrow down a minimal C/C++ test case if that'll help in identifying the problem. :)

@ghost
Copy link

ghost commented May 22, 2020

Could this be related to issues I've been experiencing in the last day or so? Sorry to be vague, but there have been a few issues submitted recently and faulty code generation could be responsible for any number of things.

EDIT: Nope. My Emscripten is a year out of date. Still watching this thread out of curiosity.

@kripken
Copy link
Member

kripken commented Jun 3, 2020

Sorry for the late response @Brion - I missed your comment earlier.

This doesn't sound like a known issue. Load/store offsets should be heavily tested, and we don't have any other bug reports that sound similar. So this is worrying - getting a reduced testcase would be very helpful! (I suspect you may need to start from the failing code and reduce from there, as I'm guessing writing a new testcase from scratch won't be easy, as nothing else has hit this bug in our testing and fuzzing.)

@stale
Copy link

stale bot commented Jun 4, 2021

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants