Compile WASM release also only for generic CPU capabilities (without SIMD). #418

Open
polkovnikov opened this issue Jun 4, 2022 · 7 comments

Comments

@polkovnikov

polkovnikov commented Jun 4, 2022

Feature request. I think what the title describes would be very useful, because when I open your worker in a browser (like this page) it says:
Uncaught (in promise) RuntimeError: abort(CompileError: WebAssembly.instantiate(): Wasm SIMD unsupported @+1087). Build with -s ASSERTIONS=1 for more info. at abort (bergamot-translator-worker.js:651:10) at bergamot-translator-worker.js:724:4.

It means my CPU is too old and doesn't support the required CPU capabilities, maybe AVX1 or something. My CPU has SSSE3 at most (it has no SSE4...).

I think it would be beneficial (even if it is slow) to compile for older CPUs, targeting only SSE2 and not relying on anything beyond it, because any x64 CPU is guaranteed to have SSE2.

Maybe even compile without any SIMD at all (without SSE/SSE2 too), using only generic CPU instructions.
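
To make the distinction between an SSE2-only build and a build without any SIMD concrete, here is a minimal sketch of how C++ code can tell at compile time which feature set it is being built for. It assumes GCC or clang (including Emscripten's clang), which predefine these macros depending on flags such as `-msimd128`, `-mssse3` or `-msse2`. The function name `compiled_simd_level` is hypothetical and not part of this project.

```cpp
#include <string>

// Reports which SIMD feature set this translation unit was compiled for.
// The macros below are predefined by GCC/clang according to the compiler
// flags in use; the function name itself is made up for illustration.
inline std::string compiled_simd_level() {
#if defined(__wasm_simd128__)
  return "wasm-simd128";  // Emscripten/clang build with -msimd128
#elif defined(__SSSE3__)
  return "ssse3";         // native x86 build with SSSE3 enabled
#elif defined(__SSE2__)
  return "sse2";          // native x86 build limited to SSE2 (x86-64 baseline)
#else
  return "generic";       // no SIMD feature macros detected
#endif
}
```

A wasm build compiled without `-msimd128` would fall through to the last branch, which is the kind of build this request asks for.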

@wazoox

wazoox commented Nov 28, 2022

I concur. Most of my machines are Phenom II or equivalent, and so far I have had zero need to upgrade to better CPUs; they are simply fast enough. Think about the planet: we must keep running old machines as long as possible.

@XapaJIaMnu
Collaborator

Old machines use more electricity to do the same amount of work. The same is true of using the non-SIMD path on newer CPUs. Having a legacy option is always good, but this code path shouldn't be forced on newer machines, which have a much more efficient hardware path.

@wazoox

wazoox commented Nov 28, 2022

We're not asking to force non-optimal code on new machines, we ask for a legacy option for older machines that generally do the job. I can run current OSes and applications on my 8 to 10 year-old machines in a perfectly sufficient way; this particular code is currently the only one that I'm aware of that doesn't work at all, even slowly. If I was proficient in C++ and understood a thing about WASM I'd have a shot at it but unfortunately it's way out of my league :)

This looks like it could help: https://www.libvolk.org/

@wazoox

wazoox commented Nov 28, 2022

It looks like it can be done, according to this LLVM documentation: https://releases.llvm.org/7.1.0/tools/clang/docs/AttributeReference.html#target-gnu-target
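
For reference, the `target` attribute described there can be sketched roughly as follows for a native x86 build with GCC or clang. The function names are hypothetical, and this is not how the project's kernels are written. It also does not carry over to WebAssembly directly: a wasm module that contains SIMD instructions is rejected at instantiation by engines without WASM SIMD (the error quoted in the first comment), so a wasm build would presumably need a separate non-SIMD module rather than runtime dispatch inside one binary.

```cpp
#include <cstddef>
#include <cstdint>

// Baseline version: compiled with whatever the global compiler flags allow.
static std::int32_t dot_generic(const std::int8_t* a, const std::int8_t* b, std::size_t n) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += std::int32_t(a[i]) * std::int32_t(b[i]);
  return sum;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
// Same loop, but the compiler may emit SSSE3 instructions inside this
// one function only, regardless of the global -m flags.
__attribute__((target("ssse3")))
static std::int32_t dot_ssse3(const std::int8_t* a, const std::int8_t* b, std::size_t n) {
  std::int32_t sum = 0;
  for (std::size_t i = 0; i < n; ++i) sum += std::int32_t(a[i]) * std::int32_t(b[i]);
  return sum;
}
#endif

// Picks an implementation at run time; falls back to the generic loop on
// CPUs without SSSE3 and on non-x86 targets.
std::int32_t dot_int8(const std::int8_t* a, const std::int8_t* b, std::size_t n) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  if (__builtin_cpu_supports("ssse3")) return dot_ssse3(a, b, n);
#endif
  return dot_generic(a, b, n);
}
```

Both versions compute the same result; only the instructions the compiler is allowed to use differ.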

@XapaJIaMnu
Collaborator

You can use translateLocally, which has generic x86 builds; just download any of the compat releases: https://translatelocally.com/ What exactly do you need? Do you need a Python build compiled for a generic x86 architecture, or the wasm module itself?

@graemenail @jelmervdl for translateLocally, we do this by providing -DBUILD_ARCH=x86-64, but afaik wasm is x86? Do we use the x86 as a target?

@jelmervdl
Member

I think intgemm needs at least SSE2 instructions to even compile, and to enable SSE2 instructions in emscripten you need to compile the wasm binary with WASM SIMD instructions. And that's the problem: on older hardware Chrome and Firefox do not support WASM SIMD instructions at all, and there's no supported subset of WASM SIMD for SSE2 [1].

For us to be able to compile the wasm version without WASM SIMD instructions, we'd need to add kernel implementations to intgemm that don't rely on SSE2.
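
As a rough idea of what a kernel that doesn't rely on SSE2 could look like, here is a minimal scalar sketch of an 8-bit matrix multiply with 32-bit accumulation. It is not intgemm's API or data layout: the name `multiply_int8_generic`, the row-major layout and the `std::vector` interface are all assumptions made for illustration, and a real kernel would also have to handle quantization and output scaling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes C = A * B using only plain integer arithmetic (no SIMD intrinsics).
// A is rows_a x inner, B is inner x cols_b, C is rows_a x cols_b, all row-major.
// Hypothetical helper for illustration; not part of intgemm.
std::vector<std::int32_t> multiply_int8_generic(const std::vector<std::int8_t>& a,
                                                const std::vector<std::int8_t>& b,
                                                std::size_t rows_a,
                                                std::size_t inner,
                                                std::size_t cols_b) {
  std::vector<std::int32_t> c(rows_a * cols_b, 0);
  for (std::size_t i = 0; i < rows_a; ++i) {
    for (std::size_t k = 0; k < inner; ++k) {
      // Widen to 32 bits before multiplying so products cannot overflow.
      const std::int32_t a_ik = a[i * inner + k];
      for (std::size_t j = 0; j < cols_b; ++j) {
        c[i * cols_b + j] += a_ik * std::int32_t(b[k * cols_b + j]);
      }
    }
  }
  return c;
}
```

A loop like this compiles for any target, including wasm without SIMD, at the cost of being much slower than the vectorized kernels.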

Footnotes

  1. WASM (and WASM SIMD) don't map directly to x86 instructions. See https://emscripten.org/docs/porting/simd.html for an (inverse) list of how WASM instructions map to native ones.

@XapaJIaMnu
Collaborator

A slight correction: we need SSSE3, as SSE2 doesn't have 8-bit instructions... I was wondering if wasm could do some unrolling emulation like macOS does, but I guess it doesn't.

The other possibility is to use the ONNX GEMM compilation path, which would be really, really slow...
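
To illustrate what emulating such an 8-bit instruction in plain code would involve, here is a scalar sketch of one output lane. It assumes the instruction in question is SSSE3's `pmaddubsw` (`_mm_maddubs_epi16`), which the thread does not name explicitly; the helper name `maddubs_pair` is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Scalar equivalent of one 16-bit output lane of pmaddubsw (assumed to be
// the instruction meant above): multiply two unsigned 8-bit values by two
// signed 8-bit values, add the two products, and saturate to signed 16 bits.
inline std::int16_t maddubs_pair(std::uint8_t a0, std::int8_t b0,
                                 std::uint8_t a1, std::int8_t b1) {
  const std::int32_t sum = std::int32_t(a0) * std::int32_t(b0) +
                           std::int32_t(a1) * std::int32_t(b1);
  const std::int32_t clamped =
      std::max<std::int32_t>(-32768, std::min<std::int32_t>(32767, sum));
  return static_cast<std::int16_t>(clamped);
}
```

The hardware instruction produces eight such lanes at once from 16-byte inputs, which gives a sense of how much slower a purely scalar fallback would be.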
