A built-in wasm 8-bit matrix multiply primitive #205
Comments
Hello, just wondering whether some work I'm doing at arm-playground relates to or overlaps with the todos here. Could you let me know if the links mentioned below are related?
It is possible to compile intgemm into wasm, yes, but it will not be backed by AVX instructions, so it will not reach the desired performance. The ARM solution is not expected to be compiled into wasm via emscripten, since there would be no performance win. The ARM64 solution will be used as part of MozIntGemm (note that it is native code, so it can use AVX or Neon directly).
WASM -> JS -> WASM is temporary, and "Package this portable implementation with bergamot binary to serve as a fallback solution" looks like the item that removes it. The "fallback" wasm module implements the functionality of MozIntGemm, but in wasm (by comparison, MozIntGemm will use AVX instructions, which are not available in wasm); emscripten will be used here, I guess.
@jerinphilip Your short-term goal is to provide an implementation of Abhishek's C API using ruy so that it can be integrated into Gecko (part of Firefox) for ARM support. This native code will then be exposed as an intrinsic MozIntGemm that Marian perceives as a C function. Please try to avoid being a bull in a china shop. There is no requirement for a SIMD-free implementation. My understanding is that Firefox supports WebAssembly on ARM and x86, and both of these have SIMD. On an arch without WebAssembly we're sunk anyway. On browsers without the MozIntGemm intrinsic, Mozilla's proposal is:
There is a problem, though: the library will also have to not expose symbols (to avoid conflicting with the residual intgemm in Marian, since we've only abstracted the parts used in Bergamot models), it will take up more space in the artifact, and I don't know if there's some overhead for the linked function. Therefore I am proposing we just make the separate WASM library a detectable dummy implementation that Marian can avoid calling and handle internally. We already have this path for native builds; they don't jump through the C API.
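To make the "detectable dummy" idea concrete, here is a minimal sketch of how a caller might probe the imported implementation and fall back to its own internal path. Every name in it (`GemmImports`, `is_native`, `dummy_imports`, `internal_multiply`, `multiply_dispatch`) is hypothetical and chosen only for illustration; none of it is the real MozIntGemm, intgemm, or Marian API, and the internal path is a plain scalar loop standing in for whatever Marian actually does.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical shape of the imported interface; names are illustrative and
// are not taken from the real MozIntGemm or intgemm APIs.
struct GemmImports {
  bool (*is_native)();  // probe: true only for a real accelerated implementation
  void (*multiply)(const std::int8_t* a, const std::int8_t* b_t, std::int32_t* c,
                   std::size_t rows_a, std::size_t width, std::size_t cols_b);
};

// Dummy fallback module: identifies itself as not native and does no work.
inline bool dummy_is_native() { return false; }
inline void dummy_multiply(const std::int8_t*, const std::int8_t*, std::int32_t*,
                           std::size_t, std::size_t, std::size_t) {}
inline GemmImports dummy_imports() { return {dummy_is_native, dummy_multiply}; }

// Stand-in for the caller's existing internal path (assumed here to be a
// simple scalar loop). b_t holds B transposed: cols_b x width, row-major.
inline void internal_multiply(const std::int8_t* a, const std::int8_t* b_t,
                              std::int32_t* c, std::size_t rows_a,
                              std::size_t width, std::size_t cols_b) {
  for (std::size_t r = 0; r < rows_a; ++r) {
    for (std::size_t col = 0; col < cols_b; ++col) {
      std::int32_t acc = 0;
      for (std::size_t k = 0; k < width; ++k) {
        acc += static_cast<std::int32_t>(a[r * width + k]) *
               static_cast<std::int32_t>(b_t[col * width + k]);
      }
      c[r * cols_b + col] = acc;
    }
  }
}

// Uses the imported multiply only when the probe reports a real implementation;
// otherwise computes internally. Returns true if the imported path was used.
inline bool multiply_dispatch(const GemmImports& imports, const std::int8_t* a,
                              const std::int8_t* b_t, std::int32_t* c,
                              std::size_t rows_a, std::size_t width,
                              std::size_t cols_b) {
  if (imports.is_native != nullptr && imports.is_native()) {
    imports.multiply(a, b_t, c, rows_a, width, cols_b);
    return true;
  }
  internal_multiply(a, b_t, c, rows_a, width, cols_b);
  return false;
}
```

In this sketch the probe would normally be checked once at startup rather than on every call. The point is only that the dummy module stays tiny and exports no real kernel, so it does not need to duplicate the intgemm symbols.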
Updated the issue to reflect the current status of the tasks.
A summary of all the tasks that need to be done in this repository (and its submodules) to import a matrix multiply function (based on a 4x8-bit-to-32-bit dot product primitive) from a separate wasm module.
JS code will be able to instantiate a separate wasm module that exports a matrix multiply function, and Bergamot code can then link against that instance to access that function. (Some context here: https://github.com/mozilla-extensions/firefox-translations/issues/75, corresponding bugzilla issue)
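For readers unfamiliar with the primitive, the following is a rough scalar sketch of how a matrix multiply can be built on a 4x8-bit-to-32-bit dot product. It is not the intgemm or MozIntGemm code. The names `dot4_i8` and `matmul_i8` are made up for illustration, both operands are assumed to be signed 8-bit (a real primitive may mix unsigned and signed inputs), B is assumed to be supplied transposed, and the shared dimension is assumed to be a multiple of 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar stand-in for a 4x8-bit-to-32-bit dot product primitive (hypothetical
// name): multiplies four int8 pairs and adds the products to a 32-bit accumulator.
inline std::int32_t dot4_i8(const std::int8_t* a, const std::int8_t* b,
                            std::int32_t acc) {
  for (int i = 0; i < 4; ++i) {
    acc += static_cast<std::int32_t>(a[i]) * static_cast<std::int32_t>(b[i]);
  }
  return acc;
}

// Computes C = A * B, where A is rows_a x width (row-major) and B is supplied
// transposed as b_t (cols_b x width, row-major), so both operands are read
// contiguously along the shared dimension. Assumes width is a multiple of 4.
inline std::vector<std::int32_t> matmul_i8(const std::vector<std::int8_t>& a,
                                           const std::vector<std::int8_t>& b_t,
                                           std::size_t rows_a, std::size_t width,
                                           std::size_t cols_b) {
  std::vector<std::int32_t> c(rows_a * cols_b, 0);
  for (std::size_t r = 0; r < rows_a; ++r) {
    for (std::size_t col = 0; col < cols_b; ++col) {
      std::int32_t acc = 0;
      // Walk the shared dimension four elements at a time.
      for (std::size_t k = 0; k < width; k += 4) {
        acc = dot4_i8(&a[r * width + k], &b_t[col * width + k], acc);
      }
      c[r * cols_b + col] = acc;
    }
  }
  return c;
}
```

For example, with `a = {1, 2, 3, 4}` and `b_t` holding the rows `{1, 1, 1, 1}` and `{-1, -2, -3, -4}`, `matmul_i8(a, b_t, 1, 4, 2)` yields `{10, -30}`. A native or SIMD implementation would replace `dot4_i8` with the hardware instruction while keeping the same accumulation structure.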
Tasks Stage-1:
Tasks Stage-2:
Tasks Stage-3:
cc @yurydelendik @andrenatal @lonnen @kpu @eqrion @lars-t-hansen @julian-seward1
Please let me know if I missed anything. I have created a separate issue for one of the tasks so that the specific discussion can happen there. The same can be done for the other tasks as we go ahead.
P.S.: Stage-2 and Stage-3 can be ordered differently. Stage-1 has to be executed first.