Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port marian to WASM for inference #24

Merged
merged 46 commits into from
Mar 24, 2021
Merged

Port marian to WASM for inference #24

merged 46 commits into from
Mar 24, 2021

Conversation

abhi-agg
Copy link

@abhi-agg abhi-agg commented Mar 4, 2021

Description

This PR enables capability to build and use marian as a WASM module for doing inference.
It is related to issues: #23

List of changes:

  • Enable building only decoder (only for inference)
  • Added onnxjs submodule (to provide a wasm compatible SGEMM routine)
  • Updates intgemm submodule to use fast Integer 8-bit inference + simd shuffle pattern on WASM
  • Updates SentencePiece submodule (to build stuff that is required just for inference)
  • Build scripts (local machine and on Docker)
  • Github workflow for WASM builds

Added dependencies: onnxjs submodule

How to test

To run and test, follow the README

Checklist

  • I have tested the code manually
  • I have run regression tests
  • I have read and followed CONTRIBUTING.md
  • I have updated CHANGELOG.md

@abhi-agg abhi-agg requested a review from kpu March 4, 2021 10:32
@abhi-agg abhi-agg linked an issue Mar 4, 2021 that may be closed by this pull request
12 tasks
Copy link
Member

@ugermann ugermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before submitting pull requests please make sure that

  • all changes are properly documented in CHANGELOG.md
  • all regression tests pass
  • all files with changes have clutter / commented-out code removed (or contain a convincing explanation why the commented-out code should be kept around

Without at least these three conditions having been met, chances of approval are very slim.

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
docker/Makefile Outdated Show resolved Hide resolved
docker/Makefile Outdated Show resolved Hide resolved
docker/Makefile Outdated Show resolved Hide resolved
src/CMakeLists.txt Outdated Show resolved Hide resolved
src/CMakeLists.txt Outdated Show resolved Hide resolved
src/common/cli_helper.cpp Outdated Show resolved Hide resolved
src/common/file_stream.cpp Outdated Show resolved Hide resolved
src/tensors/cpu/prod.cpp Outdated Show resolved Hide resolved
@XapaJIaMnu
Copy link
Collaborator

XapaJIaMnu commented Mar 9, 2021 via email

Copy link
Collaborator

@XapaJIaMnu XapaJIaMnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left 2 comments.
Could you please answer all @ugermann's concerns about the disabled exceptions before merging.

@ugermann
Copy link
Member

Could you please answer all @ugermann's concerns about the disabled exceptions before merging.

Also, all regression tests should pass before a merge into master, and all changes need to be documented in CHANGELOG.md. Otherwise, there's no chance in hell we'll ever get this past the upstream code masters for an upstream merge.

@abhi-agg abhi-agg force-pushed the wasm-merge-main branch 2 times, most recently from dab5435 to 5486ba4 Compare March 18, 2021 14:56
@abhi-agg
Copy link
Author

@ugermann @kpu @XapaJIaMnu I have made lot of improvements, answered most of your concerns in the comments and added commits to resolve them. I just need to run regression tests now. It would be great if you can review again. It looks in much better shape now 👍

@abhi-agg abhi-agg requested a review from ugermann March 18, 2021 15:43
abhi-agg and others added 15 commits March 22, 2021 13:39
 - unistd.h should be included for unix system as well

 - Otherwise, the compilation fails on WASM
 - Disabled code (by compile definition DECODER_ONLY) that is not
   required for inference and doesn't compile on wasm

 - Introduced cmake option "COMPILE_DECODER_ONLY" to build
   only marian decoder
  - CMAKE modifications
   -- onnxjs is added as a target to be linked if
      a wasm compatible sgemm routine is to be used
   -- It's compilation is protected by cmake option
      USE_WASM_COMPATIBLE_BLAS (by default off)

  - source file modifications to call appropriate sgemm
    routine of onnxjs
 - It compiles successfully as wasm target using emscripten toolchain
 - It contains changes to use wasm wormhole
 - wasm compilation can be enabled by cmake option COMPILE_WASM
 - cmake changes
 - source changes
…ian-nmt#779)

* copy changes from commit 4df92f2
* add comments for better understanding
* restore the newline at the end of file and add this changes in changelog.md
 - COMPILE_WITH_PTHREADS cmake option to enable/disable compiling
   the multi-threaded version of marian
 - It is deprecated begining C++17
 - It avoids building unwanted targets
 - Compile only static library required for inference
@abhi-agg
Copy link
Author

I have rebased this PR on master and ran the regression tests on:

  1. Latest master (commit)
  2. On this PR

I am getting the exact same status for both of them which is following:

Failed:
  - tests/examples/unit-tests/test_unit_tests.sh
  - tests/scorer/scores/test_scores_cpu.sh  

1st test is failing because I have CUDA-11 and cusparseSgemmi is not available in CUDA VERSION >= 11 (see the logs below). I believe this will pass for CUDA-10.

2nd test failure is already known.

For the sake of completeness, I am sharing the failure logs for the 1st test case 👍

Failure logs for test_unit_tests.sh
+ set -e
+ cd /data/rw/pit/src/build-original
+ env CTEST_OUTPUT_ON_FAILURE=1 ctest --force-new-ctest-process
Test project /data/rw/pit/src/build-original
   Start 1: graph_tests
1/6 Test #1: graph_tests ......................   Passed    0.36 sec
   Start 2: operator_tests
2/6 Test #2: operator_tests ...................***Exception: Child aborted  0.99 sec
[2021-03-22 17:28:32] Error: cusparseSgemmi is not available in CUDA VERSION >= 11.
[2021-03-22 17:28:32] Error: Aborted from cusparseStatus_t marian::gpu::cusparseSgemmiEx(cusparseHandle_t, int, int, int, int, const float*, const float*, int, const float*, const int*, const int*, const float*, float*, int) in /data/rw/pit/src/src/tensors/gpu/prod.cpp:367

[CALL STACK]
[0x55cb46eed9c5]    marian::gpu::  CSRProd  (IntrusivePtr<marian::TensorBase>,  std::shared_ptr<marian::Allocator>,  IntrusivePtr<marian::TensorBase> const&,  IntrusivePtr<marian::TensorBase> const&,  IntrusivePtr<marian::TensorBase> const&,  IntrusivePtr<marian::TensorBase> const&,  bool,  bool,  float) + 0x1d65
[0x55cb4507057d]                                                       + 0x40d57d
[0x55cb450a1fa7]    std::_Function_handler<void (),marian::CSRDotNodeOp::forwardOps()::{lambda()#1}>::  _M_invoke  (std::_Any_data const&) + 0x207
[0x55cb4514f6d1]    marian::Node::  forward  ()                        + 0x211
[0x55cb45062d3b]    marian::ExpressionGraph::  forward  (std::__cxx11::list<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase>>>,std::allocator<IntrusivePtr<marian::Chainable<IntrusivePtr<marian::TensorBase>>>>>&,  bool) + 0x22b
[0x55cb450647bc]    marian::ExpressionGraph::  forwardNext  ()         + 0x23c
[0x55cb44f9a197]    void  tests  <float>(marian::DeviceType,  marian::Type) + 0xb5b7
[0x55cb44e9db8f]    Catch::RunContext::  invokeActiveTestCase  ()      + 0x2f
[0x55cb44ead18d]    Catch::RunContext::  runCurrentTest  (std::__cxx11::basic_string<char,std::char_traits<char>,std::allocator<char>>&,  std::__cxx11::basic_string<char,std::char_traits<char>,std::allocator<char>>&) + 0x47d
[0x55cb44ec2030]    Catch::RunContext::  runTest  (Catch::TestCase const&) + 0x1a0
[0x55cb44ec8b5c]    Catch::Session::  runInternal  ()                  + 0xa3c
[0x55cb44ec9ada]    Catch::Session::  run  ()                          + 0x1a
[0x55cb44e69186]    main                                               + 0x86
[0x7f13b6a18b97]    __libc_start_main                                  + 0xe7
[0x55cb44e9539a]    _start                                             + 0x2a


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
run_operator_tests is a Catch v2.10.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
Expression graph supports basic math operations (gpu)
 csr-dot product
-------------------------------------------------------------------------------
/data/rw/pit/src/src/tests/units/operator_tests.cpp:456
...............................................................................

/data/rw/pit/src/src/tests/units/operator_tests.cpp:456: FAILED:
 {Unknown expression after the reported line}
due to a fatal error condition:
 SIGABRT - Abort (abnormal termination) signal

===============================================================================
test cases:  1 |  0 passed | 1 failed
assertions: 81 | 80 passed | 1 failed


   Start 3: rnn_tests
3/6 Test #3: rnn_tests ........................   Passed    0.79 sec
   Start 4: attention_tests
4/6 Test #4: attention_tests ..................   Passed    0.69 sec
   Start 5: fastopt_tests
5/6 Test #5: fastopt_tests ....................   Passed    0.02 sec
   Start 6: utils_tests
6/6 Test #6: utils_tests ......................   Passed    0.02 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) =   3.03 sec

The following tests FAILED:
         2 - operator_tests (Child aborted)
Errors while running CTest

TL;DR: This PR doesn't introduce any additional regression test failures to current master.

@ugermann @kpu Please review the PR as all the changes that you requested are in. I would like to merge it asap.

@jerinphilip
Copy link

jerinphilip commented Mar 23, 2021

TL;DR: This PR doesn't introduce any additional regression test failures to current master.

It does not strictly suffice here. Plenty of (CMake) switches in this branch has previously led to failure in linkages (quantizer.cpp for example), from bergamot-translator switches. Someone trying to build shouldn't be able to mess with switches and get a non-working build. Normally Makefiles for separate architectures (Makefile, Makefile.mingw) etc are supposed to be loaded or unloaded as a whole, but this PR makes it goes through a lot many switches making it painful to review. I would personally be scared to branch off browsermt/marian-dev for development if this gets merged in (due to the sheer complexity it induces on the build files which will be tricky to edit after).

Your use case is only target WASM architecture (which you're testing and reporting), but the build logic change seems to generalize many other combinations - decoder only, no exceptions, etc etc.

I would like to merge it asap.

For all complaints of big PR elsewhere with even tinier set of changes, this is quite interesting.

CMakeLists.txt Outdated Show resolved Hide resolved
src/layers/generic.cpp Outdated Show resolved Hide resolved
 - Dockerfile cleanup
 - README cleanup
 - benchmark script cleanup
 - COMPILE_WASM encapsulates all cmake options specific to wasm builds
@abhi-agg
Copy link
Author

abhi-agg commented Mar 23, 2021

TL;DR: This PR doesn't introduce any additional regression test failures to current master.

It does not strictly suffice here. Plenty of (CMake) switches in this branch has previously led to failure in linkages (quantizer.cpp for example), from bergamot-translator switches. Someone trying to build shouldn't be able to mess with switches and get a non-working build. Normally Makefiles for separate architectures (Makefile, Makefile.mingw) etc are supposed to be loaded or unloaded as a whole, but this PR makes it goes through a lot many switches making it painful to review. I would personally be scared to branch off browsermt/marian-dev for development if this gets merged in (due to the sheer complexity it induces on the build files which will be tricky to edit after).

Your use case is only target WASM architecture (which you're testing and reporting), but the build logic change seems to generalize many other combinations - decoder only, no exceptions, etc etc.

@jerinphilip The recent commits in this PR addresses your concerns 👍

For all complaints of big PR elsewhere with even tinier set of changes, this is quite interesting.

The changes in this PR has been a result of the wasm exploration that started 5 months ago. I had no option but to create a single PR as the technical debt is too much. Additionally, this PR doesn't add implementation in source files (as happened "elsewhere"). It mostly adds bunch of pre-processor directives (which are easier to review) and some cmake changes.

PS: I don't like this PR myself but I have no option here 👍

@abhi-agg abhi-agg requested a review from kpu March 23, 2021 12:59
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
set(BLAS_VENDOR "ONNX-SGEMM")
add_compile_definitions(COMPILE_CPU BLAS_FOUND WASM_COMPATIBLE_BLAS)
else(USE_WASM_COMPATIBLE_BLAS)
set(EXT_LIBS ${EXT_LIBS} intgemm)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit could have been outside the if statement

src/3rd_party/CMakeLists.txt Outdated Show resolved Hide resolved
@@ -5,7 +5,9 @@
#ifndef LIBCNPY_H_
#define LIBCNPY_H_

#if !defined(WASM)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun fact: now that @XapaJIaMnu has binary working, you can get rid of cnpy entirely.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but for compilation we will still have to keep this ifdef. Right?

@abhi-agg abhi-agg requested a review from kpu March 23, 2021 15:51
@abhi-agg
Copy link
Author

@kpu I made the final changes addressing all your concerns. A quick final review and approval will be much appreciated 👍

@abhi-agg
Copy link
Author

Before submitting pull requests please make sure that

* all changes are properly documented in CHANGELOG.md

* all regression tests pass

* all files with changes have clutter / commented-out code removed (or contain a convincing explanation why the commented-out code should be kept around

Without at least these three conditions having been met, chances of approval are very slim.

@ugermann Could you please approve the PR as I already addressed all your review comments?

@ugermann
Copy link
Member

I'd suggest a squashed merge.

@abhi-agg
Copy link
Author

This PR is also connected to the #5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Port marian to WASM for inference
8 participants