bergamot-translator enables client-side machine translation on consumer-grade machines. Developed as part of the Bergamot project, the library builds on top of:
- Marian: Neural Machine Translation (NMT) library. This repository uses the fork browsermt/marian-dev, which optimizes for faster inference on Intel CPUs and adds WebAssembly support.
- student models: Compressed neural models that enable translation on consumer-grade devices.
bergamot-translator wraps marian to add sentence splitting, on-the-fly batching, HTML markup translation, and an API better suited to developing applications. The functionality is continuously tested on Windows, macOS and Linux on x86_64, and on the cross-platform WebAssembly target. Native aarch64 support is available for Android and Mac M1 (early stages).
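To make sentence splitting and on-the-fly batching concrete, here is a minimal Python sketch of the general idea. It is not bergamot-translator's implementation: the function names `split_sentences` and `make_batches`, the regex-based splitter, the whitespace word count, and the `max_tokens` budget are all assumptions made for illustration.

```python
import re
from typing import Iterator, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (illustrative only): break after '.', '!' or '?'
    when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def make_batches(sentences: List[str], max_tokens: int) -> Iterator[List[str]]:
    """Group sentences of similar length into batches.

    The cost of a batch is approximated as
    (longest sentence in words) * (number of sentences),
    which mimics the padding a real batch would need.
    """
    # Sorting by length keeps padding low; the newest sentence
    # in a batch is therefore always the longest one.
    ordered = sorted(sentences, key=lambda s: len(s.split()))
    batch: List[str] = []
    for sentence in ordered:
        n_words = len(sentence.split())
        if batch and n_words * (len(batch) + 1) > max_tokens:
            yield batch
            batch = []
        batch.append(sentence)
    if batch:
        yield batch
```

Sorting by length before grouping is a common way to reduce wasted padding; a sentence longer than the budget still gets a batch of its own rather than being dropped.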
bergamot-translator uses the CMake build system. Use the library target bergamot-translator in projects that intend to build applications on top of the library. The latest developer documentation is available at browser.mt/docs/main.
We provide bindings for Python, and for JavaScript through WebAssembly.
This repository provides a Python module that also comes with a command-line interface for using available models. The module can currently be installed from PyPI on Linux and macOS.
python3 -m pip install bergamot
A quick-start example is available on Colab.
For more comprehensive documentation on using the Python module as a library, see browser.mt/docs/main/python.html.
WebAssembly and JavaScript support is developed for an offline-translation browser extension intended for use in the Mozilla Firefox web browser. Emscripten is used to compile C/C++ sources to WebAssembly. You may use the pre-built bergamot-translator-worker.js and bergamot-translator-worker.wasm available from releases.
WebAssembly is available in Firefox and Google Chrome. It is also possible to use the components through Node.js. For an example of how to use them, see the Hello World example. For a complete demo that works locally in a modern browser, see mozilla.github.io/translate.
WebAssembly is slower than a native build because it lacks optimized matrix-multiply primitives. Nightly builds of Mozilla Firefox have faster GEMM (General Matrix Multiplication) capabilities and are expected to be slightly faster. The browser environment can also use Native Messaging as a third option to translate web pages locally, which is the fastest at the moment.
The chart created from jelmervdl/firefox-translations/pull#19 shows how the methods compare in words per second (wps).
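A words-per-second figure of this kind can be measured with a small harness such as the sketch below. This is an assumption about how such a number could be computed, not the benchmark behind the chart: `words_per_second` and its `translate` callable are hypothetical, and words are counted on the source side by splitting on whitespace.

```python
import time
from typing import Callable, List


def words_per_second(
    translate: Callable[[str], str],
    texts: List[str],
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return source words translated per second of wall-clock time.

    `translate` is any callable that maps a source string to a
    translation (hypothetical; plug in the backend being measured).
    """
    total_words = sum(len(text.split()) for text in texts)
    start = clock()
    for text in texts:
        translate(text)
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return total_words / elapsed
```

Passing the clock as a parameter keeps the function easy to test; for a fair comparison between backends, the same texts and a warm-up run would normally be used.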
For a cross-platform, batteries-included GUI application that builds on top of bergamot-translator, check out translateLocally. translateLocally downloads models from a repository and curates them.
Mozilla, as part of the Bergamot project, builds and maintains firefox-translations. The official Firefox extension uses WebAssembly.
See jelmervdl/firefox-translations for a Chrome extension (Manifest V2) which, in addition to WebAssembly, supports faster local translation via Native Messaging, supported by translateLocally.
We appreciate all contributions. There are several ways to contribute to this project.
- Code: Improvements to the source are always welcome. If you are planning to contribute bug fixes to this repository, please go ahead without any further discussion. If you plan to contribute new features, utility functions, or extensions to the core, please discuss the feature with us first.
- Models: Bergamot, being a wrapper around marian, should work comfortably with models trained using marian. We prefer models that are trained following the recipe in browsermt/students, so that they are smaller in size and enable fast inference on consumer-grade machines.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 825303.
The build generates a library that can be integrated into any project. All the public header files are in the src folder.
A short example of how to use the API is provided in the file app/bergamot.cpp.