SDK for the development of Automatic Speech Recognition (ASR) and Text-to-Speech Synthesis (TTS) applications
- Hotword Wakeup
- Online speech interaction, which includes
  - ASR
  - Natural Language Understanding
  - Dialogue Management
  - Vertical Search
  - TTS
- Offline ASR
- Online/Offline mixed ASR
- Multi-Keywords Activation
- Online TTS
- Offline TTS
- Online/Offline mixed TTS
- Mandarin Chinese
- American English (Coming soon)
- Cantonese (Coming soon)
The SDK is validated on the following platforms:
- Ubuntu Linux on x86_64
- Linux on ARMv7 (Raspberry Pi)
- Linux on ARMv8 (Raspberry Pi)
File/Directory | Purpose |
---|---|
doc | Contains SDK documentation |
include | Contains the SDK header file (speech_sdk.h) |
lib | Contains the library (libmobvoisdk.so) for different platforms |
.mobvoi | Contains the configurations for the SDK. It is a hidden directory |
samples | Sample code and binaries built based on the SDK |
- The hidden .mobvoi directory contains data the SDK needs at run time, and the SDK also writes to it, so install it in a writable location
- Pass the path of the directory that contains .mobvoi (its parent directory) to mobvoi_sdk_init() in your program
- Create your program according to the SDK documentation and the sample code in samples/src/
- When building your program, link against the libmobvoisdk.so provided in lib/{arch}/
- When running your program, add the directory containing libmobvoisdk.so to the LD_LIBRARY_PATH environment variable
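
Because the SDK writes into .mobvoi, a quick pre-flight check can catch a read-only or misplaced install before mobvoi_sdk_init() is ever called. The following is a minimal sketch; the function name `check_sdk_install` and the example install path are hypothetical and not part of the SDK.

```sh
# check_sdk_install DIR
# Succeeds only if DIR contains a .mobvoi directory that the current user
# can write to. DIR is the same path you would pass to mobvoi_sdk_init().
# (Helper name is hypothetical; it is not provided by the SDK.)
check_sdk_install() {
    if [ ! -d "$1/.mobvoi" ]; then
        echo "missing: $1/.mobvoi" >&2
        return 1
    fi
    if [ ! -w "$1/.mobvoi" ]; then
        echo "not writable: $1/.mobvoi" >&2
        return 1
    fi
    return 0
}

# Example with a hypothetical install location:
#   check_sdk_install "$HOME/mobvoi_sdk" && echo "install looks usable"
```

For the build step, a typical compiler invocation would pass `-Iinclude -Llib/x86_64 -lmobvoisdk`; whether further libraries are required is not stated here, so treat that as a starting point.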
Several sample programs are provided in the samples/ directory:
Program | Purpose |
---|---|
asr | Shows how to do hotword wakeup and speech recognition |
mix_tts | Shows how to make use of the TTS function |
multi_keywords | Shows how to make use of the multi-keywords activation function |
Prebuilt binaries for the supported platforms are also provided.
To run a binary, add the directory containing libmobvoisdk.so to LD_LIBRARY_PATH. For example, to run the x86_64 build of asr:

    cd samples/bin
    LD_LIBRARY_PATH=../../lib/x86_64 ./x86_64_asr online

Note:
- The wakeup word is "Ni Hao Wen Wen" (你好问问)
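
The run command above replaces any LD_LIBRARY_PATH already set in your environment for that process. If other libraries rely on the existing value, a small wrapper can prepend the SDK directory instead. This is a sketch only: the helper names are hypothetical, and the `<arch>_<name>` binary naming is inferred from `x86_64_asr` rather than documented.

```sh
# prepend_lib_path DIR [CURRENT]
# Prints DIR followed by CURRENT, colon-separated. No trailing colon is
# emitted when CURRENT is empty, because an empty LD_LIBRARY_PATH entry
# is treated as the current directory by the dynamic loader.
prepend_lib_path() {
    if [ -n "${2:-}" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# run_sample SDK_ROOT ARCH NAME [ARGS...]
# Runs samples/bin/<ARCH>_<NAME> with lib/<ARCH> on the library search path.
# Assumption: binaries are named <ARCH>_<NAME>, as x86_64_asr suggests.
run_sample() {
    sdk_root=$1 arch=$2 name=$3
    shift 3
    (
        cd "$sdk_root/samples/bin" || exit 1
        LD_LIBRARY_PATH=$(prepend_lib_path "../../lib/$arch" "${LD_LIBRARY_PATH:-}") \
            "./${arch}_${name}" "$@"
    )
}

# Example, from the SDK root directory:
#   run_sample . x86_64 asr online
```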
Hints for SDK troubleshooting:
- The SDK writes logs while it runs, so examine them first for clues
- Call mobvoi_set_vlog_level() to get more detailed logs
- Invoking mobvoi_set_vlog_level(3) also saves the received PCM audio streams to .mobvoi/audio_dump/record.pcm
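
One quick sanity check on the dumped audio is whether its length matches how long you actually spoke. The sketch below estimates the duration of a raw PCM file from its size. The audio format is not stated here, so the defaults (16 kHz, 16-bit, mono) are assumptions you should adjust to your recording setup; the function name is hypothetical.

```sh
# pcm_duration_ms FILE [RATE_HZ] [BYTES_PER_SAMPLE] [CHANNELS]
# Prints the approximate duration of a headerless PCM file in milliseconds.
# Defaults (16000 Hz, 2 bytes per sample, 1 channel) are ASSUMED, not
# taken from the SDK documentation.
pcm_duration_ms() {
    rate=${2:-16000}
    width=${3:-2}
    channels=${4:-1}
    bytes=$(wc -c < "$1") || return 1
    echo $(( bytes * 1000 / (rate * width * channels) ))
}

# Example:
#   pcm_duration_ms .mobvoi/audio_dump/record.pcm
```

A result near zero suggests the SDK received little or no audio, which points at the capture side rather than at recognition. Under the same format assumption, the file can be played back on Linux with `aplay -f S16_LE -r 16000 -c 1 .mobvoi/audio_dump/record.pcm`.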
Please refer to the online SDK documentation.