Mobvoi SDK for Speech Recognition and Text-to-Speech Synthesis

An SDK for developing Automatic Speech Recognition (ASR) and Text-to-Speech Synthesis (TTS) applications.

Features

Hotword Wakeup
Online speech interaction, which includes:
- ASR
- Natural Language Understanding
- Dialogue Management
- Vertical Search
- TTS
Offline ASR
Online/Offline mixed ASR
Multi-Keywords Activation
Online TTS
Offline TTS
Online/Offline mixed TTS

Language support

Mandarin Chinese
American English (Coming soon)
Cantonese (Coming soon)

Supported platforms

The SDK is validated on the following platforms:

Ubuntu Linux on x86_64
Linux on ARMv7 (Raspberry Pi)
Linux on ARMv8 (Raspberry Pi)

Directory hierarchy

File/Directory	Purpose
doc	Contains SDK documentation
include	Contains the SDK header file (speech_sdk.h)
lib	Contains the library (libmobvoisdk.so) for different platforms
.mobvoi	Contains the configurations for the SDK. It is a hidden directory
samples	Sample code and binaries built based on the SDK

Usage

The .mobvoi directory (hidden) contains data the SDK needs at run time, and the SDK also writes to it, so install it in a writable location.
Pass the parent directory of .mobvoi to mobvoi_sdk_init() in your program.
Write your program according to the SDK documentation and the sample code in samples/src/.
When building your program, link against the libmobvoisdk.so provided in lib/{arch}/.
When running your program, add the directory containing libmobvoisdk.so to the LD_LIBRARY_PATH environment variable.
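
Because the SDK both reads from and writes to .mobvoi, it can help to verify the location before initializing. The following is a minimal sketch, not part of the SDK: check_mobvoi_dir() is a hypothetical helper name, and the mobvoi_sdk_init() call itself is left out because its exact signature should be taken from speech_sdk.h.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not provided by the SDK).
 * Returns 0 if <parent>/.mobvoi exists, is a directory, and is
 * writable by the current process; returns -1 otherwise. */
static int check_mobvoi_dir(const char *parent)
{
    char path[4096];
    struct stat st;

    int n = snprintf(path, sizeof(path), "%s/.mobvoi", parent);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                      /* path does not fit in the buffer */

    if (stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
        return -1;                      /* missing, or not a directory */

    if (access(path, W_OK | X_OK) != 0)
        return -1;                      /* the SDK could not write here */

    /* At this point `parent` is a reasonable value to pass to
     * mobvoi_sdk_init(); see speech_sdk.h for the actual call. */
    return 0;
}
```

A program might call check_mobvoi_dir() with the intended install location at startup and print a clear error if it returns -1, rather than initializing the SDK with a directory it cannot use.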

Samples

Several sample programs are provided in the samples/ directory:

Program	Purpose
asr	Shows how to do hotword wakeup and speech recognition
mix_tts	Shows how to make use of the TTS function
multi_keywords	Shows how to make use of the multi-keywords activation function

Prebuilt binaries for the supported platforms are also provided.

To run a binary, add the directory containing libmobvoisdk.so to LD_LIBRARY_PATH. For example, to run the x86_64 build of asr:

cd samples/bin
LD_LIBRARY_PATH=../../lib/x86_64 ./x86_64_asr online

Note:

The wakeup word is "Ni Hao Wen Wen" (你好问问)

Troubleshooting

Hints for troubleshooting the SDK:

The SDK generates logs when it runs, so examine the logs first for clues.
You can get more detailed logs by calling mobvoi_set_vlog_level().
- Calling mobvoi_set_vlog_level(3) also saves the received PCM audio streams to .mobvoi/audio_dump/record.pcm
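
A quick sanity check on the dumped audio is to confirm that the file is non-empty and roughly as long as the recording session. The sketch below is an illustration only: both helper names are hypothetical, and the PCM format is not stated here, so the sample rate, sample width, and channel count are parameters the caller must supply from the SDK documentation.

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of a file in bytes, or -1 on error. */
static long file_size_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;

    if (f == NULL)
        return -1;
    if (fseek(f, 0, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }
    size = ftell(f);
    fclose(f);
    return size;
}

/* Hypothetical helper: duration in seconds of raw (headerless) PCM data.
 * The format parameters are assumptions supplied by the caller;
 * returns -1.0 if any argument is invalid. */
static double pcm_duration_seconds(long bytes, int sample_rate,
                                   int bytes_per_sample, int channels)
{
    if (bytes < 0 || sample_rate <= 0 || bytes_per_sample <= 0 || channels <= 0)
        return -1.0;
    return (double)bytes /
           ((double)sample_rate * bytes_per_sample * channels);
}
```

For instance, if the dump were 16 kHz, 16-bit, mono (an assumption, not a documented fact), pcm_duration_seconds(file_size_bytes(".mobvoi/audio_dump/record.pcm"), 16000, 2, 1) would give the recorded duration. A result near zero suggests the SDK is not receiving audio from the microphone.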

Documentation

Please refer to the online SDK documentation.
