Speech Recognizer
This page contains the instructions to set up and run the speech recognition module in Greta.
The Speech Recognizer component identifies full phrases in the spoken language as a person is speaking and converts them into a machine-readable format. The module mainly uses Google automatic speech recognition and relies on Selenium WebDriver, which follows the W3C Recommendation https://www.w3.org/TR/webdriver1/ .
The module can send the recognized utterance to other modules using ActiveMQ messages.
OpenSSL is required to generate the self-signed certificate used by the module. (Please go to https://kb.firedaemon.com/support/solutions/articles/4000121705-openssl-3-0-and-1-1-1-binary-distributions-for-microsoft-windows and use the Windows installer in the "Download OpenSSL 3.0 Windows Installer" section.)
The following command can be used to create the certificate. When asked for the certificate information, enter localhost for the Common Name/CN; the other values do not matter as much.
openssl req -new -newkey rsa:4096 -days 5000 -nodes -x509 -sha512 -out cert.crt -keyout cert.key
The resulting files must then be converted into a PKCS #12 file. This can be done by issuing the following command, which exports the file with an empty password.
openssl pkcs12 -export -in cert.crt -inkey cert.key -out $GRETA_HOME/bin/Common/Data/ASRResources/cert.p12 -passout pass:
Once the certificate is generated, Chrome must be configured to accept it.
This component communicates with other modules using ActiveMQ messaging. The ActiveMQ broker service must be running before the speech recognition component is instantiated. The broker can be launched through Modular by adding NetworkConnections->ActiveMQ->Broker.
Once the Speech Recognizer component is launched, the Chrome web browser will start automatically with the URL https://localhost:8088. It will then ask for permission to use the microphone, which must be granted.
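For reference, the snippet below is a minimal Java sketch of how such a page can be opened through Selenium WebDriver; it is not the exact code used by the Greta module, and it assumes a matching chromedriver is available on the system PATH. The setAcceptInsecureCerts option is what lets Chrome accept the self-signed certificate generated above.

import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class AsrPageLauncher {
    public static void main(String[] args) {
        ChromeOptions options = new ChromeOptions();
        // Accept the self-signed certificate created in the steps above.
        options.setAcceptInsecureCerts(true);

        // Open the speech recognition page served by the module.
        ChromeDriver driver = new ChromeDriver(options);
        driver.get("https://localhost:8088");
        // The microphone permission still has to be granted in the browser.
    }
}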
By default the module recognizes English (UK). It is possible to change the speech recognition language directly in the browser. For the moment, the available options are English (UK), English (US) and French.
Output Format
The speech recognizer produces the output in JSON string format.
jsonString = {
NumWords: transcript.trim().split(/\s+/).length, //Number of words
inputDuration: input_dur, //input duration
inputStartTime: 0, //input start time (default)
inputEndTime: input_dur, //input End time
TRANSCRIPT: transcript.toUpperCase() //transcript in upper case letters
};
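As an illustration, the sketch below reads such a message in Java using the org.json library (an assumption; any JSON library can be used). The payload values are made up for the example.

import org.json.JSONObject;

public class AsrResultExample {
    public static void main(String[] args) {
        // Illustrative payload following the format above (values are made up).
        String jsonString = "{\"NumWords\":2,\"inputDuration\":1.4,"
                + "\"inputStartTime\":0,\"inputEndTime\":1.4,\"TRANSCRIPT\":\"HELLO WORLD\"}";

        JSONObject result = new JSONObject(jsonString);
        String transcript = result.getString("TRANSCRIPT");
        int numWords = result.getInt("NumWords");
        double duration = result.getDouble("inputDuration");

        System.out.println(numWords + " word(s) in " + duration + "s: " + transcript);
    }
}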
This module communicates with the ActiveMQ server using the following default configuration:
Host : localhost
Port : 61616
Request Topic : GRETA/ASR/REQUEST
Response Topic : GRETA/ASR/RESPONSE
Thus, any module that wants to listen to the output of the speech recognizer must connect to the broker on port 61616 and subscribe to the topic GRETA/ASR/RESPONSE.
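For example, the following is a minimal subscriber sketch written against the plain JMS API with the ActiveMQ 5.x client library; it is not Greta's own connector class, but it uses the default host, port and topic listed above.

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import org.apache.activemq.ActiveMQConnectionFactory;

public class AsrSubscriber {
    public static void main(String[] args) throws JMSException {
        // Default broker address used by the module.
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("GRETA/ASR/RESPONSE");
        MessageConsumer consumer = session.createConsumer(topic);

        // Print each recognized utterance (the JSON string shown above) as it arrives.
        consumer.setMessageListener(message -> {
            try {
                if (message instanceof TextMessage) {
                    System.out.println(((TextMessage) message).getText());
                }
            } catch (JMSException e) {
                e.printStackTrace();
            }
        });
    }
}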