Installation of PocketSphinx and program quitting issues #607

Open · wants to merge 25 commits into base: master
7 changes: 0 additions & 7 deletions .coveragerc

This file was deleted.

4 changes: 4 additions & 0 deletions .gitignore
@@ -8,6 +8,7 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*.pyc

# C extensions
*.so
@@ -65,3 +66,6 @@ target/

# SublimeLinter config file
.sublimelinterrc

# Atom editor remote sync config
.remote-sync.json
22 changes: 0 additions & 22 deletions .travis.yml

This file was deleted.

6 changes: 5 additions & 1 deletion AUTHORS.md
@@ -20,4 +20,8 @@ Jasper. Thanks a lot!
*Please alphabetize new entries*

We'd also like to thank all the people who reported bugs, helped
answer newbie questions, and generally made Jasper better.

Judy's author:

Nick Lee
49 changes: 0 additions & 49 deletions CONTRIBUTING.md

This file was deleted.

265 changes: 246 additions & 19 deletions README.md
@@ -1,36 +1,263 @@
## I no longer maintain this project. It is left here for historical purposes.

# Judy - Simplified Voice Control on Raspberry Pi

Judy is a simplified sister of [Jasper](http://jasperproject.github.io/),
with a focus on education. It is designed to run on:

- **Raspberry Pi 3**
- **Raspbian Jessie**
- **Python 2.7**

Unlike Jasper, Judy does *not* try to be cross-platform, does *not* let you
pick your favorite Speech-to-Text or Text-to-Speech engine, and does *not*
come with an API for pluggable modules. Judy keeps things simple and lets
you experience the joy of voice control with as little hassle as possible.

A **Speech-to-Text engine** is a piece of software that interprets human voice as
a string of text. It lets the computer know what is being said. Conversely,
a **Text-to-Speech engine** converts text into sound. It allows the computer to
speak, typically in response to your command.

Judy uses:

- **[Pocketsphinx](http://cmusphinx.sourceforge.net/)** as the Speech-to-Text engine
- **[Pico](https://github.com/DougGore/picopi)** as the Text-to-Speech engine

Additionally, you need:

- a **Speaker** to plug into Raspberry Pi's headphone jack
- a **USB Microphone**

**Plug them in.** Let's go.

## Know the Sound Cards

```
$ more /proc/asound/cards
0 [ALSA ]: bcm2835 - bcm2835 ALSA
bcm2835 ALSA
1 [Device ]: USB-Audio - USB PnP Audio Device
USB PnP Audio Device at usb-3f980000.usb-1.4, full speed
```

The first is Raspberry Pi's built-in sound card. It has an index of 0. (Note
the word `ALSA`. It means *Advanced Linux Sound Architecture*. Simply put, it
is the sound driver on many Linux systems.)

The second is the USB device's sound card. It has an index of 1.

Your settings might be different, but if you are using a Pi 3 with Jessie and
have not changed any sound settings, the above layout is likely.
For the rest of this page, I am going to assume:

- Built-in sound card, **index 0** → headphone jack → speaker
- USB sound card, **index 1** → microphone

The index is important. It is how you tell Raspberry Pi where the speaker and
microphone are.

*If your sound card indexes are different, adjust command arguments
accordingly in the rest of this page.*
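If you ever want the index programmatically, the card names in `/proc/asound/cards` are easy to parse. A small sketch (the `card_index` helper is my own, not a standard tool):

```shell
# card_index PATTERN: print the index of the first sound card whose line
# matches PATTERN, reading /proc/asound/cards-style text from stdin.
card_index() {
    awk -v pat="$1" '$1 ~ /^[0-9]+$/ && $0 ~ pat { print $1; exit }'
}

# Try it on the listing shown above:
card_index USB <<'EOF'
 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
 1 [Device         ]: USB-Audio - USB PnP Audio Device
EOF
# → 1
```

On the Pi itself you would feed it the real file: `card_index USB < /proc/asound/cards`.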

## Make sure sound outputs to the headphone jack

Sound may be output via HDMI or headphone jack. We want to use the headphone
jack.

Enter `sudo raspi-config`. Select **Advanced Options**, then **Audio**. You are
presented with three options:

- `Auto` should work
- `Force 3.5mm (headphone) jack` should definitely work
- `Force HDMI` won't work

## Turn up the volume

A lot of times, when a sound application seems to fail, it is simply because we
forgot to turn up the volume.

Volume adjustment can be done with `alsamixer`. This program makes use of the
function keys (`F1`, `F2`, etc.). If you connect over SSH with PuTTY, the
function keys need a settings tweak to work properly (click on the top-left
corner of the PuTTY window, then select **Change Settings ...**):

1. Go to **Terminal** / **Keyboard**
2. Look for section **The Function keys and keypad**
3. Select **Xterm R6**
4. Press button **Apply**

Now, we are ready to turn up the volume, for both the speaker and the mic:

```
$ alsamixer
```
- `F6` to select between sound cards
- `F3` to select playback volume (for speaker)
- `F4` to select capture volume (for mic)
- `⬆` `⬇` arrow keys to adjust
- `Esc` to exit

*If you unplug the USB microphone at any moment, all volume settings
(including that of the speaker) may be reset. Make sure to check the volume
again.*
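If you would rather script the volumes than drive `alsamixer` interactively, `amixer` can set them from the command line. The control names assumed below (`PCM`, `Mic`) vary from card to card, so list yours with `amixer -c <index> scontrols` first; treat this as a sketch:

```shell
# Set both volumes without the alsamixer UI. The control names (PCM on
# card 0, Mic on card 1) are assumptions; list yours with
# `amixer -c <index> scontrols` before relying on them.
set_volumes() {
    amixer -c 0 sset PCM 90%     # playback volume on the built-in card
    amixer -c 1 sset Mic 80%     # capture volume on the USB card
}
```

Running `set_volumes` once after boot (or after replugging the microphone) saves a trip into `alsamixer`.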

With the hardware all set, let's test it.

## Test the speaker

```
$ speaker-test -t wav
```

Press `Ctrl-C` when done.

## Record a WAV file

Enter this command, speak into the mic, then press `Ctrl-C` when you are
finished:

```
$ arecord -D plughw:1,0 abc.wav
```

`-D plughw:1,0` tells `arecord` where the device is. In this case, the device
is the mic, at index 1.

`plughw:1,0` actually refers to "Sound Card index 1, Subdevice 0", because a
sound card may house many subdevices. Here, we don't care about subdevices and
always give it a `0`. The only important index is the sound card's.
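In other words, the device string follows the pattern `plughw:<card>,<subdevice>`. A tiny sketch of how the strings used on this page are built:

```shell
# Build the device strings used throughout this page from the card
# indexes assumed above (USB mic = card 1, built-in speaker = card 0).
MIC_CARD=1
SPK_CARD=0
MIC_DEV="plughw:${MIC_CARD},0"    # subdevice is always 0 here
SPK_DEV="plughw:${SPK_CARD},0"
echo "$MIC_DEV $SPK_DEV"          # → plughw:1,0 plughw:0,0
```

If your card indexes differ, changing the two variables updates every later command.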

## Play a WAV file

```
$ aplay -D plughw:0,0 abc.wav
```

Here, we tell `aplay` to play to `plughw:0,0`, which refers to "Sound Card index 0,
Subdevice 0", which leads to the speaker.

If `arecord` and `aplay` both succeed, the speaker and microphone are working
properly. We can move on to add more capabilities.
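A quick way to exercise both at once is a loopback: record a few seconds, then play them straight back. This wraps the two commands above into one helper (the function name is my own; `-d 3` tells `arecord` to stop after 3 seconds):

```shell
# Loopback check: record 3 seconds from the mic (card 1), then play the
# recording back through the speaker (card 0). Needs the real hardware.
loopback_test() {
    arecord -D plughw:1,0 -d 3 loopback.wav && aplay -D plughw:0,0 loopback.wav
}
```

Run `loopback_test` and say something; hearing your own voice back confirms the whole audio path.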

## Install Pico, the Text-to-Speech engine

```
$ sudo apt-get install libttspico-utils
$ pico2wave -w abc.wav "Good morning. How are you today?"
$ aplay -D plughw:0,0 abc.wav
```
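Those three steps are worth wrapping up, since speaking is exactly this pipeline: text → Pico → WAV → `aplay`. A sketch, assuming the packages above are installed and the speaker is still on card 0 (the `say` helper is my own, not part of any package):

```shell
# say TEXT: synthesize TEXT with Pico and play it on the built-in card.
# Assumes libttspico-utils is installed and the speaker is on plughw:0,0.
say() {
    wav=$(mktemp --suffix=.wav)
    pico2wave -w "$wav" "$1"
    aplay -D plughw:0,0 "$wav"
    rm -f "$wav"
}
```

Then `say "Good morning. How are you today?"` replaces the two manual commands.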

## Install Pocketsphinx, the Speech-to-Text engine

```
$ sudo apt-get install pocketsphinx # Jessie
$ sudo apt-get install pocketsphinx pocketsphinx-en-us # Stretch

$ pocketsphinx_continuous -adcdev plughw:1,0 -inmic yes
```

`pocketsphinx_continuous` interprets speech in *real-time*. It will spill out
a lot of stuff, ending with something like this:

```
Warning: Could not find Capture element
READY....
```

Now, **speak into the mic**, and note the results. At first, you may find it
funny. After a while, you realize it is horribly inaccurate.

For it to be useful, we have to make it more accurate.

## Configure Pocketsphinx

We can make it more accurate by restricting its vocabulary. Think of a bunch of
phrases or words you want it to recognize, and save them in a text file.
For example:
```
How are you today
Good morning
night
afternoon
```
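If you prefer to script this step, a here-document writes the same phrases to a file (the name `corpus.txt` is arbitrary, my choice for illustration):

```shell
# Write the sample phrases to corpus.txt, ready to upload to lmtool.
cat > corpus.txt <<'EOF'
How are you today
Good morning
night
afternoon
EOF
wc -l < corpus.txt    # → 4
```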

Go to Carnegie Mellon University's [lmtool page](http://www.speech.cs.cmu.edu/tools/lmtool-new.html),
upload the text file, and compile the "knowledge base". The "knowledge base" is
nothing more than a bunch of files. Download and unzip them:

```
$ wget <URL of the TAR????.tgz FILE>
$ tar zxf <TAR????.tgz>
```

Among the unzipped products are a `.lm` and a `.dic` file. Together they
define a vocabulary; Pocketsphinx cannot recognize any words outside of it.
Supply them to `pocketsphinx_continuous`:

```
$ pocketsphinx_continuous -adcdev plughw:1,0 -lm </path/to/1234.lm> -dict </path/to/1234.dic> -inmic yes
```

Speak into the mic again, using *only those words you have given*. Accuracy
should be much better. Pocketsphinx finally knows what you are talking
about.

## Install Judy

```
$ sudo pip install jasper-judy
```

Judy brings Pocketsphinx's listening ability and Pico's speaking ability
together. A Judy program, on hearing her name being called, can verbally answer
your voice command. Imagine this:

You: Judy!
*Judy: [high beep]*
You: Weather next Monday?
*Judy: [low beep] 23 degrees, partly cloudy*

She can be as smart as you program her to be.

To get a Judy program running, you need to prepare a few *resources*:

- a `.lm` and `.dic` file to increase listening accuracy
- a folder in which the [beep] audio files reside

[Here are some sample resources.](https://github.com/nickoala/judy/tree/master/resources)
Download them if you want.

A Judy program follows these steps:

1. Create a `VoiceIn` object. Supply it with the microphone device,
and the `.lm` and `.dic` file.
2. Create a `VoiceOut` object. Supply it with the speaker device, and the folder
in which the [beep] audio files reside.
3. Define a function to handle voice commands.
4. Call `judy.listen()`, passing it the two objects and the handler function.

Here is an example that **echoes whatever you say**. Remember, you have to call
"Judy" to get her attention. After a high beep, you can say something (stay
within the vocabulary, please). A low beep indicates she heard you.
Then, she echoes what you have said.

```python
import judy

# Microphone: USB sound card at plughw:1,0, plus the lmtool-generated
# language model and dictionary.
vin = judy.VoiceIn(adcdev='plughw:1,0',
                   lm='/home/pi/judy/resources/lm/0931.lm',
                   dict='/home/pi/judy/resources/lm/0931.dic')

# Speaker: built-in sound card at plughw:0,0, plus the folder of beep sounds.
vout = judy.VoiceOut(device='plughw:0,0',
                     resources='/home/pi/judy/resources/audio')

def handle(phrase):
    print 'Heard:', phrase
    vout.say(phrase)

judy.listen(vin, vout, handle)
```

It's that simple! Put more stuff in `handle()`. She can be as smart as you want
her to be.