Whisper Assistant: A Voice-Powered Assistant and Transcriber

Made by Jørgen Kristiansen Sandhaug and Henrik Skog

Whisper Assistant is a customizable voice-powered assistant designed to boost your productivity by converting speech to text and performing tasks through voice commands. Built on the OpenAI Whisper API and the LangChain framework, it offers an easy-to-use interface for real-time voice transcription. By creating custom agents in LangChain, you can also connect it to large language models and carry out complex tasks as a voice conversation.

Features

Highly Accurate Voice Transcription: Uses the OpenAI Whisper API for accurate, real-time speech-to-text conversion in any language Whisper supports.
Customizable Shortcuts: Easily set up your own keyboard shortcuts to activate the transcription feature or start talking to different assistants, making it effortlessly accessible anytime.
Versatile Voice Commands: Execute tasks, manipulate text, or control your terminal directly with your voice using the LangChain framework.
Optimized for Efficiency: Smartly stitches together audio segments to minimize costs, reduce transcription time, and improve accuracy, even in speech with gaps.
Visual Feedback: A status icon in the menu bar shows the app's current state—whether it's recording, processing, or ready for your next command.
Designed for macOS: Tailored for use on Apple Silicon Macs, ensuring smooth operation and compatibility.
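
To illustrate the "Optimized for Efficiency" feature, here is a minimal sketch of one way to shorten long gaps before sending audio for transcription. It is not the project's actual implementation: the function name `trim_silence`, the amplitude `threshold`, and the `max_gap` parameter are all assumptions made for this example, and the audio is modelled as a plain list of integer samples.

```python
from typing import List


def trim_silence(samples: List[int], threshold: int = 500, max_gap: int = 3) -> List[int]:
    """Shorten every silent run to at most `max_gap` samples.

    Hypothetical helper: a sample counts as silent when its absolute
    amplitude is below `threshold`. Speech samples are always kept.
    """
    out: List[int] = []
    silent_run = 0  # length of the current run of silent samples
    for sample in samples:
        if abs(sample) < threshold:
            silent_run += 1
            if silent_run > max_gap:
                continue  # drop silence beyond the allowed gap
        else:
            silent_run = 0  # speech resets the run
        out.append(sample)
    return out


if __name__ == "__main__":
    audio = [1000, 0, 0, 0, 0, 0, 2000]
    print(trim_silence(audio, threshold=500, max_gap=2))
```

Keeping a short gap rather than removing silence entirely preserves natural pauses between phrases while still reducing the amount of audio that has to be uploaded.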

Getting Started

Prerequisites

Python 3.10 (earlier versions may also work). cchardet does not seem to work well with Python 3.11 on Apple Silicon Macs.
An OpenAI API key for using the Whisper API.

Installation

Clone the repository:

git clone https://github.com/Futhark-AS/whisper_assistant.git
cd whisper_assistant

Set Up the Conda Environment: (This is the recommended way to set up the environment, as there have been many problems with the py2app build when not using conda.)
- Create a new conda environment:
```
conda create -n whisper-assistant python=3.10
conda activate whisper-assistant
conda install pip
```
Install Dependencies:
```
pip install -r requirements.txt
```
Resolve Dependencies:
- For Apple Silicon Mac users encountering libffi issues:
```
brew install libffi
```
  Follow the post-installation instructions from Homebrew to set up the necessary environment variables.
Set up the environment:
- Create a .env file in the project root.
- Add your OpenAI API key: OPEN_AI_API_KEY=your_api_key_here.
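
As a rough illustration of how the key in the .env file could be read at startup, the sketch below parses the file with the standard library only. The helper names `load_env` and `get_api_key` are assumptions for this example; the project may well use a different mechanism (for instance a dotenv library).

```python
import os
from pathlib import Path
from typing import Dict


def load_env(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=value lines from a .env file (hypothetical helper).

    Blank lines, comment lines, and lines without '=' are skipped.
    A missing file yields an empty dict.
    """
    env_file = Path(path)
    if not env_file.exists():
        return {}
    values: Dict[str, str] = {}
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Return OPEN_AI_API_KEY from the process environment or the .env file."""
    key = os.environ.get("OPEN_AI_API_KEY") or load_env(path).get("OPEN_AI_API_KEY", "")
    if not key:
        raise RuntimeError("OPEN_AI_API_KEY is not set")
    return key
```

Failing early with a clear error when the key is missing is usually easier to debug than a later authentication failure from the API.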
Configure Your Shortcut:
- Copy the shortcuts.py.template file from the config folder.
- Rename it to shortcuts.py and customize it with your preferred keyboard shortcut. Place this file in the config folder.
Build the Application:
```
python setup.py py2app -A
```
Allow your terminal to monitor input and use the microphone:
- If on Mac, go to System Preferences -> Security & Privacy -> Privacy -> Accessibility.
- Add your terminal to the list of apps.

Run Whisper Assistant:

./dist/whisperGPT.app/Contents/MacOS/whisperGPT

How to Use

Press one of your configured shortcuts to start recording.
Whisper Assistant will transcribe your speech and copy the text to your clipboard.
If the shortcut you pressed is for an action/assistant, that action will run with the transcribed text as input.
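
The flow described above can be sketched as a single dispatch function. This is only an illustration under stated assumptions: `handle_shortcut` and its parameters are hypothetical names, the recorder, transcriber, and clipboard are passed in as plain callables instead of real audio or API calls, and whether the app also copies text to the clipboard for action shortcuts is assumed here.

```python
from typing import Callable, Dict, Optional


def handle_shortcut(
    shortcut: str,
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    copy_to_clipboard: Callable[[str], None],
    actions: Dict[str, Callable[[str], str]],
) -> Optional[str]:
    """Record, transcribe, copy, then run the action bound to `shortcut`, if any.

    Hypothetical sketch: returns the action's result, or None when the
    shortcut is a plain transcription shortcut with no action attached.
    """
    audio = record()            # capture audio until the user stops
    text = transcribe(audio)    # speech-to-text (e.g. via the Whisper API)
    copy_to_clipboard(text)     # assumed to happen for every shortcut
    action = actions.get(shortcut)
    return action(text) if action else None


if __name__ == "__main__":
    clipboard = []
    result = handle_shortcut(
        "cmd+2",
        record=lambda: b"fake-audio",
        transcribe=lambda audio: "list my files",
        copy_to_clipboard=clipboard.append,
        actions={"cmd+2": lambda text: f"assistant received: {text}"},
    )
    print(clipboard, result)
```

Passing the recorder and transcriber in as callables keeps the dispatch logic testable without a microphone or network access.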

Contributing

We welcome contributions and suggestions! Feel free to fork the repository, make your changes, and submit a pull request. For major changes or questions, please open an issue first to discuss what you would like to change.

License

Distributed under the MIT License. See LICENSE for more information.

Acknowledgments

OpenAI Whisper API for the speech-to-text engine.
LangChain framework for enabling complex command executions.
