Verbilobot is a Telegram bot written in Go that transcribes voice messages, video notes, and other audio or video files. It uses the Groq API for transcription and ffmpeg to convert incoming audio into a format that Groq handles well.
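As a rough illustration of the conversion step, here is a minimal sketch of how a Go program might call ffmpeg before sending audio to Groq. The function names and the chosen settings (16 kHz, mono, FLAC) are assumptions for illustration, based on Groq's general advice for preprocessing audio. They are not taken from Verbilobot's source, which may use different options.

```go
package main

import (
	"fmt"
	"os/exec"
)

// ffmpegArgs builds the arguments for converting any input file to
// 16 kHz mono FLAC. The settings are an assumption, not necessarily
// the ones Verbilobot uses.
func ffmpegArgs(in, out string) []string {
	return []string{
		"-y",     // overwrite the output file if it already exists
		"-i", in, // input file (voice message, video note, ...)
		"-vn",          // drop any video stream
		"-ar", "16000", // resample to 16 kHz
		"-ac", "1", // downmix to mono
		"-c:a", "flac", // encode as FLAC
		out,
	}
}

// transcode runs ffmpeg (which must be on PATH) and includes
// ffmpeg's own output in the error if the conversion fails.
func transcode(in, out string) error {
	cmd := exec.Command("ffmpeg", ffmpegArgs(in, out)...)
	if b, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("ffmpeg failed: %w: %s", err, b)
	}
	return nil
}

func main() {
	// Only print the command line here; call transcode to actually run it.
	fmt.Println("ffmpeg", ffmpegArgs("voice.oga", "voice.flac"))
}
```

Keeping the argument list in its own function makes the settings easy to inspect or change without running ffmpeg.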
Important
However you plan to run the bot, make sure to rename the .env.example file to .env and fill in your Telegram Bot Token and Groq API Token.
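For readers curious how a Go program can pick up these two tokens, here is a small sketch. It assumes the values are already present in the process environment (for example through Docker Compose or a dotenv loader), and the variable names TELEGRAM_BOT_TOKEN and GROQ_API_KEY are hypothetical. Check .env.example for the names the bot actually expects.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the two secrets the bot needs to start.
type config struct {
	TelegramToken string
	GroqKey       string
}

// loadConfig reads both tokens through the supplied lookup function
// and fails early if either is missing. The variable names below are
// hypothetical placeholders.
func loadConfig(getenv func(string) string) (config, error) {
	cfg := config{
		TelegramToken: getenv("TELEGRAM_BOT_TOKEN"),
		GroqKey:       getenv("GROQ_API_KEY"),
	}
	if cfg.TelegramToken == "" || cfg.GroqKey == "" {
		return config{}, errors.New("both the Telegram and Groq tokens must be set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// Never print the secrets themselves; the length is enough to confirm loading.
	fmt.Println("tokens loaded, Telegram token length:", len(cfg.TelegramToken))
}
```

Failing at startup when a token is empty gives a clearer error than a rejected API call later on.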
To run the project with Docker, you need Docker installed on your machine. From there, setup takes only a few seconds.
git clone "https://github.com/bytebone/verbilobot.git" && cd verbilobot/
cp .env.example docker/.env
cd docker && nano .env
# edit the env file with your tokens and chat IDs as needed
docker compose up
To build and run the project locally, you need Go and FFmpeg installed on your machine.
On Linux:
git clone "https://github.com/bytebone/verbilobot.git" && cd verbilobot/
go build -v -o verbilobot .
./verbilobot
Or on Windows:
git clone "https://github.com/bytebone/verbilobot.git" && cd verbilobot/
go build -v -o verbilobot.exe .
start verbilobot.exe
The bot usually takes around 2 seconds to come online. Once it is running, forward any audio or video file to it to start a transcription. Because Groq is fast, a minute of incoming audio takes only a few moments to transcribe and return to your chat. The main bottleneck you might notice is the local transcoding with ffmpeg, which can take a noticeable amount of time to complete.
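To give a feel for the transcription step, the sketch below builds a request against Groq's OpenAI-compatible transcription endpoint using only the Go standard library. The function names, the placeholder API key, and the choice of the whisper-large-v3 model are assumptions for illustration. Verbilobot's own code may be structured differently or use another model.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"net/http"
	"strings"
)

// Groq's OpenAI-compatible speech-to-text endpoint.
const groqURL = "https://api.groq.com/openai/v1/audio/transcriptions"

// newTranscriptionRequest packs the audio into a multipart form and
// attaches the API key as a bearer token. The model is an assumption.
func newTranscriptionRequest(apiKey, filename string, audio io.Reader) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	if err := w.WriteField("model", "whisper-large-v3"); err != nil {
		return nil, err
	}
	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, err
	}
	if _, err := io.Copy(part, audio); err != nil {
		return nil, err
	}
	// Close writes the final multipart boundary; without it the body is invalid.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, groqURL, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

// exampleRequest builds a request from placeholder data so the sketch
// can run without a real key or audio file.
func exampleRequest() (*http.Request, error) {
	return newTranscriptionRequest("gsk_example", "voice.flac", strings.NewReader("fake audio bytes"))
}

func main() {
	req, err := exampleRequest()
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	// Nothing is sent here. To send it, pass req to http.DefaultClient.Do
	// and read the "text" field from the JSON response.
	fmt.Println(req.Method, req.URL.String())
}
```

Separating request construction from sending keeps the sketch runnable offline. A real bot would pass the transcoded file instead of the placeholder reader.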