This is the code for the AI painting that talks to you. The AI-generated painting below is framed in our office.
The plan is to put a tablet behind the painting and have it talk when it detects a person in front of it. Yes, like the paintings in the Harry Potter movies!
- Clone the repo
- Create an `ops/.env` file and add the following variables:

  ```
  OPENAI_API_KEY=<openai-api-key>
  DUBVERSE_API_KEY=<dub-verse-api-key>
  DUBVERSE_URL=https://macaque.dubverse.ai/api/merlin/services/tts/text-to-speech
  LLAVA_URL=https://localhost:11434/api/generate
  STATIC_FILE_URL=http://localhost:8000/static
  API_URL=http://localhost:8000/respond_voice/
  ```
- Install the dependencies:

  ```
  pip install -r requirements.txt
  ```
- Run the following command to download the YOLOv3 weights:

  ```
  wget https://pjreddie.com/media/files/yolov3.weights -O coco_model/yolov3.weights
  ```
- Run the web server:

  ```
  uvicorn app:app --reload
  ```
- Open the browser and go to `http://localhost:8000/`
- Move in and out of the camera's view to see the painting talk to you.
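For reference, the backend needs the variables from `ops/.env` at runtime. Below is a minimal sketch of a loader for simple `KEY=VALUE` lines; `load_env_file` is a hypothetical helper, and the real app may use a library such as `python-dotenv` instead.

```python
import os

def load_env_file(path: str) -> dict:
    """Parse simple KEY=VALUE lines from a .env file.

    Hypothetical helper: skips blank lines and # comments. The repo's
    actual loading mechanism may differ (e.g. python-dotenv).
    """
    values = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Example: export the parsed values into the process environment.
# for key, value in load_env_file("ops/.env").items():
#     os.environ.setdefault(key, value)
```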
- We use YOLO to detect people in front of the camera.
- The LLaVA model generates a comment about the person based on the prompt.
- OpenAI TTS (and other TTS models) make the painting talk.
- The frontend code continuously captures video frames and sends them to the AI API for processing.
- Once the AI API returns audio, the frontend pauses sending frames and plays the audio.
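The pipeline above can be sketched on the server side. This is a hedged illustration, assuming COCO class id 0 for "person" and an Ollama-style JSON body for `LLAVA_URL`; the function names and the flat detection format are illustrative assumptions, not the repo's actual code.

```python
import base64

PERSON_CLASS_ID = 0  # "person" in the COCO label list used by YOLOv3

def person_detected(detections, threshold=0.5):
    """True if any (class_id, confidence) pair is a confident person hit.

    The flat (class_id, confidence) format is a simplifying assumption;
    real YOLO output also carries bounding boxes.
    """
    return any(cid == PERSON_CLASS_ID and conf >= threshold
               for cid, conf in detections)

def build_llava_payload(frame_jpeg: bytes, prompt: str) -> dict:
    """Build an Ollama-style request body asking LLaVA to comment on a frame."""
    return {
        "model": "llava",  # assumed model name
        "prompt": prompt,
        "images": [base64.b64encode(frame_jpeg).decode("ascii")],
        "stream": False,
    }

# The payload would then be POSTed to LLAVA_URL, e.g.:
# requests.post(LLAVA_URL, json=build_llava_payload(frame, prompt))
```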
- Modify the frontend code to replace the video feed with a robot face animation, per the requirements in this document.
- Modify the prompt and test to find the right one for the painting.
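For the prompt-tuning TODO, one low-effort approach is to keep a small set of candidate prompts and compare LLaVA's responses side by side. The wordings and the `build_prompt` helper below are illustrative assumptions, not the repo's actual prompt.

```python
# Candidate prompts to A/B against the painting; wordings are illustrative.
CANDIDATE_PROMPTS = [
    "You are {persona}. In one witty sentence, comment on the person you see.",
    "You are {persona}. Greet the person in front of you in under 15 words.",
]

def build_prompt(template: str, persona: str = "a talking oil painting") -> str:
    """Fill the persona slot in a candidate prompt (hypothetical helper)."""
    return template.format(persona=persona)
```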