This repo contains the code for a conversational robot toy.
The capabilities of the robot, as of this commit, are:
- Converting English speech from the microphone to text. (Whisper)
- Sending the text to an LLM and getting a conversational response. (Claude)
- Converting the LLM's response to audio and playing it. (ElevenLabs)
- Performing sentiment analysis on the LLM response and lighting a green LED for positive sentiment or a red LED for negative sentiment. (DistilBERT)
- Animating a small OLED display to illustrate whether the robot is currently listening, thinking, or speaking.
- Locating faces in the camera frame and moving pan/tilt servos to point at a detected face. (OpenCV, Haar cascade)
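
As an illustration of the face-tracking capability, here is a minimal sketch of how a detected face's bounding box might be turned into new pan/tilt angles. It is not taken from this repo's code: the names `ServoAngles` and `track_face`, the 0 to 180 degree servo range, the `max_step` and `deadband` values, and the sign conventions (which depend on how the servos are mounted) are all assumptions. The only thing it relies on from OpenCV is that a Haar cascade detector returns face boxes as `(x, y, w, h)` in pixels.

```python
from dataclasses import dataclass


@dataclass
class ServoAngles:
    """Hypothetical container for the current servo position, in degrees."""
    pan: float
    tilt: float


def clamp(value, low, high):
    """Keep value inside the closed range [low, high]."""
    return max(low, min(high, value))


def track_face(face, frame_size, angles, max_step=10.0, deadband=0.1):
    """Return new servo angles that nudge the camera toward the face.

    face       -- (x, y, w, h) bounding box in pixels
    frame_size -- (width, height) of the camera frame in pixels
    angles     -- current ServoAngles
    max_step   -- assumed largest correction per frame, in degrees
    deadband   -- assumed normalised error below which the servo stays put
    """
    x, y, w, h = face
    frame_w, frame_h = frame_size

    # Offset of the face centre from the frame centre, normalised to -1..1.
    err_x = ((x + w / 2) - frame_w / 2) / (frame_w / 2)
    err_y = ((y + h / 2) - frame_h / 2) / (frame_h / 2)

    pan, tilt = angles.pan, angles.tilt
    # Ignore small errors so the servos do not jitter around the centre.
    if abs(err_x) > deadband:
        pan -= max_step * err_x   # sign is an assumption about mounting
    if abs(err_y) > deadband:
        tilt += max_step * err_y  # sign is an assumption about mounting

    # Assumed mechanical limits of a typical hobby servo.
    return ServoAngles(clamp(pan, 0.0, 180.0), clamp(tilt, 0.0, 180.0))


if __name__ == "__main__":
    current = ServoAngles(pan=90.0, tilt=90.0)
    # A face on the right-hand side of a 640x480 frame.
    print(track_face((540, 190, 100, 100), (640, 480), current))
```

Moving by a fraction of the error on each frame, rather than jumping straight to a computed target angle, is a simple proportional approach that tends to stay stable when detections are noisy. The deadband keeps the servos still once the face is roughly centred.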