There are plenty of ways to chat with AI, but an open-source project that lets you talk to a Live2D avatar with actual voice? This is basically the only decent one.
Open-LLM-VTuber hit GitHub Trending today. 7,546 stars, 978 forks, 912 commits.
What it is
In one sentence: use any LLM as the backend, Live2D for the face, microphone for ears, speakers for mouth — build an AI virtual streamer that runs locally.
Core features:
- Hands-free voice interaction: just talk, no button pressing
- Voice interruption: cut in mid-conversation without waiting for it to finish
- Cross-platform local: Windows, macOS, Linux all supported
- Any OpenAI-compatible API: Ollama, LM Studio, cloud models all work
Architecture
The pipeline is a classic voice dialogue flow:
Microphone → ASR (Whisper) → LLM → TTS → Speakers
↓
Live2D expression driver
ASR uses Whisper (sherpa-onnx supports multiple engines), LLM backend is compatible with all OpenAI-format APIs, TTS connects to various synthesis services.
Live2D converts text responses into expressions and lip-sync animations — this is the soul of the project. Without it, you just have a voice assistant. With it, your AI has a "face."
Activity level
912 commits, 19 tags, 88 open issues, 32 PRs. Not top-tier activity, but a steady maintenance rhythm.
Interestingly, the repo has .cursor/rules and .gemini directories — the developers are using AI to build this project too.
Use cases
- Personal entertainment: an AI companion at home
- Live streaming: 24/7 AI virtual streamer, auto-replying to chat
- Content creation: AI-driven virtual character short videos
- Language learning: practice speaking with an infinitely patient virtual character
Reality check
Running the full stack locally needs decent hardware — ASR, LLM inference, TTS, and Live2D rendering running simultaneously taxes both CPU and GPU. Using a cloud LLM API reduces the local burden, but then latency and privacy become different questions.
It is called "Open-LLM-VTuber," but honestly, it is far from Neuro-sama-level AI streamers. But Neuro-sama is closed-source and took extensive custom training. Open-LLM-VTuber gives you the infrastructure — build whatever you want on top of it.
Main sources: