Sub-Second Latency
A streaming STT → LLM → TTS pipeline: the AI starts speaking before it has finished its thought. Typical time to first audio is ~800 ms.
Build, deploy, and manage AI voice agents with sub-second latency. SaaS dashboard, multi-tenant auth, relational provider architecture, plan-based access control.
| Section | Description |
|---|---|
| Getting Started | Install, configure, seed data, and run locally in 10 minutes |
| Architecture | System overview, voice pipeline, data flow diagrams |
| Database Schema | All collections, fields, indexes, relationships, and defaults |
| Authentication | Signup/signin flow, JWT tokens, API keys, org status lifecycle |
| Providers | Unified provider system — global vs client, encryption, registry |
| Plans & Billing | Plan tiers, model access, features, Redis caching, pricing |
| Frontend | Next.js dashboard — pages, components, contexts, voice UI |
| REST API | All HTTP endpoints with request/response examples |
| WebSocket API | Voice protocol, message types, audio format, reconnection |
| Environment Variables | Every env var with description, defaults, and examples |
| Admin Scripts | Seed scripts, migrations, mongosh commands, bash helpers |
| Deployment | Docker, Nginx, scaling, production checklist |
| Client Integration | Embed voice in your app — WebSocket + REST from client code |
| Twilio | Phone call integration via Twilio Media Streams |
```bash
# Clone and install
git clone <your-repo-url> voicex
cd voicex
pnpm install

# Configure
cp backend/.env.example backend/.env.local
# Edit backend/.env.local with your API keys

# Seed database (plans, providers, test data)
bash scripts/seed-plans.sh
bash scripts/seed-global-providers.sh
bash scripts/seed.sh

# Run
pnpm dev
```

Open http://localhost:3000 → Sign in with `a@a.dev` / `12345678` (test account).
User speaks → Deepgram STT → LLM (Groq/OpenAI/Ollama) → TTS (ElevenLabs/OpenAI/Edge) → User hears AI

The entire pipeline streams in real time. Each sentence is spoken as soon as it is generated, while the LLM continues producing the next sentence.
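The sentence-level streaming described above can be sketched as a small chunking stage between the LLM and TTS. This is a minimal illustration, not the project's actual code: the token generator is a mock, and the TTS call is stubbed out — only the "emit each sentence as soon as it completes" logic is shown.

```typescript
// Mock LLM token stream (stand-in for a real streaming completion API).
async function* llmTokens(): AsyncGenerator<string> {
  const tokens = ["Hello", " there", ".", " How", " can", " I", " help", " today", "?"];
  for (const t of tokens) yield t;
}

// Accumulate tokens and yield each sentence the moment it is complete,
// so TTS synthesis can begin while the LLM is still generating.
// (Real segmentation needs more care, e.g. decimals like "3.5".)
async function* sentences(tokens: AsyncGenerator<string>): AsyncGenerator<string> {
  let buffer = "";
  for await (const tok of tokens) {
    buffer += tok;
    const match = buffer.match(/^(.*?[.!?])\s*(.*)$/s);
    if (match) {
      yield match[1].trim();
      buffer = match[2];
    }
  }
  if (buffer.trim()) yield buffer.trim(); // flush any trailing fragment
}

async function main() {
  for await (const sentence of sentences(llmTokens())) {
    // In the real pipeline this would hand the sentence to the TTS provider,
    // e.g. something like `await tts.stream(sentence)` (hypothetical name).
    console.log("speak:", sentence);
  }
}
main();
```

Because each completed sentence is dispatched immediately, the user hears the first sentence while later ones are still being generated — this overlap is what keeps time-to-first-audio low.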