Voicex

Real-time AI Voice Agent Platform

Build, deploy, and manage AI voice agents with sub-second latency. Voicex includes a SaaS dashboard, multi-tenant auth, a relational provider architecture, and plan-based access control.

Documentation

| Section | Description |
| --- | --- |
| Getting Started | Install, configure, seed data, and run locally in 10 minutes |
| Architecture | System overview, voice pipeline, data flow diagrams |
| Database Schema | All collections, fields, indexes, relationships, and defaults |
| Authentication | Signup/signin flow, JWT tokens, API keys, org status lifecycle |
| Providers | Unified provider system: global vs client, encryption, registry |
| Plans & Billing | Plan tiers, model access, features, Redis caching, pricing |
| Frontend | Next.js dashboard: pages, components, contexts, voice UI |
| REST API | All HTTP endpoints with request/response examples |
| WebSocket API | Voice protocol, message types, audio format, reconnection |
| Environment Variables | Every env var with description, defaults, and examples |
| Admin Scripts | Seed scripts, migrations, mongosh commands, bash helpers |
| Deployment | Docker, Nginx, scaling, production checklist |
| Client Integration | Embed voice in your app: WebSocket + REST from client code |
| Twilio | Phone call integration via Twilio Media Streams |

Quick Start

```bash
git clone <your-repo-url> voicex
cd voicex
pnpm install

# Configure
cp backend/.env.example backend/.env.local
# Edit backend/.env.local with your API keys

# Seed database (plans, providers, test data)
bash scripts/seed-plans.sh
bash scripts/seed-global-providers.sh
bash scripts/seed.sh

# Run
pnpm dev
```

Open http://localhost:3000 and sign in with the test account: a@a.dev / 12345678.

How It Works

User speaks → Deepgram STT → LLM (Groq/OpenAI/Ollama) → TTS (ElevenLabs/OpenAI/Edge) → User hears AI

The entire pipeline streams in real time. Each sentence is handed to TTS and spoken as soon as it is generated, while the LLM continues producing the next sentence.
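The sentence-by-sentence handoff can be illustrated with a minimal sketch. It is not taken from the Voicex codebase: `SentenceChunker` and `toSentences` are hypothetical names, and the boundary rule (`.`, `!`, or `?` followed by whitespace) is an assumption that will mis-split abbreviations such as "Dr. Smith". It covers only the step between the LLM token stream and the TTS call.

```typescript
// Hypothetical helper for illustration; these names are not part of Voicex.

/** Accumulates streamed LLM tokens and emits each sentence once it is complete. */
class SentenceChunker {
  private buffer = "";

  /** Add one token; returns the sentences that this token completed (often none). */
  push(token: string): string[] {
    this.buffer += token;
    const done: string[] = [];
    // Assumed boundary rule: ., ! or ? followed by whitespace.
    // Requiring the whitespace avoids splitting inside numbers like "3.14".
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + match[0].length;
      done.push(this.buffer.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished tail for the next token.
    this.buffer = this.buffer.slice(start);
    return done;
  }

  /** Call when the LLM stream ends; returns the trailing sentence, if any. */
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest ? [rest] : [];
  }
}

/** Turns an async stream of tokens into an async stream of whole sentences. */
async function* toSentences(tokens: AsyncIterable<string>): AsyncGenerator<string> {
  const chunker = new SentenceChunker();
  for await (const token of tokens) {
    for (const sentence of chunker.push(token)) {
      yield sentence;
    }
  }
  for (const sentence of chunker.flush()) {
    yield sentence;
  }
}
```

With the tokens `"Hello"`, `" there"`, `"."`, `" How"`, `" are you?"`, the chunker emits `"Hello there."` as soon as `" How"` arrives, and `flush()` returns `"How are you?"` at the end of the stream. A consumer could pass each sentence from `toSentences` to a TTS call as it is yielded, so speech can start before the full LLM response exists.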

Built with Deepgram, Groq, and ElevenLabs.