xAI's Grok Realtime API provides speech-to-speech conversation with <700ms latency. Unlike traditional voice AI (STT → LLM → TTS), Grok processes audio directly in a single model.
Navigate to the Integrations → API Keys tab and add your xAI API key, or store it through the BYOK API:
curl -X POST https://api.flireo.com/api/v1/byok \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "xai",
    "api_key": "xai-..."
  }'
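For callers who prefer Python to curl, the same request can be sketched with the standard library. This is a minimal sketch that mirrors the curl command above: the helper name `build_byok_request` is hypothetical, and because the response format is not described here, the sketch only builds the request and leaves sending it to the caller.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
BYOK_URL = "https://api.flireo.com/api/v1/byok"


def build_byok_request(platform_key: str, xai_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that stores an xAI key.

    Hypothetical helper: only the URL, headers, and body fields come from
    the curl example; everything else is illustrative.
    """
    body = json.dumps({"provider": "xai", "api_key": xai_key}).encode("utf-8")
    return urllib.request.Request(
        BYOK_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {platform_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect before a real key leaves the machine.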
When creating or editing an agent with xAI configured:
- Model: grok-realtime-v1
- Voice: ara (or other available voices)

Note: When using xAI Realtime, separate STT/TTS providers are ignored.
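The note about ignored STT/TTS providers can be illustrated with a small sketch. The `AgentVoiceConfig` class and its field names are hypothetical and are not the platform's actual agent schema; the sketch only shows the rule that selecting xAI Realtime overrides any separately configured STT and TTS providers.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AgentVoiceConfig:
    """Hypothetical agent settings; field names are illustrative only."""
    llm_provider: str
    stt_provider: Optional[str] = None
    tts_provider: Optional[str] = None


def effective_providers(cfg: AgentVoiceConfig) -> Dict[str, Optional[str]]:
    """Return the providers that would actually handle each stage."""
    if cfg.llm_provider == "xai":
        # xAI Realtime is speech-to-speech, so STT/TTS settings are ignored.
        return {"stt": None, "llm": "xai", "tts": None}
    # Traditional pipeline: each stage uses its own configured provider.
    return {
        "stt": cfg.stt_provider,
        "llm": cfg.llm_provider,
        "tts": cfg.tts_provider,
    }
```

With this rule, an agent that still has Deepgram and ElevenLabs configured but selects xAI resolves to xAI alone, so the leftover settings do no harm.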
Your backend agent will use:
from livekit.agents import AgentSession
from livekit.plugins import xai

session = AgentSession(
    llm=xai.realtime.RealtimeModel(
        api_key=xai_api_key,  # From Vault
        voice="ara",
    )
)
xAI Realtime uses BYOK (bring your own key) pricing. The table below compares it with a traditional pipeline:
| Feature | Traditional (STT+LLM+TTS) | xAI Realtime |
|---|---|---|
| Latency | ~1-2 seconds | <700ms |
| Providers | 3 separate (Deepgram + OpenAI + ElevenLabs) | Single (xAI) |
| Voice Quality | Depends on TTS provider | Native to model |
| Custom Tools | Supported via llm_config.tools | Check xAI docs for support |
| BYOK Keys Required | 3 keys (STT, LLM, TTS) | 1 key (xAI) |
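
The key requirements in the last row lend themselves to a preflight check before an agent is started. The sketch below is an assumption-laden illustration: the pipeline names and key slot names (`stt`, `llm`, `tts`, `xai`) are hypothetical labels chosen for this example, not identifiers from the BYOK API.

```python
from typing import Dict, Set

# Key slots each pipeline needs, following the comparison table.
# The slot and pipeline names are hypothetical labels for this sketch.
REQUIRED_KEYS: Dict[str, Set[str]] = {
    "traditional": {"stt", "llm", "tts"},
    "xai_realtime": {"xai"},
}


def missing_keys(pipeline: str, stored: Set[str]) -> Set[str]:
    """Return the key slots still missing for the chosen pipeline."""
    return REQUIRED_KEYS[pipeline] - stored
```

An empty result means every key the pipeline needs has been stored; for xAI Realtime that is the single xAI key.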
See the BYOK API Reference for managing xAI API keys.