
# Quick Start

## Model Selection
The router returns just the model name. Use it directly with your provider's SDK.

## Available Models
| Provider | Fast/Cheap | Powerful |
|---|---|---|
| Anthropic | claude-haiku-4-5-20251001 | claude-sonnet-4-5-20250929 |
| OpenAI | gpt-5-mini | gpt-5-low, gpt-5-medium, gpt-5-high |
| Gemini | gemini-2.5-flash | gemini-2.5-pro |
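Since the router hands back a bare model-name string, no mapping or wrapping is needed before the provider call. A minimal sketch, where `route_model` is a hypothetical stand-in for the real routing request:

```python
# Sketch: the router returns a bare model-name string that plugs
# straight into the provider SDK. route_model() is a stand-in for the
# real routing call (its name and signature are assumptions).

def route_model(prompt: str) -> str:
    # In practice this would call the routing API; here we return a
    # plausible fast-tier name from the table above.
    return "claude-haiku-4-5-20251001"

model = route_model("Summarize this paragraph in one sentence.")

# The string goes directly into the provider SDK call, e.g. Anthropic's:
# client.messages.create(model=model, max_tokens=256, messages=[...])
print(model)
```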
## Modes

- `balanced` (default): balances cost and quality
- `aggressive`: aggressively optimizes for cost (cheaper models)
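As an illustration of how the two modes might differ in practice, here is a sketch; the `route_model` helper and its length heuristic are assumptions, not the router's actual decision logic:

```python
# Illustrative-only sketch of "balanced" vs "aggressive" mode. The
# heuristic below is an assumption standing in for the real algorithm.

CHEAP = "gpt-5-mini"
STRONG = "gpt-5-medium"

def route_model(prompt: str, mode: str = "balanced") -> str:
    if mode == "aggressive":
        return CHEAP  # aggressive mode always prefers the cheaper tier
    # balanced mode weighs cost against quality; a crude length proxy here
    return STRONG if len(prompt) > 200 else CHEAP

print(route_model("short prompt"))                     # cheap tier
print(route_model("short prompt", mode="aggressive"))  # cheap tier
```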
## Real-World Example

Route dynamically in production to cut costs while maintaining quality.

## When to Use
Use the router when:

- Processing varied user requests (simple to complex)
- You want to minimize API costs automatically
- Building cost-conscious AI products
Avoid the router when:

- All tasks need the same model tier
- The ~430ms routing latency matters more than cost savings
- You need maximum predictability
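The production pattern from the Real-World Example, combined with the fallback advice in the Error Handling section below, might be sketched like this; `route_model`, the simulated failure, and the fallback choice are all assumptions, so swap in your real routing call:

```python
# Sketch: dynamic routing in production with a safety net. If the router
# is slow or unreachable, fall back to a fixed model instead of failing
# the user's request. route_model() is a stand-in for the real call.

FALLBACK_MODEL = "gemini-2.5-flash"  # assumed choice; any reliable default works

def route_model(prompt: str) -> str:
    raise TimeoutError("router unreachable")  # simulate an outage

def pick_model(prompt: str) -> str:
    try:
        return route_model(prompt)
    except Exception:
        return FALLBACK_MODEL  # degrade gracefully rather than erroring

print(pick_model("Classify this support ticket."))
```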
## API Reference

Supported providers: `openai` | `anthropic` | `gemini`
## Error Handling

Always provide a fallback model so requests still succeed if the routing call fails.

## Performance
- Latency: ~430ms average
- Parallel: Run routing while preparing your request
- HTTP/2: Connection reuse for subsequent calls
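The "Parallel" tip above can be sketched with `asyncio`: start the routing call and request preparation together so the ~430ms routing latency overlaps other work. The helper names and sleep durations below are illustrative stand-ins:

```python
import asyncio

# Sketch of overlapping the ~430 ms routing latency with request prep.
# route_model_async() and prepare_messages() are illustrative stand-ins;
# the sleeps simulate network and preparation time.

async def route_model_async(prompt: str) -> str:
    await asyncio.sleep(0.43)  # simulated routing latency
    return "gemini-2.5-flash"

async def prepare_messages(prompt: str) -> list[dict]:
    await asyncio.sleep(0.30)  # simulated retrieval / prompt assembly
    return [{"role": "user", "content": prompt}]

async def handle(prompt: str) -> tuple[str, list[dict]]:
    # Run both concurrently: total wait is ~max(0.43, 0.30), not the sum.
    model, messages = await asyncio.gather(
        route_model_async(prompt), prepare_messages(prompt)
    )
    return model, messages

model, messages = asyncio.run(handle("Hello"))
print(model)
```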