Models

3 abstract model aliases. Each routes to the optimal underlying provider.

Why abstract names? We don't expose real provider models. This lets us swap underlying providers for better performance or pricing without breaking your integration. Your code always uses cheap-model, fast-model, or best-model.

cheap-model💰 Budget

Maximum cost efficiency. Best for high-volume tasks, summarization, and simple Q&A.

128K
Context
$0.14
Input /1M
~800ms
Latency
fast-model⚡ Speed

Optimized for low latency. Ideal for real-time chat and interactive applications.

64K
Context
$0.50
Input /1M
~200ms
Latency
best-model🧠 Quality

Maximum quality and reasoning. For complex analysis, code generation, deep understanding.

128K
Context
$0.22
Input /1M
~1.2s
Latency

Which model should I use?

💰 cheap-model

Batch processing, summarization, classification

⚡ fast-model

Real-time chat, interactive UX, low-latency APIs

🧠 best-model

Code generation, complex reasoning, multilingual