Models

There are three abstract model aliases. Each one routes to the underlying provider best suited to its goal: cost, speed, or quality.

Why abstract names? Your integration never references a provider's own model names, so we can swap the underlying provider for better performance or pricing without breaking your code. Always pass cheap-model, fast-model, or best-model.
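One way to keep an integration tied only to the aliases is to resolve the alias in a single place and validate it against the three names above. The sketch below is illustrative only: the environment variable name MODEL_ALIAS and the helper resolve_alias are assumptions made for this example, not part of the documented API.

```python
import os

# The three aliases documented on this page.
KNOWN_ALIASES = ("cheap-model", "fast-model", "best-model")


def resolve_alias(default: str = "cheap-model", env_var: str = "MODEL_ALIAS") -> str:
    """Return the alias to use, read from the environment with a fallback.

    MODEL_ALIAS is a hypothetical variable name chosen for this sketch.
    """
    alias = os.environ.get(env_var, default).strip()
    if alias not in KNOWN_ALIASES:
        raise ValueError(
            f"unknown model alias {alias!r}; expected one of {KNOWN_ALIASES}"
        )
    return alias
```

Because the alias is read in one place, switching from cheap-model to best-model becomes a configuration change rather than a code change.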

Model	Provider	Context	Tag	Input / 1M tokens	Output / 1M tokens	Latency
cheap-model	DeepSeek	128K	💰 Budget	$0.14	$0.18	~800ms
fast-model	OpenAI	64K	⚡ Speed	$0.50	$0.65	~200ms
best-model	Qwen	128K	🧠 Quality	$0.22	$0.29	~1.2s
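The prices in the table can be turned into a rough per-request cost estimate. This is a minimal sketch that assumes "/1M" means "per one million tokens" and that prices are in US dollars; the names Pricing, PRICING, and estimate_cost are hypothetical helpers for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens (assumed unit)
    output_per_million: float  # USD per 1M output tokens (assumed unit)


# Values copied from the table above.
PRICING = {
    "cheap-model": Pricing(0.14, 0.18),
    "fast-model": Pricing(0.50, 0.65),
    "best-model": Pricing(0.22, 0.29),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one workload on the given alias."""
    price = PRICING[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000
```

For example, 2,000,000 input tokens and 500,000 output tokens on cheap-model come to about $0.37 (2 × $0.14 + 0.5 × $0.18).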

cheap-model 💰 Budget

Maximum cost efficiency. Best for high-volume tasks, summarization, and simple Q&A.

Context: 128K
Input /1M: $0.14
Latency: ~800ms

fast-model ⚡ Speed

Optimized for low latency. Ideal for real-time chat and interactive applications.

Context: 64K
Input /1M: $0.50
Latency: ~200ms

best-model 🧠 Quality

Maximum quality and reasoning. For complex analysis, code generation, and deep understanding.

Context: 128K
Input /1M: $0.22
Latency: ~1.2s

Which model should I use?

💰 cheap-model: Batch processing, summarization, classification

⚡ fast-model: Real-time chat, interactive UX, low-latency APIs

🧠 best-model: Code generation, complex reasoning, multilingual
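The recommendations above can be encoded as a simple lookup. This is a sketch under stated assumptions: the helper recommend_model and the fallback to cheap-model for unlisted use cases are illustrative choices, not documented behavior.

```python
# Use cases taken from the list above, mapped to their recommended alias.
RECOMMENDED = {
    "batch processing": "cheap-model",
    "summarization": "cheap-model",
    "classification": "cheap-model",
    "real-time chat": "fast-model",
    "interactive ux": "fast-model",
    "low-latency apis": "fast-model",
    "code generation": "best-model",
    "complex reasoning": "best-model",
    "multilingual": "best-model",
}


def recommend_model(use_case: str, default: str = "cheap-model") -> str:
    """Return the recommended alias for a use case.

    Matching is case-insensitive. Unlisted use cases fall back to `default`,
    which is an assumption of this sketch.
    """
    return RECOMMENDED.get(use_case.strip().lower(), default)
```

A caller might use recommend_model("Code generation") to pick best-model, and override the default when an unlisted workload is latency-sensitive.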