AI Models

The underlying AI models that power our agents, each with different capabilities.

Auto

Automatic model routing through the Vercel AI Gateway. Picks the best available model for each request and falls back to alternates if the primary is overloaded or unavailable.

Auto
Free Smart default

Automatically picks the best available model with seamless fallback if any provider is overloaded.
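
The fallback behaviour described above can be sketched as a loop over an ordered list of candidate models. This is only an illustration under stated assumptions, not the Vercel AI Gateway API: `call_model`, `ModelUnavailable`, and the model identifiers are hypothetical names invented for this example.

```python
from typing import Callable, Optional, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error raised when a provider is overloaded or down."""


def route_with_fallback(
    prompt: str,
    candidates: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each candidate in order; return (model_used, response)."""
    last_error: Optional[Exception] = None
    for model in candidates:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as err:
            # Remember the failure and move on to the next alternate.
            last_error = err
    raise RuntimeError("all candidate models were unavailable") from last_error


# Stand-in client (hypothetical) whose primary model is overloaded.
def fake_call(model: str, prompt: str) -> str:
    if model == "primary-model":
        raise ModelUnavailable(model)
    return f"{model} answered: {prompt}"


used, reply = route_with_fallback(
    "hello", ["primary-model", "alternate-model"], fake_call
)
print(used)  # alternate-model
```

Keeping the candidate order explicit makes the preference between primary and alternates easy to audit; a real router would presumably also weigh cost, latency, and plan tier.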

Anthropic Claude

Anthropic is an AI safety company and public benefit corporation that builds the Claude family of large language models. Founded in 2021, Anthropic focuses on developing reliable, steerable AI designed to be helpful, harmless, and honest.

Claude Haiku 4.5
Free Fastest

Fast and intelligent model for quick tasks

Context: 200K
Max Output: 64K
Cutoff: February 2025
Claude Sonnet 4.6
Premium Recommended

The best combination of speed and intelligence, with adaptive thinking and long context support

Context: 1M
Max Output: 64K
Cutoff: August 2025
Claude Sonnet 4.5
Premium

Previous-generation Sonnet with long context and extended thinking

Context: 1M
Max Output: 64K
Cutoff: January 2025
Claude Opus 4.8
Premium Most intelligent

Most capable Claude model with a step-change in agentic coding and complex reasoning

Context: 1M
Max Output: 128K
Cutoff: January 2026
Claude Opus 4.7
Premium

Previous-generation Opus with a step-change in agentic coding and complex reasoning

Context: 1M
Max Output: 128K
Cutoff: January 2026
Claude Opus 4.6
Premium

Previous-generation Opus with adaptive thinking for complex reasoning and agentic workflows

Context: 1M
Max Output: 128K
Cutoff: May 2025
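
When sizing requests against the Context and Max Output figures above, one common assumption is that the context window covers input and output tokens together, so reserving the full output limit leaves the remainder for the prompt. The sketch below applies that arithmetic to the Claude Haiku 4.5 figures listed here. Treating K as 1,000 tokens and the shared-window assumption are simplifications for illustration, not confirmed platform behaviour, and `max_prompt_tokens` is a hypothetical helper.

```python
def max_prompt_tokens(context_tokens: int, reserved_output_tokens: int) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    if reserved_output_tokens > context_tokens:
        raise ValueError("reserved output exceeds the context window")
    return context_tokens - reserved_output_tokens


# Claude Haiku 4.5 as listed: Context 200K, Max Output 64K (K taken as 1,000).
print(max_prompt_tokens(200_000, 64_000))  # 136000
```
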
Google Gemini

Google DeepMind's Gemini models offer multimodal understanding with large context windows, strong reasoning, and efficient performance across tasks.

Gemini 3.5 Flash
Free Fast general use

Near-Pro reasoning and coding at Flash-tier cost and speed. Dynamic thinking is on by default.

Context: 1.05M
Max Output: 65.5K
Cutoff: January 2025
Gemini 3.1 Pro
Premium Experimental

Google's most advanced Gemini 3.1 model with powerful agentic capabilities, multimodal understanding, and state-of-the-art reasoning.

Context: 1.05M
Max Output: 65.5K
Cutoff: January 2025
Mistral AI

Mistral AI builds efficient, open-weight language models. Known for strong multilingual support, fast inference, and competitive performance at lower cost.

Mistral Small 4
Free

Hybrid model optimized for general chat, coding, agentic tasks, and complex reasoning with text and image input support

Context: 256K
Max Output: 4.1K
Mistral Medium 3.5
Free

Balanced model for most tasks with good performance and cost efficiency

Context: 256K
Max Output: 4.1K
Mistral Large 3
Free Advanced reasoning

Mistral's most capable model with strong multilingual support and advanced reasoning for complex tasks

Context: 256K
Max Output: 4.1K
OpenAI GPT

OpenAI's GPT models deliver strong general-purpose intelligence with broad knowledge, creative writing, and code generation capabilities.

GPT 5.5
Premium Most capable

OpenAI's frontier model: the first fully retrained base model since GPT-4.5, with a 1M context window and strong general reasoning. Priced like Claude Opus; available on the Pro plan and above.

Context: 1M
Max Output: 272K
Cutoff: December 2025
GPT 5.4
Premium

GPT 5.4 offers a 1M context window, built-in computer-use capabilities, and optional reasoning. It is the OpenAI counterpart to Claude Sonnet for users on the Basic plan and above.

Context: 1.05M
Max Output: 128K
Cutoff: August 2025
GPT 5.4 mini
Free Fast

GPT 5.4 mini is an optimized version of GPT 5.4 — fast and efficient for everyday tasks.

Context: 400K
Max Output: 128K
Cutoff: August 2025
GPT 4.1
Premium 1M context

The last release in the GPT 4 line

Context: 1.05M
Max Output: 32.8K
GPT OSS 120B
Free

OpenAI's open-weights 120B-parameter model.

Context: 128K
Max Output: 4.1K
Cutoff: June 2024
xAI

xAI's Grok models — frontier reasoning with very large context windows.

Grok 4.1 Fast
Free 2M context

xAI's fast model with an unprecedented 2 million token context window.

Context: 2M
Max Output: 4.1K
Grok 4.3
Premium 1M context

xAI's advanced reasoning model with strong coding and agentic capabilities and a 1 million token context window.

Context: 1M
Max Output: 32.8K
DeepSeek

DeepSeek's efficient open models with strong coding and math reasoning.

DeepSeek V3.2
Free

DeepSeek's efficient reasoning model with strong coding and math capabilities.

Context: 163.8K
Max Output: 4.1K
DeepSeek V4 Flash
Premium 1M context

DeepSeek's fast and efficient V4 model optimized for speed with a 1 million token context window.

Context: 1M
Max Output: 16.4K
DeepSeek V4 Pro
Premium 1M context

DeepSeek's flagship V4 reasoning model with strong coding and math capabilities and a 1 million token context window.

Context: 1M
Max Output: 32.8K
Moonshot AI

Moonshot AI's Kimi models — long-context reasoning.

Kimi K2.5
Free

Moonshot AI's advanced reasoning model with strong performance on complex tasks.

Context: 262.1K
Max Output: 4.1K
Z-AI

Z-AI's GLM models — fast, multilingual general-purpose intelligence.

GLM-5 Turbo
Free

Z-AI's fast and efficient general-purpose model with strong multilingual capabilities.

Context: 202.8K
Max Output: 4.1K
MiniMax

MiniMax's general-purpose models with large context support.

MiniMax M2.7
Free

MiniMax's latest general-purpose model with strong reasoning and large context support.

Context: 204.8K
Max Output: 4.1K
Qwen

Alibaba's Qwen models — high-performance reasoning with up to 1M-token context.

Qwen 3.7 Max
Premium 1M context

Qwen's flagship model with state-of-the-art reasoning and a 1 million token context window.

Context: 1M
Max Output: 32.8K
Qwen 3.5 Plus
Free 1M context

Qwen's high-performance model with a 1 million token context window.

Context: 1M
Max Output: 4.1K
Qwen 3.5 Flash
Free Fast, 1M context

Qwen's fast and efficient model, optimized for speed, with a 1M context window.

Context: 1M
Max Output: 4.1K
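
The Context and Max Output figures in this list use K and M shorthand. The sketch below turns them into approximate token counts and filters a few of the entries above by tier and context size. The `ModelCard` record and the `parse_tokens` and `free_models_with_context` helpers are hypothetical names for this example, and the listed figures appear rounded (4.1K likely stands for 4,096, for instance), so the parsed values are approximations.

```python
from dataclasses import dataclass


def parse_tokens(text: str) -> int:
    """Convert shorthand such as '200K' or '1.05M' to an approximate count."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper()
    if suffix in multipliers:
        return round(float(text[:-1]) * multipliers[suffix])
    return int(text)


@dataclass(frozen=True)
class ModelCard:
    name: str
    tier: str  # "Free" or "Premium", as labelled in the list
    context: str
    max_output: str


# A handful of entries copied from the list above.
CARDS = [
    ModelCard("Claude Haiku 4.5", "Free", "200K", "64K"),
    ModelCard("Gemini 3.5 Flash", "Free", "1.05M", "65.5K"),
    ModelCard("Grok 4.1 Fast", "Free", "2M", "4.1K"),
    ModelCard("Qwen 3.7 Max", "Premium", "1M", "32.8K"),
]


def free_models_with_context(cards, minimum_tokens):
    """Names of Free-tier models whose context is at least minimum_tokens."""
    return [
        card.name
        for card in cards
        if card.tier == "Free" and parse_tokens(card.context) >= minimum_tokens
    ]


print(free_models_with_context(CARDS, 1_000_000))
# ['Gemini 3.5 Flash', 'Grok 4.1 Fast']
```
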