AI Models

The underlying AI models that power our agents, each with different capabilities.

Auto

Automatic model routing through the Vercel AI Gateway. Picks the best available model for each request and falls back to alternates if the primary is overloaded or unavailable.

Auto

Free Smart default

Automatically picks the best available model with seamless fallback if any provider is overloaded.
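The fallback behavior described above can be sketched in a few lines. This is a minimal illustration, not the gateway's actual implementation: `route_with_fallback`, `ProviderUnavailable`, and `fake_call` are hypothetical names, and the candidate order is an assumption chosen only for the example.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderUnavailable(Exception):
    """Raised by a (hypothetical) provider call that is overloaded or down."""


def route_with_fallback(
    prompt: str,
    candidates: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each candidate model in order and return (model_id, response)."""
    errors: List[str] = []
    for model_id in candidates:
        try:
            return model_id, call_model(model_id, prompt)
        except ProviderUnavailable as exc:
            # Record the failure and move on to the next alternate.
            errors.append(f"{model_id}: {exc}")
    raise RuntimeError("all candidates failed: " + "; ".join(errors))


def fake_call(model_id: str, prompt: str) -> str:
    # Stand-in for a real API call: pretend the first model is overloaded.
    if model_id == "claude-sonnet-4-6":
        raise ProviderUnavailable("overloaded")
    return f"[{model_id}] {prompt}"


chosen, reply = route_with_fallback(
    "hello", ["claude-sonnet-4-6", "gpt-5.4"], fake_call
)
print(chosen)  # gpt-5.4
```

A real router would also weigh cost, latency, and plan tier when ordering candidates; the sketch only shows the try-next-on-failure loop.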

Anthropic Claude

Anthropic is an AI safety company and public benefit corporation that builds the Claude family of large language models. Founded in 2021, Anthropic focuses on developing reliable, steerable AI designed to be helpful, harmless, and honest.

Claude Haiku 4.5

Free Fastest

Fast and intelligent model for quick tasks

Context: 200K

Max Output: 64K

Cutoff: February 2025

Claude Sonnet 4.6

Premium Recommended

The best combination of speed and intelligence, with adaptive thinking and long context support

Context: 1M

Max Output: 64K

Cutoff: August 2025

Claude Sonnet 4.5

Premium

Previous-generation Sonnet with long context and extended thinking

Context: 1M

Max Output: 64K

Cutoff: January 2025

Claude Opus 4.8

Premium Most intelligent

Most capable Claude model with a step-change in agentic coding and complex reasoning

Context: 1M

Max Output: 128K

Cutoff: January 2026

Claude Opus 4.7

Premium

Previous-generation Opus with a step-change in agentic coding and complex reasoning

Context: 1M

Max Output: 128K

Cutoff: January 2026

Claude Opus 4.6

Premium

Previous-generation Opus with adaptive thinking for complex reasoning and agentic workflows

Context: 1M

Max Output: 128K

Cutoff: May 2025
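The Context and Max Output figures use rounded shorthand such as 200K or 1.05M. As a rough sketch of how to work with them, the helper below converts the shorthand to integers and estimates how many tokens remain for the prompt. It assumes the output allowance counts against the context window, which is common but varies by provider. The function names are hypothetical.

```python
def parse_tokens(label: str) -> int:
    """Convert shorthand such as '200K' or '1.05M' to an integer token count."""
    text = label.strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if text and text[-1] in multipliers:
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def input_budget(context: str, max_output: str) -> int:
    """Tokens left for the prompt if the full output allowance is reserved.

    Assumption: output tokens share the context window with the input.
    """
    return parse_tokens(context) - parse_tokens(max_output)


# Figures taken from the Claude Haiku 4.5 entry.
print(parse_tokens("200K"))          # 200000
print(input_budget("200K", "64K"))   # 136000
```

Because the listed values are rounded for display, treat the results as estimates rather than exact limits.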

Google Gemini

Google DeepMind's Gemini models offer multimodal understanding with large context windows, strong reasoning, and efficient performance across tasks.

Gemini 3.5 Flash

Free Fast general use

Near-Pro reasoning and coding at Flash-tier cost and speed. Dynamic thinking is on by default.

Context: 1.05M

Max Output: 65.5K

Cutoff: January 2025

Gemini 3.1 Pro

Premium Experimental

Google's most advanced Gemini 3.1 model with powerful agentic capabilities, multimodal understanding, and state-of-the-art reasoning.

Context: 1.05M

Max Output: 65.5K

Cutoff: January 2025

Mistral AI

Mistral AI builds efficient, open-weight language models known for strong multilingual support, fast inference, and competitive performance at lower cost.

Mistral Small 4

Free

Hybrid model optimized for general chat, coding, agentic tasks, and complex reasoning with text and image input support

Context: 256K

Max Output: 4.1K

Mistral Medium 3.5

Free

Balanced model for most tasks with good performance and cost efficiency

Context: 256K

Max Output: 4.1K

Mistral Large 3

Free Advanced reasoning

Mistral's most capable model with strong multilingual support and advanced reasoning for complex tasks

Context: 256K

Max Output: 4.1K

OpenAI GPT

OpenAI's GPT models deliver strong general-purpose intelligence with broad knowledge, creative writing, and code generation capabilities.

GPT 5.5

Premium Most capable

OpenAI's frontier model: the first fully retrained base model since GPT-4.5, with a 1M context window and strong general reasoning. Priced like Claude Opus; available on the Pro plan and above.

Context: 1M

Max Output: 272K

Cutoff: December 2025

GPT 5.4

Premium

GPT-5.4 with a 1M context window, built-in computer-use capabilities, and optional reasoning. Sonnet-equivalent OpenAI option for Basic+ users.

Context: 1.05M

Max Output: 128K

Cutoff: August 2025

GPT 5.4 mini

Free Fast

GPT 5.4 mini is an optimized version of GPT 5.4 — fast and efficient for everyday tasks.

Context: 400K

Max Output: 128K

Cutoff: August 2025

GPT 4.1

Premium 1M context

Last version of GPT 4

Context: 1.05M

Max Output: 32.8K

GPT OSS 120B

Free

OpenAI's open-weights 120B-parameter model.

Context: 128K

Max Output: 4.1K

Cutoff: June 2024

xAI

xAI's Grok models — frontier reasoning with very large context windows.

Grok 4.1 Fast

Free 2M context

xAI's fast model with an unprecedented 2 million token context window.

Context: 2M

Max Output: 4.1K

Grok 4.3

Premium 1M context

xAI's advanced reasoning model with strong coding and agentic capabilities and a 1 million token context window.

Context: 1M

Max Output: 32.8K

DeepSeek

DeepSeek's efficient open models with strong coding and math reasoning.

DeepSeek V3.2

Free

DeepSeek's efficient reasoning model with strong coding and math capabilities.

Context: 163.8K

Max Output: 4.1K

DeepSeek V4 Flash

Premium 1M context

DeepSeek's fast and efficient V4 model optimized for speed with a 1 million token context window.

Context: 1M

Max Output: 16.4K

DeepSeek V4 Pro

Premium 1M context

DeepSeek's flagship V4 reasoning model with strong coding and math capabilities and a 1 million token context window.

Context: 1M

Max Output: 32.8K

Moonshot AI

Moonshot AI's Kimi models — long-context reasoning.

Kimi K2.5

Free

Moonshot AI's advanced reasoning model with strong performance on complex tasks.

Context: 262.1K

Max Output: 4.1K

Z-AI

Z-AI's GLM models — fast, multilingual general-purpose intelligence.

GLM-5 Turbo

Free

Z-AI's fast and efficient general-purpose model with strong multilingual capabilities.

Context: 202.8K

Max Output: 4.1K

MiniMax

MiniMax's general-purpose models with large context support.

MiniMax M2.7

Free

MiniMax's latest general-purpose model with strong reasoning and large context support.

Context: 204.8K

Max Output: 4.1K

Qwen

Alibaba's Qwen models — high-performance reasoning with up to 1M-token context.

Qwen 3.7 Max

Premium 1M context

Qwen's flagship model with state-of-the-art reasoning and a 1 million token context window.

Context: 1M

Max Output: 32.8K

Qwen 3.5 Plus

Free 1M context

Qwen's high-performance model with a 1 million token context window.

Context: 1M

Max Output: 4.1K

Qwen 3.5 Flash

Free Fast, 1M context

Qwen's fast and efficient model optimized for speed with 1M context.

Context: 1M

Max Output: 4.1K

Auto

Auto

Free Smart default

Automatically picks the best available model with seamless fallback if any provider is overloaded.

Model ID: auto

Anthropic Claude

Claude Haiku 4.5

Free Fastest

Fast and intelligent model for quick tasks

Context Window: 200K tokens

Max Output: 64K tokens

Knowledge Cutoff: February 2025

Model ID: claude-haiku-4-5

Claude Sonnet 4.6

Premium Recommended

The best combination of speed and intelligence, with adaptive thinking and long context support

Context Window: 1M tokens

Max Output: 64K tokens

Knowledge Cutoff: August 2025

Model ID: claude-sonnet-4-6

Claude Sonnet 4.5

Premium

Previous-generation Sonnet with long context and extended thinking

Context Window: 1M tokens

Max Output: 64K tokens

Knowledge Cutoff: January 2025

Model ID: claude-sonnet-4-5

Claude Opus 4.8

Premium Most intelligent

Most capable Claude model with a step-change in agentic coding and complex reasoning

Context Window: 1M tokens

Max Output: 128K tokens

Knowledge Cutoff: January 2026

Model ID: claude-opus-4-8

Claude Opus 4.7

Premium

Previous-generation Opus with a step-change in agentic coding and complex reasoning

Context Window: 1M tokens

Max Output: 128K tokens

Knowledge Cutoff: January 2026

Model ID: claude-opus-4-7

Claude Opus 4.6

Premium

Previous-generation Opus with adaptive thinking for complex reasoning and agentic workflows

Context Window: 1M tokens

Max Output: 128K tokens

Knowledge Cutoff: May 2025

Model ID: claude-opus-4-6

Google Gemini

Gemini 3.5 Flash

Free Fast general use

Near-Pro reasoning and coding at Flash-tier cost and speed. Dynamic thinking is on by default.

Context Window: 1.05M tokens

Max Output: 65.5K tokens

Knowledge Cutoff: January 2025

Model ID: google/gemini-3.5-flash

Gemini 3.1 Pro

Premium Experimental

Google's most advanced Gemini 3.1 model with powerful agentic capabilities, multimodal understanding, and state-of-the-art reasoning.

Context Window: 1.05M tokens

Max Output: 65.5K tokens

Knowledge Cutoff: January 2025

Model ID: google/gemini-3.1-pro-preview

Mistral AI

Mistral Small 4

Free

Hybrid model optimized for general chat, coding, agentic tasks, and complex reasoning with text and image input support

Context Window: 256K tokens

Max Output: 4.1K tokens

Model ID: mistral-small-latest

Mistral Medium 3.5

Free

Balanced model for most tasks with good performance and cost efficiency

Context Window: 256K tokens

Max Output: 4.1K tokens

Model ID: mistral-medium-latest

Mistral Large 3

Free Advanced reasoning

Mistral's most capable model with strong multilingual support and advanced reasoning for complex tasks

Context Window: 256K tokens

Max Output: 4.1K tokens

Model ID: mistral-large-latest

OpenAI GPT

GPT 5.5

Premium Most capable

OpenAI's frontier model: the first fully retrained base model since GPT-4.5, with a 1M context window and strong general reasoning. Priced like Claude Opus; available on the Pro plan and above.

Context Window: 1M tokens

Max Output: 272K tokens

Knowledge Cutoff: December 2025

Model ID: gpt-5.5

GPT 5.4

Premium

GPT-5.4 with a 1M context window, built-in computer-use capabilities, and optional reasoning. Sonnet-equivalent OpenAI option for Basic+ users.

Context Window: 1.05M tokens

Max Output: 128K tokens

Knowledge Cutoff: August 2025

Model ID: gpt-5.4

GPT 5.4 mini

Free Fast

GPT 5.4 mini is an optimized version of GPT 5.4 — fast and efficient for everyday tasks.

Context Window: 400K tokens

Max Output: 128K tokens

Knowledge Cutoff: August 2025

Model ID: gpt-5.4-mini

GPT 4.1

Premium 1M context

Last version of GPT 4

Context Window: 1.05M tokens

Max Output: 32.8K tokens

Model ID: gpt-4.1

GPT OSS 120B

Free

OpenAI's open-weights 120B-parameter model.

Context Window: 128K tokens

Max Output: 4.1K tokens

Knowledge Cutoff: June 2024

Model ID: gpt-oss-120b

xAI

Grok 4.1 Fast

Free 2M context

xAI's fast model with an unprecedented 2 million token context window.

Context Window: 2M tokens

Max Output: 4.1K tokens

Model ID: xai/grok-4.1-fast

Grok 4.3

Premium 1M context

xAI's advanced reasoning model with strong coding and agentic capabilities and a 1 million token context window.

Context Window: 1M tokens

Max Output: 32.8K tokens

Model ID: xai/grok-4.3

DeepSeek

DeepSeek V3.2

Free

DeepSeek's efficient reasoning model with strong coding and math capabilities.

Context Window: 163.8K tokens

Max Output: 4.1K tokens

Model ID: deepseek/deepseek-v3.2

DeepSeek V4 Flash

Premium 1M context

DeepSeek's fast and efficient V4 model optimized for speed with a 1 million token context window.

Context Window: 1M tokens

Max Output: 16.4K tokens

Model ID: deepseek/deepseek-v4-flash

DeepSeek V4 Pro

Premium 1M context

DeepSeek's flagship V4 reasoning model with strong coding and math capabilities and a 1 million token context window.

Context Window: 1M tokens

Max Output: 32.8K tokens

Model ID: deepseek/deepseek-v4-pro

Moonshot AI

Kimi K2.5

Free

Moonshot AI's advanced reasoning model with strong performance on complex tasks.

Context Window: 262.1K tokens

Max Output: 4.1K tokens

Model ID: moonshotai/kimi-k2.5

Z-AI

GLM-5 Turbo

Free

Z-AI's fast and efficient general-purpose model with strong multilingual capabilities.

Context Window: 202.8K tokens

Max Output: 4.1K tokens

Model ID: zai/glm-5-turbo

MiniMax

MiniMax M2.7

Free

MiniMax's latest general-purpose model with strong reasoning and large context support.

Context Window: 204.8K tokens

Max Output: 4.1K tokens

Model ID: minimax/minimax-m2.7

Qwen

Qwen 3.7 Max

Premium 1M context

Qwen's flagship model with state-of-the-art reasoning and a 1 million token context window.

Context Window: 1M tokens

Max Output: 32.8K tokens

Model ID: alibaba/qwen3.7-max

Qwen 3.5 Plus

Free 1M context

Qwen's high-performance model with a 1 million token context window.

Context Window: 1M tokens

Max Output: 4.1K tokens

Model ID: alibaba/qwen3.5-plus-02-15

Qwen 3.5 Flash

Free Fast, 1M context

Qwen's fast and efficient model optimized for speed with 1M context.

Context Window: 1M tokens

Max Output: 4.1K tokens

Model ID: alibaba/qwen3.5-flash-02-23
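To show how the Model IDs and limits listed above might be used together, here is a small sketch that stores a few entries in a lookup table and filters them by required context window, output size, and tier. `ModelSpec`, `CATALOG`, and `candidates` are hypothetical names. The token counts are the rounded figures from the listing, not exact limits, and only four entries are included for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelSpec:
    model_id: str
    tier: str        # "Free" or "Premium", as labeled in the listing
    context: int     # context window in tokens (rounded)
    max_output: int  # max output in tokens (rounded)


CATALOG = [
    ModelSpec("claude-haiku-4-5", "Free", 200_000, 64_000),
    ModelSpec("gpt-5.4-mini", "Free", 400_000, 128_000),
    ModelSpec("xai/grok-4.1-fast", "Free", 2_000_000, 4_100),
    ModelSpec("claude-sonnet-4-6", "Premium", 1_000_000, 64_000),
]


def candidates(
    min_context: int, min_output: int = 0, tier: Optional[str] = None
) -> List[str]:
    """Return IDs of models that satisfy the limits, smallest window first."""
    matches = [
        m for m in CATALOG
        if m.context >= min_context
        and m.max_output >= min_output
        and (tier is None or m.tier == tier)
    ]
    matches.sort(key=lambda m: m.context)
    return [m.model_id for m in matches]


print(candidates(300_000, tier="Free"))
# ['gpt-5.4-mini', 'xai/grok-4.1-fast']
print(candidates(300_000, min_output=32_000))
# ['gpt-5.4-mini', 'claude-sonnet-4-6']
```

Sorting by context window is just one possible preference. Ranking by speed or cost would need data that the listing does not provide.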