LLM API Pricing Cheat Sheet: Every Model, Every Provider (April 2026)
⚠️ Deprecation alert: Claude Opus 4 and Claude Sonnet 4 are retiring on June 15, 2026. If you're using these models, see our migration guide for step-by-step instructions.
Stop jumping between pricing pages. Here's every major LLM API priced side by side — input costs, output costs, context windows, and real cost-per-use examples. Bookmark this page and check back when providers update their rates.
Complete Pricing Table
All prices are per 1M tokens.
| Provider | Model | Input | Output | Context | Tier |
|---|---|---|---|---|---|
| OpenAI | GPT-4o | $2.50 | $10.00 | 128K | Premium |
| OpenAI | GPT-4o mini | $0.15 | $0.60 | 128K | Budget |
| Anthropic | Claude Sonnet 4 | $3.00 | $15.00 | 200K | Premium |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | 200K | Budget |
| Google | Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Premium |
| Google | Gemini 2.0 Flash | $0.10 | $0.40 | 1M | Budget |
| Mistral | Large | $2.00 | $6.00 | 128K | Premium |
| Mistral | Small | $0.10 | $0.30 | 32K | Budget |
| Cohere | Command R+ | $2.50 | $10.00 | 128K | Premium |
| Cohere | Command R | $0.15 | $0.60 | 128K | Budget |
| Meta (Together.ai) | Llama 3.1 70B | $0.88 | $0.88 | 128K | Budget |
| Meta (Together.ai) | Llama 3.1 8B | $0.18 | $0.18 | 128K | Budget |
| AI21 | Jamba 1.5 Large | $2.00 | $8.00 | 256K | Premium |
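One way to work with the table is to encode it as data and rank models programmatically. The sketch below is only an illustration: `PRICES` and `cheapest_by_tier` are hypothetical names, the Gemini rows are attributed to Google, and ties on input price are broken by output price (an assumption, not a rule from this page).

```python
# (provider, model, input $/1M, output $/1M, tier), copied from the table above
PRICES = [
    ("OpenAI", "GPT-4o", 2.50, 10.00, "Premium"),
    ("OpenAI", "GPT-4o mini", 0.15, 0.60, "Budget"),
    ("Anthropic", "Claude Sonnet 4", 3.00, 15.00, "Premium"),
    ("Anthropic", "Claude Haiku 4.5", 1.00, 5.00, "Budget"),
    ("Google", "Gemini 2.5 Pro", 1.25, 10.00, "Premium"),
    ("Google", "Gemini 2.0 Flash", 0.10, 0.40, "Budget"),
    ("Mistral", "Large", 2.00, 6.00, "Premium"),
    ("Mistral", "Small", 0.10, 0.30, "Budget"),
    ("Cohere", "Command R+", 2.50, 10.00, "Premium"),
    ("Cohere", "Command R", 0.15, 0.60, "Budget"),
    ("Meta (Together.ai)", "Llama 3.1 70B", 0.88, 0.88, "Budget"),
    ("Meta (Together.ai)", "Llama 3.1 8B", 0.18, 0.18, "Budget"),
    ("AI21", "Jamba 1.5 Large", 2.00, 8.00, "Premium"),
]


def cheapest_by_tier(rows):
    """Return {tier: row} holding the lowest (input, output) price pair per tier."""
    best = {}
    for row in rows:
        tier = row[4]
        key = (row[2], row[3])  # compare input price first, then output price
        if tier not in best or key < (best[tier][2], best[tier][3]):
            best[tier] = row
    return best


for tier, row in cheapest_by_tier(PRICES).items():
    print(f"{tier}: {row[0]} {row[1]} (${row[2]:.2f} in / ${row[3]:.2f} out)")
```

Ranking by input price alone favors read-heavy workloads; for generation-heavy workloads, sorting on output price first may be the better choice.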
Cheapest Models by Tier
Budget Tier ($1.00/M input or less)
Premium Tier (Over $1.00/M input)
Real-World Cost Examples
Here's what you'd actually pay for common workloads. The baseline assumes 1,000 requests/day with 500 input tokens and 200 output tokens per request; each heading below states its own request volume.
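The arithmetic behind these estimates is simple: multiply each token count by its per-1M price, sum the two, divide by 1,000,000, then scale by request volume. A minimal sketch follows, assuming a 30-day month (the page does not state a month length) and a hypothetical helper named `monthly_cost`:

```python
def monthly_cost(input_price, output_price, requests_per_day=1000,
                 input_tokens=500, output_tokens=200, days=30):
    """Estimated monthly spend in dollars; prices are per 1M tokens.

    The 30-day month is an assumption made for this sketch.
    """
    per_request = (input_tokens * input_price
                   + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_day * days


# A few rows from the pricing table: model -> (input, output) per 1M tokens
examples = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Sonnet 4": (3.00, 15.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}

for model, (inp, out) in examples.items():
    print(f"{model}: ${monthly_cost(inp, out):.2f}/month")
```

Under these assumptions GPT-4o works out to about $97.50 per month and GPT-4o mini to about $5.85, which shows how strongly the output price drives the total even when responses are shorter than prompts.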
Chatbot (1K requests/day)
Code Generation (1K requests/day)
Document Analysis (100 requests/day)
Context Window Comparison
| Context Window | Models | Best For |
|---|---|---|
| 32K | Mistral Small 4 | Short prompts, classification, simple Q&A |
| 128K | GPT-4o, GPT-4o mini, Mistral Large 3, Cohere Command R/R+, Llama 3.1 | Most use cases, multi-turn chat, code generation |
| 200K | Claude Sonnet 4, Claude Haiku 4.5 | Long documents, large codebases, book-length analysis |
| 256K | AI21 Jamba 1.5 Large | Very long documents, legal contracts, research papers |
| 1M | Gemini 2.5 Pro, Gemini 2.0 Flash | Entire codebases, video analysis, massive datasets |
Quick Decision Guide
- Cheapest overall: Mistral Small 4 ($0.10/$0.30) — but only 32K context
- Cheapest with decent context: Gemini 2.0 Flash ($0.10/$0.40) — 1M context at budget price
- Best quality per dollar (premium): Gemini 2.5 Pro ($1.25/$10.00) — cheapest premium with 1M context
- Best for code: Claude Sonnet 4 ($3.00/$15.00) — strongest coding benchmarks
- Best for chat: GPT-4o ($2.50/$10.00) — most natural conversation
- Best open-source option: Llama 3.1 70B via Together.ai ($0.88/$0.88) — symmetric pricing
- Best for long documents: Gemini 2.5 Pro — 1M context window eliminates chunking
How to Use This Data
Don't just pick the cheapest model. Use the APIpulse Calculator to model your specific usage pattern. The right model depends on your input/output ratio, request volume, and quality requirements.
A model that costs 5x more but produces results that need no editing can actually be cheaper than a budget model that requires human review.
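To make that trade-off concrete, here is a rough sketch that folds human review into the per-request cost. Every number in it is hypothetical (the API costs, the review rate, and the cost of one review are made up for illustration), and `effective_cost_per_request` is not part of any tool mentioned on this page.

```python
def effective_cost_per_request(api_cost, review_rate, review_cost):
    """API cost plus the expected cost of human review for one request.

    review_rate is the fraction of outputs that need a human pass (0.0 to 1.0).
    """
    return api_cost + review_rate * review_cost


# Hypothetical inputs, chosen only to illustrate the comparison
REVIEW_COST = 0.50  # dollars of reviewer time per reviewed output (assumed)

budget = effective_cost_per_request(api_cost=0.001, review_rate=0.10,
                                    review_cost=REVIEW_COST)
premium = effective_cost_per_request(api_cost=0.005, review_rate=0.0,
                                     review_cost=REVIEW_COST)

print(f"budget model:  ${budget:.4f} per request")
print(f"premium model: ${premium:.4f} per request")
```

With these made-up inputs the model that costs 5x more per call ends up roughly ten times cheaper per request once review time is counted. Plug in your own review rate and labor cost before drawing conclusions.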
Related Reading
- AI API Pricing June 2026: Complete Guide to All 39 Models
- LLM API Pricing Report Q2 2026: Every Model, Every Provider
- The Cheapest LLM APIs in 2026: A Complete Ranking
- How to Reduce Your AI API Costs by 40%