AI Pricing Calculator

Estimate your monthly cost across all major AI APIs. Adjust the sliders, see what each model would cost.

Monthly Cost by Model

Sorted cheapest → most expensive

Gemini 3.1 Flash Lite

Fast & Cheap★ Best Value

Google · $0.1/M in · $0.4/M out

$15.6

per month

$0.52/day

DeepSeek V4

Frontier

DeepSeek · $0.27/M in · $1.1/M out

Best price/perf ratio in frontier tier

$42.6

per month

$1.42/day

GPT-5.5 mini

Fast & Cheap

OpenAI · $0.3/M in · $1.2/M out · $0.03/M cached

$46.8

per month

$1.56/day

Mistral Medium 3.5

Balanced

Mistral · $0.4/M in · $2/M out

$72

per month

$2.4/day

Llama 4 70B

Balanced

Meta (via Together) · $0.88/M in · $0.88/M out

$73.92

per month

$2.46/day

Gemini 3.1 Flash

Balanced

Google · $0.3/M in · $2.5/M out · $0.08/M cached

$78

per month

$2.6/day

Grok 4 mini

Balanced

xAI · $0.5/M in · $2/M out

$78

per month

$2.6/day

Claude Haiku 4.5

Fast & Cheap

Anthropic · $0.8/M in · $4/M out · $0.08/M cached

$144

per month

$4.8/day

o4-mini

Reasoning

OpenAI · $1.1/M in · $4.4/M out

Affordable reasoning model

$171.6

per month

$5.72/day

Mistral Large 3

Frontier

Mistral · $2/M in · $6/M out

$264

per month

$8.8/day

Gemini 3.1 Pro

Frontier

Google · $1.25/M in · $10/M out · $0.31/M cached

1M token context

$315

per month

$10.5/day

Llama 4 405B

Frontier

Meta (via Together) · $5/M in · $5/M out

Open weights — also runnable locally for free

$420

per month

$14/day

Claude Sonnet 4.6

Balanced

Anthropic · $3/M in · $15/M out · $0.3/M cached

$540

per month

$18/day

Grok 4

Frontier

xAI · $5/M in · $15/M out

$660

per month

$22/day

GPT-5.5

Frontier

OpenAI · $5/M in · $20/M out · $0.5/M cached

$780

per month

$26/day

o3

Reasoning

OpenAI · $10/M in · $40/M out

Reasoning model — slower, much more capable on math/logic

$1,560

per month

$52/day

GPT-5.5 Pro

Frontier

OpenAI · $15/M in · $60/M out

Higher reasoning, Pro/Business/Enterprise

$2,340

per month

$78/day

Claude Opus 4.7

Frontier

Anthropic · $15/M in · $75/M out · $1.5/M cached

$2,700

per month

$90/day

How to use this calculator

Input tokens = everything the model reads (your prompt, attached files, conversation history). Output tokens = what the model writes back.

Rule of thumb: 1 token ≈ 0.75 words. A 1,000-word document is ~1,300 tokens.

Cached input ratio matters because most providers offer steep discounts (10x) on tokens that hit their cache. For applications with consistent system prompts or repeated context, cache hit rates of 50-90% are achievable.

Pricing data is from May 2026. Providers update prices regularly — check official pricing pages before committing to production budgets. Open-weight models like Llama 4 405B and DeepSeek V4 can also be self-hosted for unlimited usage if you have the hardware.