▸ Live calculator · No signup

AI Infra Cost Calculator

Tokens × volume × model mix + compute + storage. Defaults sourced from real client data, November 2025. Tweak anything.

Workload

e.g. 200,000 requests/month ≈ 6,700/day
35% is typical with prompt caching for repeated system prompts

Model mix

Frontier: 20% of requests · ~$15 input / $75 output per Mtok
Mid-tier: 55% of requests · ~$3 input / $15 output per Mtok
Small: 25% of requests · ~$0.80 input / $4 output per Mtok
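
As a rough illustration of how per-tier inference spend could follow from the list prices above, here is a minimal sketch. The tier names are borrowed from the Breakdown labels and matched to the price rows by price order, which is an assumption. The per-request token counts (`input_tokens`, `output_tokens`) and the function name `tier_spend` are also hypothetical. The calculator's actual formula is not shown on this page.

```python
# Shares and per-Mtok list prices (USD) from the model mix above.
# Mapping of tier names to rows is assumed from price order.
TIERS = {
    "frontier": {"share": 0.20, "input": 15.00, "output": 75.00},
    "mid_tier": {"share": 0.55, "input": 3.00, "output": 15.00},
    "small": {"share": 0.25, "input": 0.80, "output": 4.00},
}


def tier_spend(monthly_requests, input_tokens, output_tokens, tiers=TIERS):
    """Return monthly inference spend per tier in USD, before cache savings.

    input_tokens and output_tokens are average tokens per request
    (hypothetical inputs, not taken from the page).
    """
    spend = {}
    for name, tier in tiers.items():
        requests = monthly_requests * tier["share"]
        input_cost = requests * input_tokens / 1_000_000 * tier["input"]
        output_cost = requests * output_tokens / 1_000_000 * tier["output"]
        spend[name] = input_cost + output_cost
    return spend


# Hypothetical workload: 200,000 requests/month,
# averaging 1,500 input and 500 output tokens per request.
spend = tier_spend(200_000, 1_500, 500)
for name, usd in spend.items():
    print(f"{name}: ${usd:,.2f}")
```

With this assumed workload, the frontier tier dominates spend despite handling only 20% of requests, because its output price is the largest term.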

Infra

Compute baseline = your orchestration, gateways, observability stack.
Estimated monthly run cost
$0
Excludes engineering time. Includes inference, caching savings, vector DB, and compute baseline.
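
The line items named here can be combined as in the sketch below. It assumes the total is simply the three model-tier spends plus vector DB and compute baseline, minus cache savings, as the Breakdown suggests. The function name `monthly_run_cost` and all dollar figures are hypothetical. Cache savings is treated as an input because the page does not state how it is derived.

```python
def monthly_run_cost(frontier, mid_tier, small, vector_db, compute, cache_savings):
    """Sum the breakdown line items (USD/month); cache savings is subtracted."""
    inference = frontier + mid_tier + small
    return inference + vector_db + compute - cache_savings


# Hypothetical line items, not real calculator output.
total = monthly_run_cost(
    frontier=2400,
    mid_tier=1320,
    small=160,
    vector_db=300,
    compute=1200,
    cache_savings=500,
)
print(f"Estimated monthly run cost: ${total:,}")
```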

Breakdown

Frontier model spend: $0
Mid-tier spend: $0
Small model spend: $0
Vector DB: $0
Compute / observability: $0
Cache savings: −$0
Heads up: Model mix percentages should sum to 100%. If they don't, we rebalance them proportionally.
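
Proportional rebalancing can be sketched as dividing each share by the sum of all shares and scaling back to 100. This is an assumed reading of "rebalance proportionally"; the function name `rebalance` and the example inputs are hypothetical.

```python
def rebalance(shares):
    """Scale percentage shares so they sum to 100, keeping their ratios."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one share must be positive")
    return {name: 100 * value / total for name, value in shares.items()}


# Hypothetical inputs that sum to 120 instead of 100.
mix = rebalance({"frontier": 30, "mid_tier": 60, "small": 30})
print(mix)
```

The ratios between tiers are unchanged: here the mid-tier share stays twice each of the other two.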

These prices change. As of Nov 2025 we use vendor list prices, not committed-spend discounts. For a cost-tuned architecture, talk to us.