DeepSeek V4 Flash
Budget
by DeepSeek
DeepSeek V4 Flash is the efficiency model in the V4 family, with 284 billion total parameters and 13 billion active per token. At $0.14 per million input tokens with a 1M context window under the MIT license, it is one of the cheapest capable models available anywhere.
Input Price
$0.14
per 1M tokens
Output Price
$0.28
per 1M tokens
Context Window
1M
tokens
Released
2026-04
Open source
Capabilities
Key Strengths
- ✓Ultra-low pricing ($0.14/1M input)
- ✓1M context window
- ✓MIT open source
- ✓Strong efficiency benchmarks
Best For
- ▸High-volume chat
- ▸Budget inference at scale
- ▸Self-hosted deployments
- ▸Lightweight code tasks
Benchmark Scores
| Benchmark | Score | Description |
|---|---|---|
| MMLU-Pro | 85.2 | General knowledge and reasoning across 14 subject categories |
| HumanEval | 89.4 | Python code generation and problem solving |
| GPQA Diamond | 58.7 | Graduate-level science questions verified by domain experts |
| MATH | 82.1 | Competition-level mathematics problems |
| SWE-bench | 48.9 | Real-world software engineering tasks from GitHub issues |
Scores sourced from public benchmark datasets. See full benchmark leaderboard for all models.
Pricing Details
Input tokens
$0.14
per 1M tokens
Output tokens
$0.28
per 1M tokens
Estimated cost per 1K requests
$0.28
~1K input + ~500 output tokens avg
Prices are subject to change. Check the official documentation for current pricing. See the cost calculator for detailed estimates.
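The per-request estimate above is straightforward arithmetic over the listed rates. As a minimal sketch (prices hard-coded from the table; the function name is illustrative, not an official SDK API):

```python
# Sketch: estimate DeepSeek V4 Flash API cost from token counts.
# Prices taken from the pricing table above; adjust if pricing changes.

INPUT_PRICE_PER_M = 0.14   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.28  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return estimated USD cost for the given total token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# 1K requests averaging ~1K input + ~500 output tokens each:
cost = estimate_cost(1_000 * 1_000, 1_000 * 500)
print(f"${cost:.2f}")  # → $0.28
```

This reproduces the "$0.28 per 1K requests" figure: 1M input tokens cost $0.14 and 500K output tokens cost $0.14.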
Open Source Model
DeepSeek V4 Flash is free to download and self-host under the MIT license. Hosted API pricing varies by provider (e.g., Together, Fireworks, Groq). See our open source LLM guide for deployment options.