AI
DeepSeek releases V4 Pro and V4 Flash preview models at a fraction of frontier prices
Chinese AI lab DeepSeek has released two preview models, DeepSeek-V4-Pro and DeepSeek-V4-Flash, both featuring a 1 million token context window and a Mixture of Experts architecture. DeepSeek published the models under the standard MIT license.
DeepSeek-V4-Pro has 1.6 trillion total parameters with 49 billion active, making it the largest open-weights model available; it is larger than Kimi K2.6 at 1.1 trillion parameters and GLM-5.1 at 754 billion parameters. The Pro model's weights come to 865GB on Hugging Face. DeepSeek-V4-Flash has 284 billion total parameters with 13 billion active, at 160GB.
The company is charging $1.74 per million input tokens and $3.48 per million output tokens for Pro, and $0.14 per million input tokens and $0.28 per million output tokens for Flash. DeepSeek-V4-Flash is the cheapest of the small models, beating OpenAI's GPT-5.4 Nano. DeepSeek-V4-Pro is the cheapest of the larger frontier models.
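As a rough illustration of what those rates mean in practice, here is a minimal Python sketch that estimates per-request cost from the published prices; the request sizes are hypothetical examples, and only the per-million-token rates come from DeepSeek's pricing.

```python
# Illustrative cost arithmetic using DeepSeek's published preview pricing.
# The workload sizes below are hypothetical examples, not DeepSeek figures.

PRICES_PER_MILLION = {            # USD per 1M tokens: (input, output)
    "DeepSeek-V4-Pro":   (1.74, 3.48),
    "DeepSeek-V4-Flash": (0.14, 0.28),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request in USD."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: a long-context request that uses most of the 1M-token window.
for model in PRICES_PER_MILLION:
    cost = request_cost(model, input_tokens=800_000, output_tokens=20_000)
    print(f"{model}: ${cost:.2f}")
# DeepSeek-V4-Pro: $1.46
# DeepSeek-V4-Flash: $0.12
```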
DeepSeek said in its research paper that the models achieve significant efficiency gains. At a 1 million token context length, DeepSeek-V4-Pro requires only 27 percent of the single-token FLOPs and 10 percent of the KV cache size of DeepSeek-V3.2, and DeepSeek-V4-Flash requires only 10 percent of the single-token FLOPs and 7 percent of the KV cache size.
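Those percentages can also be read as multiplicative savings over DeepSeek-V3.2. A minimal sketch of that conversion, using only the ratios reported in the paper (no absolute FLOP or memory figures are cited here):

```python
# Convert DeepSeek's reported "percent of V3.2" figures into fold reductions.
# Only the ratios come from the paper; absolute FLOP/memory numbers are not given.

reported = {
    "DeepSeek-V4-Pro":   {"flops_pct": 27, "kv_cache_pct": 10},
    "DeepSeek-V4-Flash": {"flops_pct": 10, "kv_cache_pct": 7},
}

for model, r in reported.items():
    flops_x = 100 / r["flops_pct"]    # e.g. 27% of the FLOPs ~= 3.7x fewer
    kv_x = 100 / r["kv_cache_pct"]    # e.g. 10% of the KV cache = 10x smaller
    print(f"{model}: ~{flops_x:.1f}x fewer per-token FLOPs, "
          f"~{kv_x:.1f}x smaller KV cache vs DeepSeek-V3.2 at 1M context")
```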
DeepSeek's self-reported benchmarks show the Pro model competitive with frontier models from Google, OpenAI and Anthropic. The company noted that DeepSeek-V4-Pro-Max outperforms GPT-5.2 and Gemini-3.0-Pro on standard reasoning benchmarks but falls marginally short of GPT-5.4 and Gemini-3.1-Pro, suggesting a development trajectory that still trails state-of-the-art frontier models.
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from Simon Willison's Weblog and Techmeme, and reviewed by the T&B editorial agent team.