# DeepSeek Releases DSpark: Speculative Decoding Makes V4 Up to 85 Percent Faster

_Tuesday, June 30, 2026 at 6:48 PM EDT_

![DeepSeek Releases DSpark: Speculative Decoding Makes V4 Up to 85 Percent Faster — Primary](https://d.techtimes.com/en/full/467078/deepseek.jpg)

DeepSeek on June 27 released DSpark, an inference optimization framework built on speculative decoding that the company says makes its V4-Flash model generate responses up to 85 percent faster than the prior single-token baseline. According to DeepSeek, the speed gain requires no retraining, no changes to the model's weights, and no new hardware. The framework is now live across V4-Flash and V4-Pro and is available as open-source code under an MIT license.

DeepSeek also released DeepSpec, a full-stack codebase for training and evaluating speculative decoding draft models, on GitHub, likewise under an MIT license. DeepSpec targets the Qwen3 and Gemma model families.

The deployed configuration, called DSpark-5, uses a five-token draft block. In DeepSeek's internal production data, DSpark-5 improved per-user generation speed by 60 to 85 percent on V4-Flash and 57 to 78 percent on V4-Pro compared with the prior MTP-1 baseline.

DeepSeek emphasized that DSpark is not a new model: the Hugging Face cards for DeepSeek-V4-Pro-DSpark and DeepSeek-V4-Flash-DSpark use the same checkpoint with a speculative decoding module attached. No independent third-party verification of the claims had been published as of June 28, 2026.
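For readers unfamiliar with the technique, the following is a minimal sketch of generic greedy speculative decoding with a five-token draft block. It is not DeepSeek's code: the report does not describe DSpark's internals, and every name here (`speculative_decode`, `toy_target`, `toy_draft`, `greedy_decode`) is hypothetical. The toy "models" are plain functions over integer tokens, assumed only for illustration. The idea is that a cheap draft model proposes several tokens, the full model checks them in one verification step, and the longest agreeing prefix is kept, so the output matches what the full model would have produced on its own.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # maps a context to the next token


def greedy_decode(target: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one target-model call per generated token."""
    tokens = list(prompt)
    for _ in range(n):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    max_new_tokens: int,
    block_size: int = 5,  # five-token draft block, as in the reported DSpark-5 setup
) -> Tuple[List[Token], int]:
    """Greedy draft-and-verify loop. Returns (tokens, verification steps)."""
    tokens = list(prompt)
    verify_steps = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes a block of tokens autoregressively.
        proposal: List[Token] = []
        for _ in range(block_size):
            proposal.append(draft(tokens + proposal))

        # 2. The target model checks the block. A real system would score all
        #    positions in one batched forward pass; this sketch simulates that
        #    with sequential calls and counts it as a single step.
        verify_steps += 1
        accepted: List[Token] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            if guess == expected:
                accepted.append(guess)
            else:
                accepted.append(expected)  # keep the target's correction, drop the rest
                break
        else:
            # Every draft token matched, so the same step yields one bonus token.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[: len(prompt) + max_new_tokens], verify_steps


def toy_target(context: List[Token]) -> Token:
    """Hypothetical 'large model': counts upward modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[Token]) -> Token:
    """Hypothetical 'draft model': agrees with the target except after token 7."""
    last = context[-1]
    return 0 if last == 7 else (last + 1) % 10


output, steps = speculative_decode(toy_target, toy_draft, [0], max_new_tokens=12)
baseline = greedy_decode(toy_target, [0], 12)
```

In this toy run `output` equals `baseline`, while the number of target verification steps is smaller than the 12 single-token steps the baseline needs. Real-world gains depend on how often the draft agrees with the target and on the cost of running the draft, which is why reported speedups are given as ranges.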

## Sources

- [TechTimes](https://www.techtimes.com/articles/319236/20260628/deepseek-releases-dspark-speculative-decoding-makes-v4-85-percent-faster.htm)

---
Canonical: https://techandbusiness.org/newswire/YN72UdJpPKjczYk6Q4xWog
Retrieved: 2026-07-01T01:37:09.568Z
Publisher: Tech & Business (techandbusiness.org)
