DeepSeek-V4 Launch: Accelerating Inference and Verified Reinforcement Learning with SGLang and Miles
deepseek inference
Source: Mastodon
DeepSeek-V4 launches with fast inference and verified RL.
DeepSeek-V4 has launched, bringing significant advances in long-context AI. As reported earlier, the highest-earning and most experienced workers are rapidly adopting AI in their jobs, and DeepSeek-V4 is poised to accelerate that trend further. The model uses a hybrid attention architecture that combines Compressed Sparse Attention and Heavily Compressed Attention for long-context efficiency, requiring only about 27% of per-token inference FLOPs and about 10% of KV-cache memory at a 1M-token context.
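To give the ~10% KV-memory figure some scale, here is a minimal back-of-the-envelope sketch. The model dimensions (layer count, KV heads, head size) are hypothetical placeholders chosen for illustration, not DeepSeek-V4's actual configuration; only the ~10% ratio comes from the launch notes.

```python
# Hedged sketch: rough KV-cache sizing at a 1M-token context.
# Dimensions below are assumptions for illustration only.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Per-sequence KV-cache size for a dense-attention baseline:
    two tensors (K and V) per layer, each [num_kv_heads, seq_len, head_dim]."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical config; dtype_bytes=2 assumes FP16/BF16 storage.
baseline = kv_cache_bytes(num_layers=60, num_kv_heads=8, head_dim=128,
                          seq_len=1_000_000)
compressed = 0.10 * baseline  # ~10% of KV memory, per the launch notes

print(f"dense baseline : {baseline / 2**30:.1f} GiB per sequence")
print(f"~10% compressed: {compressed / 2**30:.1f} GiB per sequence")
```

Under these assumed dimensions, a dense KV cache at 1M tokens would run into the hundreds of GiB per sequence, which is why a ~10x reduction matters for serving long contexts at all.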
The launch matters because DeepSeek-V4 ships with a verified RL training pipeline in Miles, offering stability, efficiency, and broad hardware support. The pipeline is the product of a collaboration between SGLang and Miles, enabling fast inference and verified RL from day zero. These capabilities have significant implications for industries relying on AI, such as coding, document analysis, and agentic workflows.
As the AI landscape continues to evolve, it will be worth watching how DeepSeek-V4 is integrated into applications and industries. Alongside the launch, NVIDIA has made the model available for download, allowing developers to build long-context coding, document-analysis, and agentic workflows using familiar API patterns. The coming days will reveal how DeepSeek-V4's capabilities are harnessed and what new innovations emerge from its adoption.
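The "familiar API patterns" referenced above typically mean the OpenAI-compatible chat-completions format, which SGLang serves locally. Below is a minimal sketch of that request shape; the model name `deepseek-v4` and the endpoint URL are assumptions for illustration, not confirmed identifiers.

```python
# Hedged sketch: the OpenAI-style chat-completions request pattern.
# Model name and endpoint are illustrative assumptions.
import json
import urllib.request

def build_chat_request(model, prompt, max_tokens=256):
    """Assemble a standard chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("deepseek-v4",
                             "Summarize this repository's build steps.")
print(json.dumps(payload, indent=2))

# Sending it to a locally running SGLang server would look like this
# (commented out because it requires a live server):
# req = urllib.request.Request(
#     "http://localhost:30000/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```

Because the payload matches the chat-completions convention, existing OpenAI-client code can usually be pointed at a local server by changing only the base URL.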