ds4, a Local Inference Engine for DeepSeek 4 on Metal, CUDA, and ROCm, Released on GitHub

deepseek inference meta

2026-06-22 | Source: Mastodon | Original article

ds4, a local inference engine for DeepSeek 4, has been released on GitHub with support for Metal, CUDA, and ROCm. It runs both the Flash and PRO versions of the model.

ds4 is a new project on GitHub: a custom native inference engine built specifically for DeepSeek 4 Flash, with support for DeepSeek 4 PRO on high-memory machines. It runs on Metal, CUDA, and ROCm. The engine has been benchmarked on several platforms, including a 128 GB MacBook, with promising results, and it is widely regarded as a significant technical achievement, although some users have raised concerns about its performance-to-parameter ratio. The main appeal of ds4 is that it could make local inference for DeepSeek 4 models efficient enough for practical AI applications. It remains to be seen how the project addresses the performance concerns and whether it expands to more models and hardware configurations.
