DeepSeek - China taking us by storm

DeepSeek is a series of open-source large language models (LLMs) developed by a Chinese AI firm of the same name. It uses a Mixture-of-Experts (MoE) architecture, routing each input to only a small subset of its neural network rather than activating every parameter. This lets it run efficiently despite its large size (671 billion parameters). DeepSeek achieves high scores on various benchmarks, including HumanEval (coding) and GSM8K (math word problems), and can handle long-context tasks of up to 128K tokens.
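To make the sparse-activation idea concrete, here is a minimal sketch of top-k MoE routing: a gate scores each expert for the incoming token, only the two best-scoring experts actually run, and their outputs are blended by the gate's softmax weights. This is a toy illustration under simplified assumptions (tiny linear "experts", a random gating matrix), not DeepSeek's actual implementation; all names here are hypothetical.

```python
import math
import random

random.seed(0)

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class Expert:
    """Toy 'expert': scales the input vector and counts how often it runs."""
    def __init__(self, scale):
        self.scale = scale
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return [self.scale * v for v in x]

def moe_forward(x, experts, gate, top_k=2):
    # Gate scores: dot product of the token vector with each expert's gate row.
    scores = [sum(xi * wi for xi, wi in zip(x, row)) for row in gate]
    # Route to the top-k experts only; the rest are never executed.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in top])
    # Blend the selected experts' outputs by their softmax gate weights.
    out = [0.0] * len(x)
    for w, i in zip(weights, top):
        for d, v in enumerate(experts[i](x)):
            out[d] += w * v
    return out

experts = [Expert(scale=s) for s in range(1, 9)]                 # 8 experts
gate = [[random.gauss(0, 1) for _ in range(4)] for _ in experts]  # gating matrix
token = [0.5, -1.2, 0.3, 0.9]

y = moe_forward(token, experts, gate, top_k=2)
active = sum(1 for e in experts if e.calls > 0)
```

After the forward pass, `active` is 2: six of the eight experts never ran, which is the source of the efficiency claim above, scaled up enormously in a real model.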
Its open-source release makes it accessible to a wider audience, including businesses and developers who may not have the resources for proprietary models.
DeepSeek's open-source nature and focus on efficiency are having a significant impact on the AI landscape: it is challenging the status quo and letting more people access and benefit from advanced AI technology.
| Benchmark | DeepSeek Score | Developer Advantage |
| --- | --- | --- |
| HumanEval Pass@1 | 73.78% | Faster, more precise code generation and debugging |
| GSM8K 0-shot | 84.1% | Stronger multi-step reasoning and problem-solving |
| MATH 0-shot | 32.6% | Better handling of more advanced mathematical tasks |
References: https://daily.dev/blog/deepseek-everything-you-need-to-know-about-this-new-llm-in-one-place