Chinese AI startup DeepSeek has unveiled two AI models: DeepSeek-V3 and DeepSeek-R1, an open-source reasoning model. The models won rapid adoption, and DeepSeek's app surpassed ChatGPT as the most downloaded app on the App Store. DeepSeek-R1 held its own against OpenAI's o1 and o3 models despite a reported development cost of roughly $5 million, a fraction of its rivals' spending. A primary reason for this cost efficiency is the use of the NVIDIA H800, a restricted variant of NVIDIA's flagship GPU with lower chip-to-chip bandwidth.
The DeepSeek-V3 model offers increased efficiency at reduced cost, largely thanks to Multi-Head Latent Attention (MLA), which compresses the attention key-value cache to cut memory use. The model is built on a Mixture-of-Experts (MoE) architecture, which assembles a team of specialist sub-networks that share the work of answering a query, instead of routing everything through a single monolithic model.
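The routing idea behind MoE can be sketched in a few lines. Everything below is an illustrative assumption (the dimensions, the linear "experts," and the `moe_forward` name are invented for this sketch), not DeepSeek's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
D, N_EXPERTS, TOP_K = 8, 4, 2

# Each "expert" is a small linear map; the router is another linear map.
experts = [rng.standard_normal((D, D)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS))

def moe_forward(x):
    """Route a token vector x to its top-k experts and mix their outputs."""
    logits = x @ router                       # one routing score per expert
    top = np.argsort(logits)[-TOP_K:]         # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the chosen experts
    # Only the selected experts run, so compute scales with k, not N_EXPERTS.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
out = moe_forward(token)
print(out.shape)  # (8,)
```

The key property is in the last comment: although the model holds many experts' worth of parameters, each token activates only a small subset of them, which is what makes the architecture cheap to run relative to its total size.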
DeepSeek-R1, in turn, is a reasoning model that applies test-time compute on top of the same MoE architecture. It rivals frontier models on mathematical reasoning, coding, and general knowledge, and is reportedly 90-95% cheaper to run than its competitors. The model also distinguishes itself by exposing its chain of thought alongside the final answer to a prompt.
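"Test-time compute" means spending more inference work per question to get better answers. DeepSeek-R1 does this by generating long chains of thought; a simpler form of the same idea is self-consistency sampling, sketched below. The `noisy_solver` function is a hypothetical stand-in for a model that is right most of the time, not anything from DeepSeek:

```python
import random
from collections import Counter

random.seed(0)

def noisy_solver(question):
    # Hypothetical stand-in for a reasoning model: correct ~70% of the time,
    # otherwise returns a scattered wrong answer.
    return 42 if random.random() < 0.7 else random.randint(0, 100)

def self_consistency(question, n_samples=25):
    """Sample many answers and majority-vote: more compute, higher accuracy."""
    answers = [noisy_solver(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 6 * 7?"))  # the majority answer
```

The design point is the trade-off itself: each extra sample costs inference compute, but because wrong answers scatter while the correct one repeats, the majority vote converges on the right answer.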
The rise of these models reinforces the competitiveness of open-source AI against the closed, proprietary models of major technology firms: open-source models give users the freedom to inspect, modify, and build on them as they see fit. The main argument for keeping models closed, by contrast, is protecting users against data-privacy breaches and misuse of the technology. Moreover, while DeepSeek's model weights and code are open, its training-data sources remain largely opaque, raising concerns about potential biases and security risks. Experts are also wary of possible data-collection motives tied to China.
The launch raises more than privacy concerns. OpenAI CEO Sam Altman acknowledged that DeepSeek's R1 is impressive for the price. DeepSeek's models are expected to put downward pressure on AI prices, and the launch demonstrated that the AI supply chain can be made cost-efficient with open-source software. Players that fail to differentiate themselves could therefore face significant funding challenges.
Amid the growing acceptance of open-source models, the momentum is threatened by potential export controls on cutting-edge chips such as NVIDIA's H100 and GB10. In addition, major market players including OpenAI, Meta, and Google command billions of dollars in computing resources and global distribution. OpenAI, moreover, is far from finished: it was first to market with both large language models (GPT-4) and reasoning models (o1). It would not be surprising to see a more cost-effective breakthrough from a company that lacks neither expertise nor resources.
Although DeepSeek's open-source launches have rattled Silicon Valley, experts remain skeptical about the stride. The models emerged as more efficient, cost-effective alternatives to closed-source rivals, yet they are still dogged by privacy concerns over data collection by China. Moreover, because the models were trained on restricted, lower-bandwidth GPUs, scaling them up will demand substantial resources to keep pace with newer hardware. The models could also face trade restrictions globally, shrinking their addressable market. So, although the Nasdaq saw significant movement in the weeks following the launch, it is too early to draw conclusions about the models' long-term viability.