Nvidia has acknowledged the breakthrough of DeepSeek’s R1 model, describing it as “an excellent AI advancement,” even as the Chinese startup’s rise led to a sharp 17% drop in Nvidia’s stock price on Monday.
A spokesperson for Nvidia told CNBC that DeepSeek represents a notable achievement in AI and highlights the potential of “Test Time Scaling.” According to Nvidia, the model demonstrates how innovative techniques can leverage widely available resources while adhering to export control regulations.
“DeepSeek’s work showcases how new AI models can emerge using techniques like Test Time Scaling, which rely on compliant models and compute infrastructure,” the spokesperson said.
DeepSeek’s R1 Model: Disrupting AI Development Costs
DeepSeek, a Hangzhou-based AI startup, released its R1 model last week. This open-source reasoning model has reportedly surpassed the performance levels of leading models from U.S.-based companies, including OpenAI. Impressively, DeepSeek claims its R1 model was trained at a cost of less than $6 million—a fraction of the billions spent by tech giants in Silicon Valley.
The R1 model’s hybrid architecture combines reinforcement learning with chain-of-thought reasoning. It comes in two versions: DeepSeek-R1 and DeepSeek-R1-Zero, the latter trained with reinforcement learning alone, without a supervised fine-tuning stage.
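As a rough illustration of how chain-of-thought output is typically handled in practice, the sketch below separates a model’s intermediate reasoning from its final answer, assuming the reasoning arrives wrapped in a <think>...</think> tag. The tag convention, the helper names, and the sample string are illustrative assumptions for this article, not a verified description of R1’s exact output format.

    # Minimal sketch: split a chain-of-thought trace from the final answer.
    # Assumes (for illustration only) that reasoning is wrapped in <think>...</think>.
    import re

    def split_reasoning(raw_output: str) -> tuple[str, str]:
        """Return (reasoning_trace, final_answer) from a tagged model response."""
        match = re.search(r"<think>(.*?)</think>", raw_output, flags=re.DOTALL)
        reasoning = match.group(1).strip() if match else ""
        answer = re.sub(r"<think>.*?</think>", "", raw_output, flags=re.DOTALL).strip()
        return reasoning, answer

    if __name__ == "__main__":
        demo = "<think>17 + 25: 17 + 20 = 37, then 37 + 5 = 42.</think>The sum is 42."
        trace, answer = split_reasoning(demo)
        print("reasoning:", trace)
        print("answer:", answer)

Keeping the trace separate from the answer is what lets applications show, or hide, the model’s step-by-step reasoning while still returning a clean final response.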
Implications for Nvidia and U.S. Tech Giants
Despite the competition posed by DeepSeek, Nvidia noted that its GPUs are integral to DeepSeek’s operations. The startup reportedly utilized approximately 2,000 Nvidia H800 chips, designed to comply with U.S. export regulations introduced in 2022.
“Inference tasks require significant numbers of Nvidia GPUs and high-performance networking,” the spokesperson added, emphasizing the importance of GPUs in supporting AI advancements like DeepSeek.
This development has raised concerns among analysts over the efficiency of U.S. tech companies’ massive investments in AI infrastructure. Microsoft plans to spend $80 billion on AI infrastructure in 2025, while Meta’s projected capital expenditures for AI are estimated between $60 billion and $65 billion for the same year.
Analysts Question High Costs of AI Development
Experts believe that if training costs for models like R1 remain significantly lower, companies relying on AI services could benefit from cost reductions in the short term. However, the long-term impact on hyperscale AI revenues and investments may be more profound.
Bank of America analyst Justin Post noted that lower model training costs could lead to savings for sectors like advertising and consumer applications. However, this would also reduce the scale of revenues for AI infrastructure providers.
Shifting Focus to Test Time Scaling
Nvidia CEO Jensen Huang, OpenAI CEO Sam Altman, and Microsoft CEO Satya Nadella have all recently highlighted a new phase of AI development driven by “Test Time Scaling.”
This concept builds on the scaling laws proposed by OpenAI researchers in 2020, which held that increasing compute and training data produces better AI models. Test Time Scaling, by contrast, focuses on spending additional compute during inference to improve model accuracy and reasoning.
DeepSeek’s R1 model exemplifies this approach, leveraging additional computational power during inference to achieve better outputs. The model’s efficiency and lower training costs have sparked questions about the future direction of AI investments.
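One simple way to picture Test Time Scaling is self-consistency sampling: draw several candidate answers to the same question and keep the majority vote, so that extra inference compute buys extra accuracy. The toy sketch below simulates this with a stand-in “model” that answers correctly 60% of the time; the probabilities, helper names, and sample counts are illustrative assumptions, not DeepSeek’s actual method.

    # Toy simulation of test-time scaling via majority voting (self-consistency).
    # The "model" is simulated: it returns the right answer with probability 0.6.
    import random
    from collections import Counter

    def sample_answer(correct: str, p_correct: float = 0.6) -> str:
        """One simulated model sample: correct with probability p_correct."""
        return correct if random.random() < p_correct else random.choice(["wrong_a", "wrong_b"])

    def majority_vote(correct: str, n_samples: int) -> str:
        """Spend n_samples of inference compute, then keep the most common answer."""
        votes = Counter(sample_answer(correct) for _ in range(n_samples))
        return votes.most_common(1)[0][0]

    def accuracy(n_samples: int, trials: int = 2000) -> float:
        hits = sum(majority_vote("right", n_samples) == "right" for _ in range(trials))
        return hits / trials

    if __name__ == "__main__":
        random.seed(0)
        for n in (1, 5, 25):
            print(f"samples per question = {n:2d} -> simulated accuracy = {accuracy(n):.2f}")

Running the script shows simulated accuracy climbing as more samples are drawn per question, which is the basic trade-off Test Time Scaling exploits: more compute at inference time rather than more training.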
Amid its growing recognition, DeepSeek faced a large-scale cyberattack, forcing the startup to restrict user registrations to mainland China phone numbers. The company did not disclose the duration of these restrictions or further details about the attack.