DeepSeek rushes to launch new AI model as China goes all in

Next-gen AI model R2 launch moved up from May to 'as early as possible'

Chinese startup triggered $1T market sell-off with January's R1 model

Company operates like research lab with flat hierarchy, unlike Chinese tech giants

Uses cost-efficient AI techniques, prices 20-40x cheaper than OpenAI

DeepSeek is looking to press home its advantage.

The Chinese startup triggered a $1 trillion-plus sell-off in global equities markets last month with a cut-price AI reasoning model that outperformed many Western competitors.

Now, the Hangzhou-based firm is accelerating the launch of the successor to January's R1 model, according to three people familiar with the company.

DeepSeek had planned to release R2 in early May but now wants it out as early as possible, two of them said, without providing specifics.

Rivals are still digesting the implications of R1, which was built with less-powerful Nvidia chips but is competitive with models developed at a cost of hundreds of billions of dollars by U.S. tech giants.

"The launch of DeepSeek's R2 model could be a pivotal moment in the AI industry," said Vijayasimha Alilughatta, chief operating officer of Indian tech services provider Zensar.

R2 is likely to worry the U.S. government, which has identified leadership of AI as a national priority.

Different path

Little is known about DeepSeek, whose founder Liang Wenfeng became a billionaire through his quantitative hedge fund High-Flyer.

Reuters interviewed a dozen former employees, as well as quant fund professionals knowledgeable about the operations of DeepSeek and its parent company High-Flyer.

They told a story of a company that functioned more like a research lab than a for-profit enterprise and was unencumbered by the hierarchical traditions of China's high-pressure tech industry.

At DeepSeek and High-Flyer, Liang has shunned the practices of Chinese tech giants known for rigid top-down management, low pay for young employees and "996" - working from 9 a.m. to 9 p.m. six days a week.

"Liang gave us control and treated us as experts. He constantly asked questions and learned alongside us," said 26-year-old researcher Benjamin Liu, who left the company in September.

While Baidu and other Chinese tech giants were racing to build their consumer-facing versions of ChatGPT in 2023 and profit from the global AI boom, Liang told Chinese media outlet Waves last year that he deliberately avoided spending heavily on app development, focusing instead on refining the AI model's quality.

Computing power

DeepSeek's success with a low-cost AI model is based on High-Flyer's decade-long and substantial investment in research and computing power, three people said.

High-Flyer spent 1.2 billion yuan on two supercomputing AI clusters in 2020 and 2021. The second cluster, Fire-Flyer II, was made up of around 10,000 Nvidia A100 chips, used for training AI models.

As one of the few companies with a large A100 cluster, High-Flyer and DeepSeek were able to attract some of China's best research talent, two former employees said.

The startup used techniques like Mixture-of-Experts (MoE) and multihead latent attention (MLA), which incur far lower computing costs, its research papers show.
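To illustrate why a Mixture-of-Experts design can cut computing costs, here is a minimal sketch of top-k expert routing in Python. It is a generic toy illustration, not DeepSeek's implementation: the function names, the scaling "experts", the router weights and the choice of 8 experts with 2 active per token are all assumptions made for the example. The point it shows is that only the selected experts are evaluated for a given token, so most of the model's parameters sit idle on each step. It does not cover MLA.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route one token vector x to its top_k experts and mix their outputs.

    Hypothetical helper for illustration only.
    """
    # Router: one score per expert (dot product of x with that expert's routing vector).
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in router_weights]
    probs = softmax(scores)
    # Keep only the top_k experts; the others are never evaluated.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        y = experts[i](x)
        gate = probs[i] / norm  # renormalised so the gates of the chosen experts sum to 1
        out = [o + gate * y_j for o, y_j in zip(out, y)]
    return out, chosen


def make_expert(scale, calls, idx):
    """Toy expert (assumption): scales its input and records that it was called."""
    def expert(x):
        calls[idx] += 1
        return [scale * v for v in x]
    return expert


random.seed(0)
dim, n_experts = 4, 8
calls = [0] * n_experts
experts = [make_expert(i + 1, calls, i) for i in range(n_experts)]
router_weights = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_experts)]

x = [0.5, -1.0, 0.25, 2.0]
out, chosen = moe_forward(x, router_weights, experts, top_k=2)
print("experts evaluated:", sum(calls), "of", n_experts)
```

In this toy setup the token touches 2 of 8 experts, so roughly a quarter of the expert computation is performed compared with running every expert. Real systems add load balancing, batching and learned router weights, none of which is modelled here.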

DeepSeek's pricing was 20 to 40 times cheaper than what OpenAI charged for equivalent models, analysts at Bernstein brokerage estimated in early February.

For now, Western and Chinese tech giants have signaled plans to continue heavy AI spending, but DeepSeek's success with R1 and its earlier V3 model has prompted some to alter strategies.

OpenAI cut prices this month, while Google's Gemini has introduced discounted tiers of access.

State embrace

Even before R1 gripped global attention, there were signs that DeepSeek had caught Beijing's favor.

The subsequent fanfare over the cost competitiveness of its models has buoyed Beijing's belief that it can out-innovate the U.S., with Chinese companies and government bodies embracing DeepSeek models at a pace no other firm has enjoyed.

At least 13 Chinese city governments and 10 state-owned energy companies say they have deployed DeepSeek into their systems, while tech giants Lenovo, Baidu and Tencent have integrated DeepSeek's models into their products.

The Chinese embrace comes as governments from South Korea to Italy remove DeepSeek from national app stores, citing privacy concerns.

"Our problem has never been funding," Liang told Waves in July. "It's the embargo on high-end chips."
