OpenAI Accuses Chinese Rival of Stealing AI Model Data

Above: The 'deepseek' logo is displayed on a phone screen on Jan. 30, 2025. Image copyright: Omer Taha Cetin/Anadolu via Getty Images

The Facts

  • OpenAI has accused the Chinese AI firm DeepSeek of "distilling" its data to cheaply replicate OpenAI's technology. Microsoft reportedly detected data being exfiltrated in late 2024 through OpenAI developer accounts linked to the Chinese company.

  • Distillation involves training smaller AI models on the outputs of larger, more advanced ones. Using OpenAI's outputs in this way violates its terms of service, and the San Francisco-based firm, valued at $157B, is now reviewing evidence against DeepSeek.

  • DeepSeek's R1, which reportedly cost only $5.6M to train using 2,048 Nvidia H800 graphics cards, demonstrated capabilities comparable to US AI models. The news caused tech stocks to lose $1T in market value, with Nvidia shares alone falling 17%, a loss of $589B.


The Spin

Narrative A

Chinese companies are systematically attempting to exploit US technology through unauthorized access to leading AI models. The sophisticated nature of DeepSeek's alleged data extraction through API manipulation represents a serious threat to American technological leadership and intellectual property rights. This incident demonstrates the critical need for stronger protections of US AI capabilities against foreign competitors.


Narrative B

The efficiency demonstrated by DeepSeek's model development process represents a legitimate technological advancement and challenges the assumption that building advanced AI requires massive resources. The accusations also appear hypocritical given OpenAI's own history of training models on data without explicit consent, and the company may be attempting to maintain market dominance by discrediting legitimate competition.


