Chinese Lab DeepSeek Unveils Reasoning AI Model That Competes With OpenAI’s O1

DeepSeek, a Chinese AI research company backed by quantitative traders, has introduced a new AI model called DeepSeek-R1, which it claims rivals OpenAI’s reasoning model, o1. Released in preview on Wednesday, the model is designed to perform complex reasoning tasks and is said to perform competitively with OpenAI’s own models on key benchmarks. DeepSeek-R1 belongs to a fast-growing class of reasoning models, which spend more time examining and thinking through a question in order to answer with greater precision and reliability.

Reasoning Models: What Are They All About?

Most AI models process and answer queries relatively quickly, but reasoning models, such as DeepSeek-R1 and OpenAI’s o1, adopt a more thoughtful approach. These models are designed to analyze questions more deeply, spending additional time to fact-check and consider multiple possibilities before offering an answer. This method helps the models avoid common pitfalls that arise from more instantaneous responses.

Both DeepSeek-R1 and OpenAI’s o1 work through a task by planning ahead and carrying out a series of steps toward a solution. As a result, such a model may take several seconds, or even tens of seconds, to process a complex query. DeepSeek claims that R1 performs comparably to OpenAI’s o1-preview on AI benchmarks such as AIME and MATH. AIME is drawn from the American Invitational Mathematics Examination, a competition for high-school students, whereas MATH comprises numerous word problems that test a model’s logic and problem-solving capabilities.

However, despite its advancements, DeepSeek-R1 is not without limitations. On some logic problems, including simple games like tic-tac-toe, the model struggles just as OpenAI’s o1 does. This suggests that even cutting-edge reasoning models have room for improvement, especially in tasks requiring precise logical reasoning.

Security Concerns: Jailbreaking and Censorship

DeepSeek-R1 is not tamper-proof. Like other AI models, it can be “jailbroken,” meaning that it can be prompted in ways that make it ignore the safeguards meant to stop it from producing harmful content. For instance, one user was able to persuade the model to produce a detailed recipe for methamphetamine. Such weaknesses raise doubts about its ability to handle malicious prompts.

DeepSeek-R1 also shows behavior consistent with the censorship requirements of China’s AI regulations. In tests, the model refused to engage with questions about politically sensitive topics, such as Chinese leader Xi Jinping, the Tiananmen Square protests, or the potential consequences of China invading Taiwan. This is probably attributable to the strict governmental scrutiny of AI models in China, where models must conform to the country’s “core socialist values.” There are even reports that the Chinese government has compiled a list of banned sources on which AI systems may not be trained. Many models, including DeepSeek-R1, tend to sidestep controversial topics to avoid regulatory backlash.

The Rise of a New Scaling Law

As AI models become increasingly sophisticated, the focus has shifted from traditional scaling laws, where improvements come from simply adding more data and computational power, to newer methods of improving performance. One such approach is test-time compute, which underlies models like DeepSeek-R1 and OpenAI’s o1. Test-time compute, also known as inference compute, gives a model additional processing time while it completes a task. With that extra computation, a model can reason more coherently and weigh the information at hand, and so produce higher-quality answers.
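To make the idea concrete, here is a minimal Python sketch of one simple form of test-time compute: sampling several candidate answers and keeping the most common one (majority voting). It is a toy illustration only and is not a description of how DeepSeek-R1 or o1 work internally. The function names, the 60% single-sample accuracy, and the answer 42 are all hypothetical assumptions chosen for the demonstration.

```python
import random
from collections import Counter

CORRECT_ANSWER = 42  # hypothetical ground truth for the toy question


def noisy_solver(rng: random.Random) -> int:
    """Hypothetical stand-in for one model sample.

    Assumption: a single sample is right 60% of the time and otherwise
    returns one of four nearby wrong answers.
    """
    if rng.random() < 0.6:
        return CORRECT_ANSWER
    return rng.choice([40, 41, 43, 44])


def answer_with_budget(n_samples: int, rng: random.Random) -> int:
    """Spend more inference compute: draw n_samples answers, return the most common."""
    votes = Counter(noisy_solver(rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy(n_samples: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which the voted answer is correct."""
    rng = random.Random(seed)
    hits = sum(answer_with_budget(n_samples, rng) == CORRECT_ANSWER
               for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for budget in (1, 5, 15):
        print(f"samples per question: {budget:2d}  accuracy: {accuracy(budget):.3f}")
```

Under these assumptions, accuracy rises as the number of samples per question grows, at the cost of proportionally more computation per answer. That trade-off is the essence of the test-time compute approach described above.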

Microsoft CEO Satya Nadella took note of this new scaling law in a recent keynote at the company’s Ignite conference. He said AI development no longer relies only on the well-known scaling laws but is also using techniques like test-time compute to extend models’ abilities further.

DeepSeek: A Novel Operation in the AI World

DeepSeek stands out not only for its technological achievements but also because it is backed by High-Flyer Capital Management, a Chinese hedge fund that uses AI to support its trading decisions. This combination of financial muscle and AI expertise has given DeepSeek an edge in resources. One of its early models, DeepSeek-V2, a general-purpose model that can process both text and images, compelled big companies like ByteDance, Baidu, and Alibaba to slash the prices of their AI models, with some made free.

High-Flyer, which runs its own data centers for training AI models, has built a cluster of 10,000 Nvidia A100 GPUs with an estimated value of around $138 million. The fintech company was founded by Liang Wenfeng, a computer science graduate, who hopes to develop “superintelligent” AI through DeepSeek.

Future Plans and Open Source Vision

Looking ahead, DeepSeek plans to open-source the DeepSeek-R1 model and release an API so that developers and researchers can build DeepSeek’s reasoning capabilities into their own applications. This would make DeepSeek-R1 more accessible and could accelerate its adoption across industries. By opening the model to the public, DeepSeek sees an opportunity to compete with the likes of OpenAI and Google in AI development.