DeepSeek-R1: The Future of Industry is Open-Source AI

In just under a month since its launch, DeepSeek-R1 has surged to the forefront as the most popular model ever on the Hugging Face platform. As reported on February 14 by Clément Delangue, co-founder of Hugging Face, the model and its thousands of derivatives have collectively surpassed 10 million downloads. The dramatic uptick is visible in graphs shared by Delangue: the orange line representing DeepSeek-R1 shoots upward almost vertically, while established open-source models such as Llama, Stable Diffusion, and Mistral climb at a much slower pace.

This explosion in popularity follows an impressive feat in which the DeepSeek AI assistant vaulted to the top of the free download charts in the U.S. App Store. Let's explore how this intelligent assistant has performed over its first month on the market.

On the same date, DeepSeek held onto a commendable third place in the productivity tools category, with ChatGPT regaining the top slot and Google Gemini following in fourth place. User reviews in DeepSeek’s comments section reveal a predominant sentiment of satisfaction and excitement:

“I've canceled my GPT subscription. I love being able to read through its ‘reasoning’ processes... Not to mention I can run the local 14b and 32b models on my MacBook. This is way better than Apple Intelligence... If DeepSeek can do this better and cheaper, Apple should seriously consider reevaluating their AI teams,” one user commented.

Another user added, “Five-star rating! I recently had the opportunity to use DeepSeek, and I must say it has completely transformed how I approach data analysis and decision-making... I’m particularly impressed by the level of customization and flexibility it offers... Thank you, DeepSeek, for creating such a powerful and user-friendly solution!”

Overall, however, DeepSeek maintains a rating of only 4.1, falling short of its main competitors: ChatGPT boasts a 4.9 rating and Google Gemini a 4.8. It’s important to note that ChatGPT and Gemini have undergone multiple iterations of refinement, featuring polished UI/UX designs that contribute to smoother user experiences.

Meanwhile, although DeepSeek excels in AI model compression and lightweight design, users frequently report issues such as response delays, unstable server connectivity, and limited access. Furthermore, as a product hailing from China, its trustworthiness in the U.S. market is comparatively low. In spite of these hurdles, DeepSeek's current acclaim and download figures are notably impressive.

Looking back over the past month, DeepSeek has consistently made headlines across various outlets, emerging as a hot topic within the tech sector and venture capital circles.

By offering a “cost-effective” language model, the Hangzhou-based company has sparked a profound reevaluation of "burn rate" AI business models while shaking the stock market in the process.

January 27 marked what could be seen as a watershed moment, as DeepSeek's AI assistant captured the top spot in the App Store's free download rankings. In response, the Nasdaq index plummeted by over 3%, dipping to 19,204.95 points, while the S&P 500 fell by 1.46%, hitting a low of 5,962.92 points.

However, as the initial panic subsided, the two indices have since exhibited signs of recovery, with the Nasdaq re-emerging above the 20,000 mark last Friday and the S&P 500 rebounding to 6,114.63 points.

Why has DeepSeek caused such a shockwave in the U.S. market?

There are four main reasons cited as contributing factors to this startling impact:

1. **Remarkably low training costs**: The DeepSeek team has reported training costs of roughly $6 million (a figure tied to its V3 base model), while estimates for GPT-4 reach into the hundreds of millions.

2. **Validation of Chinese AI capabilities**: The U.S. has consistently limited the export of AI chips to China (like Nvidia GPUs), yet DeepSeek's breakthroughs suggest that such restrictions have not stifled Chinese advancements in AI.

3. **Open-source accessibility with generous licensing**: DeepSeek-R1 uses the MIT license, presenting an even more open approach than Meta’s Llama, allowing anyone to freely use, modify, and even commercialize the model.

4. **Transparency in reasoning processes**: This is crucial.

Unlike past OpenAI releases, which shrouded reasoning processes in secrecy, DeepSeek has made its reasoning traces public. This openness enables smaller models to undergo rapid knowledge distillation, lowering their training costs and accelerating their progress.
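
To make that point concrete, here is a minimal sketch of what such distillation can look like in practice: a small open student model is fine-tuned on prompt/reasoning/answer triples exported from a larger reasoning model. The student model, file name, dataset fields, and hyperparameters below are illustrative assumptions, not DeepSeek's published recipe.

```python
# Illustrative sketch of "reasoning-trace distillation": fine-tune a small
# student model on (prompt, reasoning, answer) triples produced by a larger
# reasoning model. All names and hyperparameters here are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

student_name = "Qwen/Qwen2.5-1.5B"  # any small causal LM can play the student
tokenizer = AutoTokenizer.from_pretrained(student_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
student = AutoModelForCausalLM.from_pretrained(student_name)

# Hypothetical JSONL file with one {"prompt", "reasoning", "answer"} per line.
traces = load_dataset("json", data_files="r1_traces.jsonl")["train"]

def tokenize(example):
    # The student is trained to reproduce the teacher's visible chain of
    # thought followed by the final answer.
    text = (f"Question: {example['prompt']}\n"
            f"<think>{example['reasoning']}</think>\n"
            f"Answer: {example['answer']}")
    return tokenizer(text, truncation=True, max_length=2048)

traces = traces.map(tokenize, remove_columns=traces.column_names)

trainer = Trainer(
    model=student,
    args=TrainingArguments(output_dir="r1-distilled-student",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=traces,
    # Pads variable-length examples and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because the expensive step, generating high-quality reasoning traces, is done once by the large model, the student's own training run stays comparatively cheap.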

Interestingly, prior to its meteoric rise, DeepSeek had already been on the radar of the English-speaking tech community for some time.

The earliest discussions on platforms like Hacker News date back to September 2024; initial impressions of DeepSeek's performance were generally positive, though users voiced concerns over its terms of service and privacy policies. One reply countered, “It’s an open-source model, it’s cheap and useful, so there’s not much to worry about.”

In October 2024, a post titled “DeepSeek v2.5 – An open-source large language model comparable to GPT-4, yet 95% cheaper” sparked lively discussion, highlighting that many developers were seeking more economical alternatives among large language models. However, most commenters felt that while DeepSeek v2.5 covered basic needs effectively, it lagged behind GPT-4 in completeness and performance.

The months from October to December continued to see significant chatter surrounding DeepSeek.

The turning point came on January 20, when DeepSeek officially launched R1, a model that matches OpenAI's capabilities in mathematics, code generation, and natural language reasoning while requiring far less computational power than mainstream models. Following the launch, the DeepSeek AI assistant shot to the top of the U.S. App Store's free applications chart, wreaking havoc on the stock prices of American tech firms.

On the anonymous workplace platform Blind, a Meta employee described the scale of the disruption DeepSeek has inflicted on Meta's GenAI division:

“Management is worried about justifying the vast costs of the GenAI division. When each leader within GenAI earns more than it would cost to fully train DeepSeek v3, what will they face from higher management? Moreover, we have dozens of such leaders.

The rise of DeepSeek-R1 has made everything more daunting. I can’t disclose confidential information, but it will become public soon. GenAI was meant to be a lean, engineering-focused group, yet due to competition over influence, the organization ended up bloated with unnecessary hires, resulting in losses for everyone.”

A Google employee replied, echoing the prevailing sentiment: “What DeepSeek has done is indeed radical. But it benefits the entire sector; we are witnessing firsthand how open competition spurs innovation.”

While it's uncertain whether Meta's GenAI department really feels the pressures detailed in that post, it is undeniable that AI titans like OpenAI, Google, and Anthropic are beginning to feel the heat from DeepSeek and are accelerating product upgrades to consolidate their dominance.

On January 31, OpenAI announced the release of its new reasoning model, o3-mini, making it available to free users for the first time. This latest addition to OpenAI's reasoning models is 93% cheaper than o1, with input priced at $1.10 per million tokens and output at $4.40 per million tokens.

On February 5, Google made headlines by updating the entire Gemini 2.0 lineup, introducing Gemini 2.0 Flash for general use, the more powerful Gemini 2.0 Pro, and the cost-effective Gemini 2.0 Flash-Lite. According to the official announcement, Gemini 2.0 Flash-Lite matches the speed and cost of 1.5 Flash, with “high cost-effectiveness” as its headline feature; input is priced at $0.075 per million tokens and output at $0.30 per million tokens.

Compared with DeepSeek-R1's input/output prices of $0.14 and $2.19 per million tokens, o3-mini still leans toward the expensive side. Meanwhile, although Gemini 2.0 Flash-Lite offers lower costs, it may underperform in more complex reasoning or computationally demanding scenarios.
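
Using only the per-million-token prices quoted above, a rough cost comparison is easy to script; the workload figures (tokens per request, number of requests) are made-up placeholders, not benchmarks.

```python
# Back-of-the-envelope cost comparison using the prices quoted in this article.
PRICES = {  # (input $/1M tokens, output $/1M tokens)
    "DeepSeek-R1":           (0.14, 2.19),
    "OpenAI o3-mini":        (1.10, 4.40),
    "Gemini 2.0 Flash-Lite": (0.075, 0.30),
}

def monthly_cost(in_price, out_price, in_tokens=2_000, out_tokens=1_000,
                 requests=100_000):
    """Dollar cost for `requests` calls averaging the given token counts."""
    total_in = in_tokens * requests / 1_000_000    # millions of input tokens
    total_out = out_tokens * requests / 1_000_000  # millions of output tokens
    return total_in * in_price + total_out * out_price

for model, (p_in, p_out) in PRICES.items():
    print(f"{model:<22} ${monthly_cost(p_in, p_out):>10,.2f}")
```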

Given such a “budget-friendly” model, it’s no wonder that developers are eager to take advantage of it.

Perplexity was the first to integrate DeepSeek's offering, and Microsoft CEO Satya Nadella announced during a January 29 earnings call that DeepSeek-R1 would be accessible through Azure AI Foundry and GitHub.

Subsequently, cloud service and chip giants like AWS, Nvidia, AMD, and Intel scrambled to align themselves with DeepSeek, aiming to leverage its cost-effective and efficient reasoning capabilities to bolster their respective AI ecosystems and better serve developer needs.
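
Many hosted deployments expose R1 behind an OpenAI-compatible chat endpoint; here is a minimal sketch of such a call, assuming that kind of endpoint. The base URL, model identifier, and API key below are placeholders, not any specific provider's values.

```python
# Minimal sketch of calling a hosted DeepSeek-R1 deployment through an
# OpenAI-compatible chat API. Substitute the values your provider documents.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-provider.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek-r1",  # model identifier varies by provider
    messages=[
        {"role": "user", "content": "Explain why the sky is blue in one paragraph."},
    ],
)

print(response.choices[0].message.content)
```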

In an increasingly complex geopolitical landscape, AI development diverges along lines drawn by policy and regulation. DeepSeek's open-source release and liberal licensing allow more researchers to cross national and institutional boundaries for in-depth exploration and validation. This “community-driven” approach not only accelerates technological advancement but also significantly reduces the distrust typically associated with geopolitical rivalry, establishing a relatively open public platform for AI innovation, which is part of why its arrival has landed with such a shock.

Is this the “Sputnik moment” of AI? Or a gift?

In 1957, the Soviet Union launched Sputnik, the world's first artificial satellite, leaving Americans acutely aware that their technological lead had been shaken and sparking an unprecedented space race. Today, mainstream U.S. media likens the launch of DeepSeek-R1 by a Chinese team to an equivalent "Sputnik moment" for AI: it touches a nerve in the tech community, igniting a sense of crisis and urgency akin to that of the space race. Notable figures, including President Donald Trump, have declared it a wake-up call for the AI industry and urged heightened focus on the competition.

According to Martin Casado, a partner at the U.S. venture capital firm Andreessen Horowitz (a16z), this AI race mirrors the historical space race, and America must win it. He noted in a recent podcast that the rapid attention DeepSeek received can be attributed to its highly open licensing and its exposed reasoning processes, which let smaller models undergo faster knowledge distillation and thus cut training costs.

By contrast, OpenAI, while branding itself as “open,” withheld reasoning details during its o1 model release in order to solidify its dominant market position.

Casado, who comes from an engineering background and focuses on enterprise software, cybersecurity, cloud computing, and AI investments at a16z, expressed concern about failures in U.S. AI policy. He emphasized that stringent restrictions on chip and software exports aimed at stifling Chinese AI progress have failed to achieve their intended effect, a fact evidenced by DeepSeek's emergence.

“We need to view this issue from a broader perspective: China boasts world-class AI research teams. DeepSeek has previously released several state-of-the-art models, such as V3, which may actually carry more technical merit than R1. Like GPT-4, these models leverage chain-of-thought reasoning, something DeepSeek had in fact been exploring for a long time,” said Casado.

Just as the launch of Sputnik prompted the U.S. to reconsider its technological and educational frameworks while expediting investment in space exploration, the rise of DeepSeek compels a reckoning within the U.S. tech sphere. The reality is that while American giants like OpenAI, Google, and Anthropic prioritize proprietary models in the name of free-market values, Chinese teams have achieved groundbreaking advances through open-source avenues, significantly lowering the barriers and costs of cutting-edge AI and cultivating a burgeoning AI ecosystem.

For large corporations, maintaining proprietary models helps control intellectual property and solidify market influence. Yet in a rapidly evolving AI domain, this insularity increasingly falls short of public demand for transparency and openness, potentially stymieing innovation and collaboration.

From a governmental standpoint, White House measures restricting GPU availability and software exports have neither curbed Chinese progress nor preserved American superiority.

Alex Rampell, another a16z partner, bluntly observed that the Biden administration feared that if the U.S. embraced open source in AI, China could simply replicate those efforts.

Instead, DeepSeek has taken the opposite route—now it’s China providing open-source AI, and American companies are eager to utilize or replicate it due to its impressive performance.

Open source was once heralded as America's crowning achievement in high tech: foundational innovations in internet protocols, operating systems, and databases emerged from environments of extensive openness that solidified America's grasp on the information revolution. Nevertheless, recent years have seen a heightened focus on intellectual property and commercial gain, compounded by national security considerations, with some tech giants opting for a more closed development model that limits collaboration and innovation.

Rampell resists labeling DeepSeek as another “Sputnik moment,” preferring to call it “a gift to the American people”: it obliges a “proud” U.S. to acknowledge the reality of global AI competition and to hasten investment in technology, talent, and funding.

Amidst this backdrop, many tech insiders advocate profound adjustments to U.S. AI policy. Relying on containment and control to maintain an advantage will only squander opportunities for substantial advances across the industry. As Meta's Chief AI Scientist Yann LeCun noted on LinkedIn, “To those who interpret DeepSeek’s performance as a signal of China surpassing the U.S. in AI: you've misunderstood. The correct reading is that open-source models are surpassing proprietary ones.”

As AI competition transitions from a mere quest for larger scales, parameters, and processing power to an emphasis on tailored applications and ecosystem integration, the players who can efficiently deploy extensive models across various industry scenarios and forge strong collaborative networks will emerge victorious.

Nvidia’s CEO Jensen Huang has previously emphasized that the size of a model does not equate to market value; what truly enables technology to flourish is the seamless alignment with real-world demands.
