
The release of DeepSeek’s AI models, V3 and R1, has caused shockwaves across the technology and investment world. The initial market reaction has hit semiconductor stocks, a few of the Magnificent Seven stocks, and (somewhat surprisingly) the energy stocks that were poised to benefit from the power demands of AI.
A few things are worth mentioning from an investment perspective. But to do so, it is important to understand the background and genesis of DeepSeek.
Founded as an offshoot of a Quant Hedge Fund, DeepSeek harnessed patriotic PhD talent and resource-efficient strategies to develop cutting-edge AI despite U.S. export controls.
Initially set up as the research arm of a Quant Hedge Fund called High-Flyer, DeepSeek was born out of founder Liang Wenfeng’s scientific curiosity. According to Liang, when he put together DeepSeek’s research team, he was not looking for experienced engineers to build a consumer-facing product. Instead, he focused on PhD students from China’s top universities who were eager to prove themselves.
The hiring strategy helped create a collaborative company culture where people were free to use ample computing resources to pursue unorthodox research projects. The fact that these young researchers were almost entirely educated in China added to their drive. “This younger generation embodies a sense of patriotism, particularly as they navigate US restrictions and choke points in critical hardware and software technologies,” according to one researcher.
Because of US government export controls, DeepSeek had to come up with more efficient methods to train its models. “They optimized their model architecture using a battery of engineering tricks—custom communication schemes between chips, reducing the size of fields to save memory, and innovative use of the mix-of-models approach,” says Wendy Chang, a policy analyst at the Mercator Institute for China Studies. “Many of these approaches aren’t new ideas, but combining them successfully to produce a cutting-edge model is a remarkable feat.”
This is very different from the approach thus far in the U.S., where cheap memory and compute power (funded through money being thrown at the problem, we might add) made it feasible to use brute-force methods for fitting and forecasting problems. Thus what we have are highly inefficient models using huge computational capacity to do amazing things.
To be fair, the first iteration of anything is fantastically expensive. Then the cost cutting gets to work. And in a way, DeepSeek and the Chinese researchers have taken us back to the previous generations of computer programming, where one had to be really careful about resources, using efficient algorithms, and as little memory and compute power as possible.
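To make the memory point above concrete, here is a back-of-the-envelope sketch (ours, not DeepSeek’s) of why “reducing the size of fields” matters: storing a model’s weights in 8-bit rather than 32-bit numbers cuts memory roughly fourfold. The parameter count used below is hypothetical, purely for illustration.

```python
# Back-of-the-envelope: memory needed to store model weights at
# different numeric precisions. Illustrative only; the parameter
# count is hypothetical, not DeepSeek's actual figure.

def weight_memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Gigabytes required to hold num_params weights."""
    return num_params * bytes_per_param / 1e9

params = 100_000_000_000  # a hypothetical 100-billion-parameter model

fp32 = weight_memory_gb(params, 4)  # 32-bit floats: 4 bytes each
fp8 = weight_memory_gb(params, 1)   # 8-bit floats: 1 byte each

print(f"FP32: {fp32:.0f} GB")            # 400 GB
print(f"FP8:  {fp8:.0f} GB")             # 100 GB
print(f"Savings: {fp32 / fp8:.0f}x")     # 4x
```

The same arithmetic applies to the memory moved between chips during training, which is why lower precision and custom communication schemes compound into large efficiency gains.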
DeepSeek’s new V3 and R1 models rival OpenAI and Anthropic in performance, earning praise for real-world usability and achieving 45x training efficiency through groundbreaking optimizations.
For an investor who wants to better understand what is under the hood, without getting into the technical details, a few things are worth knowing.
DeepSeek has released two new models with world-competitive performance on par with the best models from OpenAI and Anthropic (blowing past the Meta Llama 3 models and other smaller open-source players such as Mistral). These models are called DeepSeek-V3 (essentially their answer to GPT-4o and Claude 3.5 Sonnet) and DeepSeek-R1 (their answer to OpenAI's o1 model). For reference, GPT-4o is the typical ChatGPT web query model, while the o1/R1 models are the newer reasoning models (also called chain-of-thought models) – something for which OpenAI was recently charging $200 per month.
Furthermore, two other important facets are worth sharing (which, for lack of our own technical know-how, we take from the blogosphere)1:
DeepSeek's innovation under constraints signals a shift in AI development toward efficient algorithms and clever engineering, challenging resource-heavy models.
DeepSeek's breakthrough demonstrates how constraints can drive innovation in unexpected ways. Their success, achieved despite limited access to advanced chips, has shown that efficient model architecture and clever engineering can compete with approaches that rely primarily on massive computing power. This development is likely to democratize AI development, accelerate innovation through open-sourcing, and shift focus toward more efficient algorithmic approaches. While this may disrupt current market leaders and cause short-term market volatility, it ultimately points toward a more sustainable and accessible future for AI development. The industry is entering a new phase where success will depend not just on computational resources, but on innovative approaches to model design and training efficiency.
The key takeaway is that the AI race is far from over; it's merely entering a new, more nuanced phase where multiple paths to advancement exist, and where clever engineering may prove as valuable as raw computing power.
1. The Short Case for Nvidia Stock | YouTube Transcript Optimizer