28 January 2025 -- So we received an early Chinese New Year surprise from DeepSeek. The Chinese AI firm launched its reasoning model last week - and analysts belatedly woke up to it.
The firm’s consumer app jumped to #1 in the Apple App Store and U.S. stock markets, overly indexed on big tech, took a pounding. About $1.2 trillion was wiped off the U.S. markets yesterday, led by Nvidia getting a hammering. It had a $600 billion one-day loss in its market capitalization - the largest ever drop for one stock in the history of the U.S. stock markets.
What happened? The Chinese artificial-intelligence upstart has trained high-performing AI models cheaply - very, very, very, very cheaply - without the most advanced gear provided by Nvidia and others.
And it is better than or equal to OpenAI on almost every task. And remember: DeepSeek is open source.
That has pulled the rug from under global companies riding the AI wave, including chip makers, infrastructure suppliers and power stocks, as investors question the outlook for AI spending.
My Chief Technology Officer, Eric De Grasse, went into the fine detail in a post yesterday.
The bottom line? DeepSeek showed you could do the training for pennies, not the astronomical sums Sam Altman and his Robber Barons have been demanding.
The Chinese changed the math. Google, OpenAI, Meta, and Nvidia have all bet on capital spending, and huge amounts of it, being the path forward. Cash would buy chips. Lots of chips. This was going to provide the moat, the source of advantage. U.S. model makers have been locked into a single paradigm of building ever-larger, more compute-hungry models.
And, after all, the capital markets were willing to fund this outsize spending on GPUs, so why not go for it?
With China’s venture capital market becoming moribund, local players could not access enough capital. Even for those that could, such as the Qwen team from Alibaba or the Doubao team from ByteDance, export restrictions hampered access to compute power. So they went internal.
Steven Sinofsky (who has been up to his knees and elbows in the whole world of tech, but especially AI, for ages) put it aptly when he observed that the history of computing is one of innovation followed by a scale-up, eventually disrupted by a “scale-out” approach - when bigger and faster methods are replaced by smaller, more numerous alternatives. As he noted last night on his blog:
China faced an AI situation not unlike Cisco did in its early years. Many point to the Nvidia embargo as the cause, but the details don’t really matter.
The point is they had different constraints: more engineers than data centers to train in. Inevitably, they would develop a different kind of solution. One thing for certain is that all firms will look at model development practices with an emphasis on driving efficiencies.
As I wrote last month about OpenAI’s o3, early versions are often expensive, but we can assume that the performance we get at $3,500 will cost us substantially less, perhaps a dollar or two, within no more than a couple of years. The cost of GPT-4-quality results has declined by more than 99% in the last two years. GPT-4 launched in March 2023 at $36 per million tokens.
Except today, China’s DeepSeek offers similar performance for $0.14, or 250 times cheaper.
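As a quick sanity check on those two price points, the arithmetic can be sketched in a few lines of Python. The figures are the ones quoted above; treating them as directly comparable per-million-token prices is an assumption, and the variable names are purely illustrative.

```python
# Prices per million tokens, as quoted in the text.
# Assumption: the two figures are directly comparable.
gpt4_launch_price = 36.00   # GPT-4 at launch, March 2023
deepseek_price = 0.14       # DeepSeek, January 2025

# How many times cheaper, and the corresponding percentage decline.
ratio = gpt4_launch_price / deepseek_price
decline_pct = (1 - deepseek_price / gpt4_launch_price) * 100

print(f"{ratio:.0f}x cheaper, a {decline_pct:.1f}% decline")
```

That works out to roughly 257 times cheaper, in line with the "250 times" figure and the "more than 99%" decline.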
Former Clubhouse influencer Marc Andreessen probably had it right when he posted on Twitter "AI’s Sputnik moment has arrived".
So panic all around. AI and chip manufacturer stocks went into free fall this morning as the market reacted to DeepSeek.
Oh, and yesterday DeepSeek was subject to a massive cyber attack. The tech mavens and financiers with so much $$$$$$$$$$$$$$ invested in OpenAI and its kin will do anything for revenge 😈
But if you’re looking for a real breakdown of what DeepSeek can’t do that ChatGPT can, it’s a lot of quality-of-life stuff. It can’t generate images, can’t talk to you, doesn’t support third-party plugins, and doesn’t have “vision” like ChatGPT does.
All that said, on Monday, DeepSeek released an open-source image generator called Janus-Pro-7B that is, once again, as good as, if not better than, OpenAI’s DALL-E 3.
But the key thing is this. Limitations aside, the fact that DeepSeek is essentially free (its API costs cents to use), is open source, and was reportedly created by a team for only around $5 million has raised several existential questions for America’s tech giants.
Or as noted AI evangelist and OpenAI (former?) superfan Ed Zitron wrote on Bluesky yesterday:
The AI bubble was inflated based on the idea that we need bigger models that both are trained and run on bigger and even larger GPUs. A company came along that has undermined the narrative - in ways both substantive and questionable.
But to put all of this in larger context, Andreessen’s Sputnik comparison isn’t totally inaccurate. Especially if you, like him, believe that artificial general intelligence is both possible and a genuine nuclear-level threat to our existence or a space-race-esque quest to change the future of humanity. But I’d actually compare DeepSeek to something much more recent: TikTok.
And some of the commentary was hilarious: