Deepseek R1 could mean reduced VC investment into large LLM training runs. They claim to have done it with ~$6M. If there’s a big risk of someone else coming out with a comparable model at 1/10th the cost, then there’s no moat in OpenAI in the long run. I don’t know how much the VCs / investors buy ASI as an end goal, or even what the pitch would be. They’re probably looking at more prosaic things like moats and growth rates, and this may mean reduced appetite for further investment rather than more.
What can be done for $6 million can be done even better with 6 million GPUs[1]. What can be done with 6 million GPUs can’t be done for $6 million. Giant training systems are the moat.
H/t Gwern.
Yeah, in one sense that makes sense. But also, NVDA is down ~16% today.
And is that correct? Do you expect it to last? My 2021 NVDA purchases are still feeling pretty wise right now. :P
Not sure if it’s correct; I didn’t actually short NVDA, so all I can do is collect my Bayes points. I did expect most investors to stop at first-level thinking, since that was my immediate reaction on reading about DeepSeek’s training cost: if models can be duplicated for cheaper a few weeks / months after they’re out, then you don’t have a moat. (This holds for most regular technologies. I’m not saying AI isn’t different, just that most investors think of this like any other tech innovation.)
I am so out of touch with the mindset of typical investors that I was taken completely by surprise to see NVDA drop. Thanks for the insight.
No.
This whole SaaSpocalypse scenario outlined here https://www.lesswrong.com/posts/bKrpLhqcoN6WycrFp/citrini-s-scenario-is-a-great-but-deeply-flawed-thought has made me think that one obvious loser in all this is Amazon / AWS.
It’s been said that the real moneymaker for Amazon is AWS, not its retail business.
In fact, the lock-in is so strong that there’s a cottage industry of people with AWS certifications and firms whose sole job is “AWS Cost Optimization”.
But what seems not yet priced in is the ease with which anyone with a datacenter will now be able to build an AWS-compatible API.
At the end of the day, Amazon is a bunch of servers in a datacenter. All the so-called “services” are just syntactic sugar for people who don’t want to manage their own servers, and that’s where the moat lies.
It’s hard for a startup that’s built on top of these services to migrate out to a bare-bones rack in another datacenter, but if that datacenter can give them a compatible API, then moving becomes (for the most part) the click of a button.
But if you look at how OpenAI’s competitors worked, almost everyone offers an “OpenAI-compatible” API: all I do is change the URL to the new model provider and I’m good to go.
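As a concrete sketch of what that switch looks like with the OpenAI Python SDK (the provider URL, key, and model name below are hypothetical placeholders, not real endpoints):

```python
from openai import OpenAI

# Swapping providers is just a different base_url and API key; the
# values here are illustrative placeholders, not real endpoints.
client = OpenAI(
    base_url="https://api.new-provider.example/v1",  # change this line to switch providers
    api_key="NEW_PROVIDER_KEY",
)

response = client.chat.completions.create(
    model="some-model-name",  # whatever the provider calls its model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```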
This seems like it would truly kill the AWS lock-in, and it doesn’t seem to be priced into their stock at all. Maybe people don’t think of AWS as a SaaS company? I would never short a stock myself, but the second-order effects of all this seem obviously unpriced.
The runtime/data-plane APIs are not the moat. There already exist compatible APIs for at least some AWS services (S3, DynamoDB), and many others use standard/open APIs (SQL) or very simple APIs (SNS, SQS, Firehose, even Lambda and ECS).
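For instance, the standard boto3 S3 client can already be pointed at any S3-compatible store just by overriding the endpoint; the URL and credentials below are hypothetical:

```python
import boto3

# The same S3 client code works against any S3-compatible provider;
# only the endpoint and credentials change. All values here are
# hypothetical placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.some-competitor.example",
    aws_access_key_id="PROVIDER_KEY_ID",
    aws_secret_access_key="PROVIDER_SECRET",
)

s3.upload_file("report.csv", "my-bucket", "report.csv")
```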
It’s the very deep auth/RBAC mechanisms, the automation of the control plane/setup, and the integration of the services with each other that are the operational barrier to competition. And the history of durability and availability, and the clear guidance on design considerations for reliability, are the trust barriers to competition. Oh, and there are economies of scale even for datacenters: learning to design, build, and operate them has a pretty steep curve.
The easy part is getting easier. The hard part isn’t (well, it is, because AWS provides an example and because LLMs make everything faster; but they make AWS better too, and AWS has the people / institutional knowledge to get excellent use out of LLMs on these topics).
I’m not saying AWS is immune to competition on core services, only that it won’t be a swarm of startups; it’ll be a gradual change of equilibrium among other large providers. That said, for newer services there’s a lot of room for competition from startups built on AWS, which do the new functions better than AWS does because they make different tradeoffs, like not being fully compatible with AWS auth/setup/billing/management interfaces, which are by necessity rather complex. Even there, the risk is interesting and probably different from recent history. Previously, small competitors to AWS in areas AWS wanted to get good at just got acquired and became part of AWS. Now it may be more feasible for AWS to rapidly compete with them and implement AWS-style services that make the startup far less attractive to customers.
[ disclaimer: I have worked for companies related to this topic, and this opinion is not based on anything but my speculation and outside knowledge ]
I think there are 3 ways to think about AI, and a lot of confusion seems to happen because the different paradigms are talking past each other. The 3 paradigms I see on the internet & when talking to people:
Paradigm A) AI is a new technology like the internet / smartphone / electricity. This seems to be mostly held by VCs / entrepreneurs / devs who think this will unlock a whole new set of apps, i.e., AI : new apps as smartphone : Uber or internet : Amazon.
Paradigm B) AI is a step change in how humanity works, similar to the agricultural revolution, which changed how large societies could get and lifted GDP growth, or the industrial revolution, which was a step change in GDP growth from ~0% to 2-4% a year and made possible things like electricity, the internet, and smartphones.
Paradigm C) AI is like the rise of humanity on this earth (the first general intelligences). The world changed completely with the rise of general intelligence, and ASI/AGI will be a similar paradigm shift. We’ve been locked at humanity’s level of intelligence for the past ~200k years, and getting ASI will be like unlocking multiple new revolutions all at the same time.
Most of the LW crowd is probably at (C) or between (B) and (C).
When talking to the general population, I’ve found it very helpful to probe where they are before talking about things like AI safety or how the world will change.
If RL becomes the next thing in improving LLM capabilities, one thing I would bet on becoming big in 2025 is computer use. It seems hard to get more intelligence with just RL (who verifies the outputs?), but with something like computer use it’s easy to verify whether a task has been done (has the email been sent, has the ticket been booked, etc.), so it’s starting to look to me like it could do self-learning.
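A toy sketch of why that matters for RL (the task format and state checks here are entirely hypothetical illustration): the reward is a programmatic check of world state rather than a human judgment, so episodes can be scored automatically at scale.

```python
from dataclasses import dataclass, field

# Hypothetical toy environment: the verifier inspects the end state of
# the world ("was the email actually sent?") instead of grading text.

@dataclass
class EnvState:
    sent_emails: set[str] = field(default_factory=set)
    bookings: set[str] = field(default_factory=set)

@dataclass
class Task:
    kind: str    # "send_email" or "book_ticket"
    target: str  # recipient address or booking reference

def verify(task: Task, state: EnvState) -> float:
    """Binary reward: 1.0 if the task's end state is observable, else 0.0."""
    if task.kind == "send_email":
        return 1.0 if task.target in state.sent_emails else 0.0
    if task.kind == "book_ticket":
        return 1.0 if task.target in state.bookings else 0.0
    return 0.0

# After an agent rollout, the trainer scores the episode mechanically:
state = EnvState(sent_emails={"customer@example.com"})
print(verify(Task("send_email", "customer@example.com"), state))  # 1.0
print(verify(Task("book_ticket", "ABC123"), state))               # 0.0
```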
One thing that’s kept AI from being fully integrated into the rest of the economy is simply that the current interfaces were built for humans, and porting them all over takes engineering time / effort.
I’m fairly sure the economic disruption would be pretty quick once this happens. For example, if I can just run 10 LLM agents as customer service agents using my *existing tools* (opening emails and WhatsApp, messaging customers, checking internal dashboards, etc.), then it’s game over. What’s stopping people right now is that there aren’t enough people to build those pipelines fast enough to utilize even the current capabilities.
Just finished reading “If Anyone Builds It, Everyone Dies”. I have a question that seems like an obvious one, but I didn’t see it addressed in the book; maybe someone can help:
The main argument in the book is the analogy to humans. Evolution “wanted” us to maximize genetic fitness, but it didn’t get what it trained for. Instead, it created humans who love ice cream and condoms even though they reduce our genetic fitness.
With AGI, we’re on track to do something similar: even if we do RLHF or any other such simple training or shaping, we won’t get an AI aligned to human interests; it’ll end up wanting something weird and inhuman rather than maximizing human values.
But to my mind, this seems to miss a fairly important point: human brains don’t come pre-wired with much knowledge. We have to learn it from scratch. We don’t come out of the womb with the concept of “inclusive genetic fitness”. It took culture and ~200,000 years to figure that out, and even now each of us only learns it after about 15-20 years of existing. So there’s no way evolution could have pointed our utility function at “inclusive genetic fitness”, because that concept didn’t exist in our brains for it to point at.
Modern AIs don’t seem like that. They come with the sum of human knowledge baked in during pre-training. As they get smarter, the concept of “human values” or “friendly AI” is definitely something in their existing minds. So it should be much easier for us to do alignment, and to test whether we can point the AI at that specific concept, than what evolution had to work with.
Knowing about “inclusive genetic fitness” does not stop you from wanting ice cream.
For superhuman AIs, knowing about human values won’t necessarily make them care.
Yes, I agree with that. I’m not claiming that knowing about it stops you from wanting ice cream.
I’m claiming that if the concept had been hardwired into our brains, evolution would have had an easy time optimizing us directly to want “inclusive genetic fitness” rather than ice cream.
I.e., we wouldn’t want ice cream at all; we would reason from first principles about what we should eat based on fitness.