Just a note here that I’m appreciating our conversation :) We clearly have very different views right now on what is strategically needed, but I’m digging your considered and considerate responses.
Thank you! Same here :)
How do you account for the problem here that Nvidia’s and downstream suppliers’ investment in GPU hardware innovation and production capacity also went up as a result of the post-ChatGPT race (to the bottom) between tech companies to develop and release their own LLMs?
I frankly don’t know how to model this even somewhat soundly. It’s damn complex.
I think it’s definitely true that AI-specific compute is further along than it would be if the LLM boom hadn’t happened. I think the underlying trade-off still holds though: earlier LLM development means faster timelines but slower takeoff.
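To make that trade-off concrete, here is a minimal toy sketch of the compute-overhang intuition, assuming (purely for illustration) that the compute frontier grows at a steady rate regardless of when the race starts, while deployed training compute and algorithmic progress only ramp up once it does. None of the numbers or functional forms come from this conversation.

```python
# Toy overhang model: an earlier race start crosses a fixed capability
# threshold sooner (faster timelines) but at a lower growth rate (slower
# takeoff). All parameters are illustrative assumptions.
HW_GROWTH = 0.3    # assumed growth of the compute frontier (log-units/year)
ALGO_GROWTH = 0.2  # assumed algorithmic progress per year once the race is on
RAMP_RATE = 1.5    # assumed speed at which deployed compute absorbs the overhang
THRESHOLD = 6.5    # arbitrary "AGI-relevant" capability level
BASE_YEAR = 2012

def capability(start, year):
    frontier = HW_GROWTH * (year - BASE_YEAR)              # improves whether or not anyone races
    if year <= start:
        return 0.0
    deployed = min(frontier, RAMP_RATE * (year - start))   # overhang gets absorbed after the start
    return deployed + ALGO_GROWTH * (year - start)

def crossing(start, step=0.01):
    """Year the threshold is crossed, and capability growth per year at that point."""
    year = float(start)
    while capability(start, year) < THRESHOLD:
        year += step
    growth = (capability(start, year + step) - capability(start, year)) / step
    return year, growth

for start in (2020, 2028):  # earlier vs later start of the LLM boom
    year, growth = crossing(start)
    print(f"race starts {start}: threshold crossed ~{year:.1f}, growth there ~{growth:.1f}/yr")
```

With these made-up parameters the earlier start crosses around 2028 at roughly 0.5 units/yr, while the later start crosses around 2032 at roughly 1.7 units/yr. The specific years mean nothing; the pattern is the point.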
Personally, I think slower takeoff is more important than slower timelines, because that means we get more time to work with and understand these proto-AGI systems. On the other hand, to people who see alignment as more of a theoretical problem, unrelated to any specific AI system, slower timelines are good because they give theory people more time to work, and takeoff speeds are relatively unimportant.
But I do think the latter view is very misguided. I can imagine a setup for training an LLM in a way that makes it both generally intelligent and aligned; I can’t imagine a recipe for alignment that works outside of any particular AI paradigm, or that invents its own paradigm while simultaneously aligning it. I think the reason a lot of theory-pilled people, such as those at MIRI, become doomers is that they try to make that general recipe and predictably fail.
This is not a particularly unusual view, in terms of the possible lines of reasoning and/or the epistemically diverse worldviews of the people who end up arriving at this conclusion. I’d be happy to discuss the reasoning I’m working from, in whatever time you have.
I think I’d like to have a discussion about whether practical alignment can work at some point, but I think it’s a bit outside the scope of the current convo. (I’m referring to the two groups here as ‘practical’ and ‘theoretical’ as a rough way to divide things up).
Above and beyond the argument over whether practical or theoretical alignment can work, I think there should be some norm where both sides give the other some credit. In practice I doubt we’ll convince each other, but we should still be able to co-operate to some degree.
E.g. for myself I think theoretical approaches that are unrelated to the current AI paradigm are totally doomed, but I support theoretical approaches getting funding because who knows, maybe they’re right and I’m wrong.
And on the other side, given that having people at frontier AI labs who care about AI risk is absolutely vital for practical alignment, I take anti-frontier-lab rhetoric as breaking a truce between the two groups in a way that makes AI risk worse. Even if this approach seems doomed to you, I think that if you put some probability on being wrong about it being doomed, the cost-benefit analysis should still come up robustly positive for AI-risk-aware people working at frontier labs (including on capabilities).
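Here is a back-of-the-envelope version of that cost-benefit claim. All three numbers are illustrative assumptions rather than anything either of us has estimated; the only point is that a modest chance of being wrong about ‘doomed’ can outweigh a small timelines cost.

```python
# Toy expected-value calculation; every number is made up for illustration.
p_practical_works = 0.2          # assumed chance the "practical alignment is doomed" judgment is wrong
risk_cut_if_works = 0.05         # assumed x-risk reduction from risk-aware staff at labs, if it works
risk_added_by_timelines = 0.002  # assumed x-risk increase from the marginal worker speeding things up

expected_benefit = p_practical_works * risk_cut_if_works
net_effect = expected_benefit - risk_added_by_timelines
print(f"expected benefit {expected_benefit:.3f}, net effect {net_effect:+.3f}")
# -> expected benefit 0.010, net effect +0.008 with these inputs; the sign
#    hinges on the ratio of the two effects, not on certainty either way.
```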
This is a bit outside the scope of your essay, since you focused on leaders at Anthropic, who it’s definitely fair to say have advanced timelines by some significant amount. But for the marginal worker at a frontier lab who might be discouraged from joining due to X-risk concerns, I think the impact on timelines is very small and the possible impact on AI risk is much larger by comparison.
Above and beyond the argument over whether practical or theoretical alignment can work, I think there should be some norm where both sides give the other some credit …
E.g. for myself I think theoretical approaches that are unrelated to the current AI paradigm are totally doomed, but I support theoretical approaches getting funding because who knows, maybe they’re right and I’m wrong.
I understand this is a common area of debate.
Based on the reasoning I’ve gone through, neither approach works.