Note that the goal of “work on long-term research bets now so that a workforce of AI agents can automate it in a couple of years” implies somewhat different priorities than “work on long-term research bets to eventually have them pay off through human labor”, notably:
The research direction needs to be actually pursued by the agents, either through the decision of the human leadership, or through the decision of AI agents that the human leadership defers to. This means that if some long-term research bet isn’t respected by lab leadership, it’s unlikely to be pursued by their workforce of AI agents.
This implies that a major focus of current researchers should be on credibility and having a widely agreed-on theory of change. If this is lacking, then the research will likely never be pursued by the AI agent workforce and all the work will likely have been for nothing.
Maybe there is some hope that, despite a research direction being unpopular among lab leadership, the AI agents will realize its usefulness on their own, and possibly persuade the lab leadership to let them expend compute on the research direction in question. Or maybe the agents will have so much free rein over research that they don’t even need to ask for permission to pursue new research directions.
Setting oneself up for providing oversight to AI agents. There might be a period during which agents are very capable at research engineering / execution but not research management, and during which leading AGI companies are eager to hire human experts to supervise large numbers of AI agents. If one doesn’t have legible credentials or good relations with AGI companies, one is less likely to be hired during this period.
Delaying engineering-heavy projects until engineering is cheap relative to other types of work.
(some of these push in opposite directions, e.g., engineering-heavy research outputs might be especially good for legibility)
I think you’re starting from a good question here (e.g. “Will the research direction actually be pursued by AI researchers?”), but have entirely the wrong picture; lab leaders are unlikely to be very relevant decision-makers here. The key is that no lab has a significant moat, and the cutting edge is not kept private for long, and those facts look likely to remain true for a while. Assuming even just one cutting-edge lab continues to deploy at all like they do today, basically-cutting-edge models will be available to the public, and therefore researchers outside the labs can just use them to do the relevant research regardless of whether lab leadership is particularly invested. Just look at the state of things today: one does not need lab leaders on board in order to prompt cutting-edge models to work on one’s own research agenda.
That said, “Will the research direction actually be pursued by AI researchers?” remains a relevant question. The prescription is not so much about convincing lab leadership, but rather about building whatever skills will likely be necessary in order to use AI researchers productively oneself.
Labs do have a moat around compute. In the worlds where automated R&D gets unlocked I would expect compute allocation to substantially pivot, making non-industrial automated research efforts non-competitive.
“Labs” are not an actor. No one lab has a moat around compute; at the very least Google, OpenAI, Anthropic, xAI, and Facebook all have access to plenty of compute. It only takes one of them to sell access to their models publicly.
Sure, but I think that misses the point I was trying to convey. If we end up in a world similar to the ones forecast in AI 2027, the fraction of compute that labs allocate towards speeding up their own research threads will be larger than the fraction they sell for public consumption.
My view is that even in worlds with significant speed-ups in R&D, we still ultimately care about the relative speed of progress on scalable alignment (in the Christiano sense) compared to capabilities & prosaic safety; it doesn’t matter if we finish quicker if catastrophic AI is finished quickest. Thus, an effective theory of change for speeding up long-horizon research would still route through convincing lab leadership of the pursuit-worthiness of those research streams.
I have no idea what you’re picturing here. Those sentences sounded like a sequence of non sequiturs, which means I’m probably completely missing what you’re trying to say. Maybe spell it out a bit more?
Some possibly-relevant points:
The idea that all the labs focus on speeding up their own research threads rather than serving LLMs to customers is already pretty dubious. Developing LLMs and using them are two different skillsets; it would make economic sense for different entities to specialize in those things, with the developers selling model usage to the users just as they do today. More capable AI doesn’t particularly change that economic logic. I wouldn’t be surprised if at least some labs nonetheless keep things in-house, but all of them?
The implicit assumption that alignment/safety research will be bottlenecked on compute at all likewise seems dubious at best, though I could imagine an argument for it (routing through e.g. scaling inference compute).
It sounds like maybe you’re assuming that there’s some scaling curve for (alignment research progress as a function of compute invested) and another for (capabilities progress as a function of compute invested), and you’re imagining that to keep the one curve ahead of the other, the amount of compute aimed at alignment needs to scale in a specific way with the amount aimed at capabilities? (That model sounds completely silly to me, that is not at all how this works, but it would be consistent with the words you’re saying.)
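For concreteness, here is a minimal toy sketch of the scaling-curve model I’m attributing in that last point. The curve shapes, constants, and compute numbers are all invented; the sketch is only meant to make the attributed model explicit, not to endorse it.

```python
# Toy sketch only: invented log-shaped "progress vs. compute" curves, used to
# make the attributed model explicit. None of the numbers mean anything.
import math

def alignment_progress(compute):
    # hypothetical curve: diminishing returns in compute
    return 1.0 * math.log10(compute)

def capabilities_progress(compute):
    # hypothetical curve: same shape, slightly steeper
    return 1.2 * math.log10(compute)

def min_alignment_fraction(total_compute, steps=10_000):
    """Smallest fraction of total compute that keeps the alignment curve at or
    above the capabilities curve, under the toy curves above."""
    for i in range(1, steps):
        f = i / steps
        if alignment_progress(f * total_compute) >= capabilities_progress((1 - f) * total_compute):
            return f
    return None  # no split keeps alignment ahead under this toy model

for total in (1e6, 1e9, 1e12):
    print(f"total compute {total:.0e} -> min alignment fraction {min_alignment_fraction(total)}")
```

With these made-up constants, the “required” alignment fraction creeps toward 1 as total compute grows, which is exactly the kind of conclusion that makes me think this is not how the situation actually works.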
I can maybe see it. Consider the possibility that the decision to stop providing public access to models past some capability level is convergent: e. g., the level at which they’re extremely useful for cyberwarfare (with jailbreaks still unsolved), such that serving the model would drown the lab in lawsuits/political pressure; or the point at which spinning up an autonomous business competitive with human businesses, or making LLMs cough up novel scientific discoveries, becomes trivial (i. e., the skill level required for using AI for commercial success plummets – which would start happening inasmuch as AGI labs succeed in moving LLMs to the “agent” side of the “tool/agent” spectrum).
In those cases, giving public access to SOTA models would stop being the revenue-maximizing thing to do. It’d either damage your business reputation[1], or it’d simply become more cost-effective to hire a bunch of random bright-ish people and get them to spin up LLM-wrapper startups in-house (so that you own 100% stake in them).
Some loose cannons/open-source ideologues like DeepSeek may still provide free public access, but those may be few and far between, and significantly further behind. (And getting progressively scarcer; e. g., the CCP probably won’t let DeepSeek keep doing it.)
Less extremely, AGI labs may move to a KYC-gated model of customer access, such that only sufficiently big, sufficiently wealthy entities are able to get access to SOTA models. Both because those entities won’t do reputation-damaging terrorism, and because they’d be the only ones able to pay the rates (see OpenAI’s maybe-hype maybe-real whispers about $20,000/month models).[2] And maybe some EA/R-adjacent companies would be able to get in on that, but maybe not.
Also,

no lab has a significant moat, and the cutting edge is not kept private for long, and those facts look likely to remain true for a while

This is a bit flawed, I think. The situation is that runners-up aren’t far behind the leaders in wall-clock time. Inasmuch as progress is gradual, that translates to runners-up being not that far behind the leaders in capability level. But if AI-2027-style forecasts come true, with capability progress accelerating, a 90-day gap may become a “GPT-2 vs. GPT-4”-level gap. In which case alignment researchers having privileged access to true-SOTA models becomes important.
(Ideally, we’d have some EA/R-friendly company already getting cozy with e. g. Anthropic, so that it can be first in line for access to potential future research-level models and able to provide that access to a diverse portfolio of trusted alignment researchers...)
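To put rough numbers on the wall-clock-lag point above, here is a back-of-the-envelope sketch. The “capability doubling times” are invented purely for illustration; the only point is that a fixed calendar lag corresponds to a much bigger capability gap once progress speeds up.

```python
# Back-of-the-envelope sketch: how much a fixed 90-day wall-clock lag "costs"
# in capability terms, under invented capability-doubling times.

def doublings_behind(doubling_time_days, lag_days=90):
    """Number of capability doublings a fixed wall-clock lag corresponds to."""
    return lag_days / doubling_time_days

# Gradual progress: capabilities "double" every ~12 months.
print(doublings_behind(365))  # ~0.25 doublings: runners-up barely behind

# Accelerating, AI-2027-style progress: a doubling every ~2 weeks.
print(doublings_behind(14))   # ~6.4 doublings: a qualitatively larger gap
```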
[1] Even if the social benefits of public access would’ve strictly outweighed the harms on a sober analysis, the public outcry at the harms may be significant enough to make the idea commercially unviable. Asymmetric justice, etc.
[2] Indeed, do we know it’s not already happening? I can easily imagine some megacorporations having had privileged access to o3 for months.
hire a bunch of random bright-ish people and get them to spin up LLM-wrapper startups in-house (so that you own 100% stake in them).

I doubt it’s really feasible. These startups will require significant infusions of capital, so AI companies’ CEOs and CFOs will have a say in how they develop. But tech CEOs and CFOs have no idea how development in other industries works, or why it is slow, so they will mismanage such startups.

P. S. Oh, and I also realized the other day: whether you are an AI agent or just a human, imagine the temptation to organize a Theranos-type fraud if the details of your activity are mostly secret and you only report to tech bros who believe in the power of AGI/ASI!

P. P. S. In the month since writing the previous comment I have read the following article by @Abhishaike Mahajan, and I believe it illustrates well why the non-tech world is so difficult for AI; I can recommend it: https://www.owlposting.com/p/what-happened-to-pathology-ai-companies