SOTA models will take thousands of dollars per month to run at maximum intelligence and thoughtfulness, and most people won’t pay for that.
So do you disagree with the capability description of Agent-3-mini then?
We say: “It blows the other AIs out of the water. Agent-3-mini is less capable than Agent-3, but 10x cheaper, and still better than the typical OpenBrain employee.” And presumably, at remote-work jobs besides AI research, it is more likely at the level of the median employee or lower, but still quite capable. So while maximum performance will of course require very high spend, we are projecting that by July quite high capabilities will be available fairly cheaply.
Sorry, I should have been clearer. I do agree that high capabilities will be available relatively cheaply. I expect Agent-3-mini models slightly later than the scenario depicts due to various bottlenecks and random disruptions, but showing up slightly later isn’t relevant to my point here. My point was that I expect that even in the presence of high-capability models there still won’t be much social consensus, in part because the technology will still be unevenly distributed and our ability to form social consensus is currently quite bad. This means that some people will theoretically have access to Agent-3-mini, but they’ll do some combination of ignoring it, focusing on what it can’t do, and implicitly assuming that it’s about the best AI will ever be. Meanwhile, other people will be good at prompting, have access to high-inference-cost frontier models, and will be future-oriented. These two groups will have very different perceptions of AI, and those differing perceptions will lead each group to think the other is insane, with society unable to get on the same page except on some basics, like “take-home programming problems are not a good way to test potential hires.”
I don’t know if that makes sense. I’m not even sure if it’s incompatible with your vision, but I think the FUD, fog-of-war, and lack of agreement across society will get worse in coming years, not better, and that this trend is important to how things will play out.
If so much effort is being focused on AI research capability, I’d actually expect the modal outcome to be an Agent-3 that is better than the typical OpenBrain employee but incapable of replacing the vast majority of employees in other fields. “Capabilities are spiky” is a clear fact about current frontier AI, and your scenario seems to underestimate it.
This is a good point, and I think it meshes with my point about the lack of consensus on how powerful AIs are.
“Sure, they’re good at math and coding. But those are computer things, not real-world abilities.”