I agree that Anthropic will attempt to add more RL tasks and potentially update Mythos’s weights even after pre-training, and this could affect doubling times, but my point was that Mythos suggests you can in fact just scale up parameters and pre-training compute multiple times, and that the memes about compute/pre-training scaling being dead weren’t correct at all.
Indeed, 500x the compute of GPT-4 to train a GPT-6-level model via pre-training alone is probably achievable by 2028, if you were willing to forgo RL (though once deployment is accounted for, 2029 is more likely, with RL and inference soaking up compute). And if the scale-up of AI in 2026 is as potent as people believe (which is likely once the nerfed models release sometime this year), AI companies can earn enough revenue to comfortably build GPT-7 at 5000x the compute of GPT-4, which I suspect would be built in 2031.
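To make the implied magnitudes concrete, here is my own arithmetic on the multipliers above, using the commonly cited ~2e25 FLOP estimate for GPT-4’s pre-training run (an assumption on my part, not a figure from the discussion):

```python
# Rough illustration of the scaling claims above. The GPT-4 FLOP figure
# is an outside estimate (roughly 2e25 FLOP), not an official number.
gpt4_flop = 2e25
gpt6_flop = 500 * gpt4_flop    # hypothesized GPT-6-level run
gpt7_flop = 5000 * gpt4_flop   # hypothesized GPT-7-level run

print(f"GPT-6 run: {gpt6_flop:.0e} FLOP")
print(f"GPT-7 run: {gpt7_flop:.0e} FLOP ({gpt7_flop / gpt6_flop:.0f}x GPT-6)")
```

Note that 5000x GPT-4 is only a 10x step up from the GPT-6 run, which is part of why I expect the GPT-7 build to be a matter of revenue rather than a fundamentally new challenge.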
But there is something I want to say here: the shift to ever-larger post-training/RL could let us incrementally solve continual learning/learning on the job/continual weight updates, like brains do. One of the gifts of AI 2027 is that its January 2027 section points out that, modulo some very important details like catastrophic forgetting of earlier tasks when later tasks are trained in via RL, fast enough weight updates sustained for long enough via perpetually adding more RL tasks are essentially equivalent to human-level continual learning. And while Agent-2 isn’t quite there yet, it is on that path; it can already learn continuously from the world via its weights, just more slowly than humans:
> With Agent-1’s help, OpenBrain is now post-training Agent-2. More than ever, the focus is on high-quality data. Copious amounts of synthetic data are produced, evaluated, and filtered for quality before being fed to Agent-2. On top of this, they pay billions of dollars for human laborers to record themselves solving long-horizon tasks. On top of all that, they train Agent-2 almost continuously using reinforcement learning on an ever-expanding suite of diverse difficult tasks: lots of video games, lots of coding challenges, lots of research tasks. Agent-2, more so than previous models, is effectively “online learning,” in that it’s built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
I expect something like this to ultimately be done once continual learning is targeted, albeit at a slower pace than AI 2027 describes; a toy version of the loop is sketched below.
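For concreteness, here is a minimal sketch of the “never really finish training” loop the quote describes: each day, weights are updated on fresh rollouts from an ever-expanding task suite, with a replay buffer of earlier episodes mixed in to resist the catastrophic-forgetting caveat above. All names here (`Model`, `rl_update`, `generate_episodes`) are placeholders I made up for illustration, not any real training stack:

```python
import random

class Model:
    def __init__(self):
        self.weights = [0.0] * 8  # stand-in for billions of parameters

def rl_update(model, episodes, lr=0.01):
    """Placeholder gradient step: nudge weights using episode rewards."""
    for reward in episodes:
        for i in range(len(model.weights)):
            model.weights[i] += lr * reward

def generate_episodes(task, n=4):
    """Stand-in for rollouts of yesterday's model on one RL task."""
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

model = Model()
task_suite = ["video_games", "coding_challenges", "research_tasks"]
replay_buffer = []  # episodes from earlier tasks, replayed against forgetting

for day in range(365):
    task_suite.append(f"new_task_{day}")       # ever-expanding task suite
    fresh = generate_episodes(task_suite[-1])  # data from the previous day's weights
    replayed = random.sample(replay_buffer, min(len(replay_buffer), len(fresh)))
    rl_update(model, fresh + replayed)         # mix new and old to retain old skills
    replay_buffer.extend(fresh)
```

Replay is only one of several mitigations for forgetting; the point of the sketch is just that daily weight updates on an expanding task suite, plus some retention mechanism, already resemble continual learning in slow motion.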
I’d make an even stronger claim: the quoted claim is basically false for a lot of things parents would hope to persuade their children away from. The actual reason we have laws against, say, conversion therapy isn’t that we’re worried conversion therapists or parents will succeed in persuading people out of being LGBTQ; it’s to prevent people from torturing LGBTQ children, or creating environments where those children will reliably kill themselves, in the attempt to convert them.
The same goes for anti-grooming and anti-abuse laws: in practice, they prevent people from using abusive tactics to pressure a child into sex or into whatever else a parent or other adult wants. The concern isn’t that the child will be persuaded, but that people will damage the child’s health in the attempt to persuade them.
One of my core takeaways from the heritability of a lot of traits is that parents have very little control over what their child will actually be like, and most claims that parents can reliably persuade a child into something that cuts against their genetics are bullshit.
On the original topic of AIs being able to easily persuade humans to change their preferences: I do agree that if you let technology advance far enough, humans could be persuaded into arbitrary things, so some form of paternalism is likely to be necessary in the long run. But that thankfully lies well past the critical period where we need to handle x-risks from AI; by then, x-risks will either have already played out into full-blown existential catastrophe or we’ll have reached existential security. So superpersuasion/massively changing people’s preferences mostly doesn’t matter from my perspective.