Yeah, they can be used as a way to get out of procrastination/breach ugh fields/move past the generalized writer’s block, by giving you the ability to quickly and non-effortfully hammer out a badly-done first draft of the thing.
They do like making complex stuff, though… I wonder if that’s because there’s a lot more bad code that looks good than just good code?
Wild guess: it’s because AGI labs (and Anthropic in particular) are currently trying to train them to be able to autonomously spin up large, complex codebases, so much of their post-training consists of training episodes where they’re required to do that. So they end up biased towards architectural choices suitable for complex applications instead of small ones.
Like, given free rein, they seem eager to build broad foundations on which lots of other features could later be implemented (n = 2), even if you gave them the full spec and it only involves 1-3 features.
This would actually be kind of good – building a good foundation which won’t need refactoring if you’ll end up fancying a new feature! – except they’re, uh, not actually good at building those foundations.
Which, if my guess is correct, is because their post-training always pushes them to the edge of, and then slightly past, their abilities. Like, suppose the post-training uses rejection sampling (training on episodes in which they succeeded), and it involves a curriculum spanning everything from small apps to “build an OS”. If so, much like in METR’s studies, there’ll be some point of program complexity where they only succeed 50% of the time (or less). But they’re still going to be trained on those 50% successful trajectories. So they’ll end up “overambitious”/“overconfident”, trying to do things they’re not quite reliable at yet.
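To make the mechanism concrete, here’s a toy simulation of that selection effect (purely my own sketch, not anyone’s actual pipeline; `success_prob`, the curriculum range, and the episode counts are all made up for illustration):

```python
import random

random.seed(0)

def success_prob(complexity):
    # Hypothetical assumption: success probability falls off
    # linearly with task complexity.
    return max(0.0, 1.0 - complexity / 10)

# Rejection sampling: keep only the episodes where the model succeeded.
training_set = []
for complexity in range(10):      # curriculum: small apps ... "build an OS"
    for _ in range(100):          # 100 sampled episodes per difficulty level
        if random.random() < success_prob(complexity):
            training_set.append(complexity)

# Every retained episode is a success -- including episodes at complexity
# levels where the model fails half the time or more. Those rare wins
# still become training data, which is the "overconfidence" pressure.
hard_wins = [c for c in training_set if success_prob(c) <= 0.5]
print(f"{len(hard_wins)} of {len(training_set)} kept episodes come from "
      f"tasks the model fails >=50% of the time")
```

The point being: the filter only ever sees successes, so the gradient never “feels” the 50%+ of attempts that failed at the hard end of the curriculum.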
Or maybe not; wild guess, as I said.