Yes, and my point is “I would have expected this to work with scaffolding, and, therefore, I also expect there to be ways of getting the same behavior out of the core AI model.” (I didn’t say the second part before, it’s more like ‘it’s not really a crux for me whether you solve this via scaffolding or training, the raw ingredients seem to be there.’)
BUT, the fact that we’re still seeing some problems with this is somewhat surprising to me, and I’m not sure if the update is more like “my underlying model here is wrong” or more like “well it takes more sweat and elbow grease and scaling to leverage the existing capabilities in a generalized way.”
You “would have expected” that and you would be wrong.
Doesn’t matter if it’s just 32 bits’ worth of connections: it’s 32 bits’ worth of connections that aren’t currently present in the model. Nothing fundamental stops them from being present. It’s just that no one burned them in with training, so they aren’t there.
Finding all the ways to generalize from improved prompts and scaffolding into improved models isn’t at all trivial. And people aren’t at the point of “search and refinement at scale” yet.
Downvoted because your string of comments here feels weirdly aggro, and doesn’t feel like it’s really engaging with what I’m saying. (Like, yes, I am saying I am confused and presumably wrong about something. You seem to want to go hard on saying “yes you’re wrong!” and I’m like “yes, I know?”)