You mean why I think it’s good to claim that AGI is here given that I believe it is?
My theory is that saying this might wake some people up to the current situation and get them to act in ways that reduce existential risk.
Oh interesting. That’s not why it seemed important to me.
I thought this was roughly priced in, and it still seems like 4.6 & co can’t actually do many types of cognitive tasks. And in terms of “it can qualitatively do the things that you need to invent AGI now, whereas it couldn’t before”, I don’t actually know that that’s changed (much).
I would have thought last year’s Opus 4 was able to do the range of things Opus 4.6 is able to do, given how competent and planning-y and metacognition-y it seemed at coding. But then, when you put it in situations that were remotely outside its wheelhouse, it sputtered and got confused and flail-y. I eventually updated to “okay, clearly something interesting is happening here, but it’s more like it has very-narrow-domain-specialized metacognition.”
Outside those narrow domains, Opus 4 has metacognition, but none of the actual skills it needs to connect that metacognition to useful things.
I think Opus 4.6 is another step along the chain, where it has many more narrow-focused-stacks-of-skills in coding that all reinforce each other, and also it’s probably at least somewhat better at metacognition overall. But, I would bet against it turning out to be good enough at non-code non-math domains.
(has anyone tried running Openclaw agent swarms with overseers and detect-loop watchers with Opus 4? I am curious if they could have handled your big one-shot tasks)
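(For concreteness, here’s a minimal sketch of what I mean by a “detect-loop watcher”. Everything in it, from the class name to the thresholds to the escalation step, is hypothetical illustration rather than Openclaw’s actual API; the idea is just an overseer-side check that notices when an agent keeps repeating near-identical actions.)

```python
from collections import deque

class LoopWatcher:
    """Flags an agent that keeps emitting near-identical actions.

    A crude proxy for "the model is flailing": if the last `window`
    actions contain fewer than `min_distinct` distinct values, assume
    the agent is stuck and an overseer should interrupt it.
    (All names and thresholds here are hypothetical illustrations.)
    """

    def __init__(self, window=6, min_distinct=3):
        self.window = window
        self.min_distinct = min_distinct
        self.recent = deque(maxlen=window)

    def observe(self, action):
        """Record one agent action; return True if the agent looks stuck."""
        self.recent.append(action)
        if len(self.recent) < self.window:
            return False  # not enough history to judge yet
        return len(set(self.recent)) < self.min_distinct

# Usage: an overseer loop would call observe() on each agent step and
# escalate (re-prompt, reassign, or halt the agent) when it fires.
watcher = LoopWatcher()
trace = ["ls", "cat foo.py", "ls", "ls", "cat foo.py", "ls"]
for step, action in enumerate(trace):
    if watcher.observe(action):
        print(f"step {step}: agent looks stuck; escalating to overseer")
```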
...
Nonetheless, I think Opus 4.6 + “current gen Cursor scaffold” is sufficiently good at enough different things, with enough long-term planning, that I’m like:
“Okay, I feel like they have some kind of ‘complete stack’ here, but, due to the jagged frontier, the set of domains their stack is minimum-viable at is different from humans’.” (My example wouldn’t be “make a peanut butter sandwich”, it’d be “navigating abstract domains with bad feedback loops.”)
The thing I think is significant about this is “We have left the domain where ‘AGI’ is a particularly useful discriminator, and we need better ontology in order to navigate what’s coming next.”
I also think it’s a useful time to say, to everyone who’d been vaguely dismissing this stuff as not real: “Bro, it is real. Whatever you were waiting for to think Shit’s Real, it’s clearly here by now.” Which is closer to what you had in mind, I think. But I still don’t have particular things in mind for most people to do, if they weren’t the sort of person to have already figured out it was real last year.
I expect the main thing most people can do is apply pressure on their governments to take policy action. Making that happen is no small feat; it’s mostly a matter of building enough awareness that everyone knows that everyone knows it’s real, and that there are options to stop it until we finish more safety work. Coordination at this scale isn’t just a few people acting; it’s getting people to organically arrive at the ideas and apply pressure, and that requires sufficiently credible signals: it being real, people saying it’s real, and everyone believing it.
Nod. But trying to warn “AGI is literally here” feels kinda like the wrong move to me anyways.
The move I would make is “AI keeps improving in ways that are on the path to generalization and strategic awareness. Here is where it was 3 years ago. Here’s where it was last year. Here’s where it was last month.” I think that’s consistently alarming whether or not people agree on what counts as AGI. (And every few months there are more alarming things to point at.)
I think it’s currently at the point where people paying attention should notice “this sure doesn’t seem to obviously NOT be AGI”, but I think it’s still at a point where crying “AGI” might leave people underwhelmed and then cause Boy Who Cried Wolf syndrome. (And meanwhile, just focusing on its object-level capabilities seems more robustly good.)