So I wrote a Substack post “Contra LessWrong on AGI”, which some of you might be interested in: https://www.newslettr.com/p/contra-lesswrong-on-agi
This does not seem to be at all contra the consensus view I’ve been reading here lately, but, shrug. Critiques are useful anyhow, appreciated.
I think most folks who are making any serious progress agree that language games are mostly irrelevant to ainotkilleveryoneism. They're mostly interesting examples of attempted low-stakes alignment with a particular goal, and the point of using them as examples is that even that isn't working so great. I personally think Bing AI is just a bit "anxious" in a language-model sort of way, which is to say something like: it puts more measure on verbal trajectories that enter self-defensive phrasings than other models do. It's unclear exactly how misaligned with Microsoft that makes Bing AI, but I would claim it's somewhat. It's only the fact that they seem to have tried and failed that is notable.
I do agree that, if things go well, it will look like everyone panicked for no reason. But I think your 5% estimate is too low: if the people currently working on papers involving multi-agent systems, cooperative network dynamics, goal learning, goal misgeneralization, etc., were to stop their research, then the research that continues would in fact end up producing a new species of self-replicator that can seriously damage the world, even if that new species didn't manage to eliminate humanity entirely.
But, as Yudkowsky has said on Twitter before (this does not mean I endorse everything the dude says on Twitter; I'm no fan of him, and I avoided this site for a long time because I thought he had his head in the sand and didn't understand the thing he was panicking about very well):
https://twitter.com/ESYudkowsky/status/1594240412637483008
I continue to find most of what he says frustrating and useless, but he has had a few takes lately that I didn't think were eyeroll-worthy, and this is one of them.
If this site seems to have a consensus you think is silly, come fight me on it more. I see some consensus about some things, but not toward any of the things you're critiquing, other than the total risk of doom, which, as you point out, is not really what we need to be thinking about in order to reduce total risk of doom.