I really don’t think it’s crazy to believe that humans will figure out a way to control AGI, at least.
They want to, yes. But is it feasible?
One problem is that “AGI” is a misnomer: the road to superintelligence goes not via human equivalence, but around it. We are in a situation where AI systems are wildly superhuman along a larger and larger number of dimensions, while still being deficient along some important dimensions compared to humans, which prevents us from calling them “AGIs”; by the time they are no longer deficient along any important dimension, they will already be wildly superhuman along far too many dimensions.
Another problem is that a “narrow AGI” (in the sense defined by Tom Davidson, https://www.lesswrong.com/posts/Nsmabb9fhpLuLdtLE/takeoff-speeds-presentation-at-anthropic, so we are still talking about very much “sub-AGI” systems) is almost certainly sufficient for “non-saturating recursive self-improvement”, so one has a rapidly moving target for one’s control ambitions. It is also likely that reaching the “non-saturating recursive self-improvement” mode is not too difficult, so if one freezes one’s AI and prevents it from self-modification, others will bypass its capabilities.
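As a purely illustrative aside (nothing below comes from the thread or the linked post; the `run_loop` helper, the capability ceiling, and the rates are made-up assumptions), here is a minimal toy sketch of the difference between a self-improvement loop that saturates and one that does not:

```python
# Toy sketch only: two regimes for a capability self-improvement loop.
# "Saturating": each round's gain is proportional to the remaining headroom
# below a fixed ceiling, so gains shrink and the loop stalls.
# "Non-saturating": each round's gain scales with the capability already
# reached, so gains compound and the loop keeps accelerating.

def run_loop(step, c0=1.0, rounds=30):
    """Apply one self-improvement `step` repeatedly; return the trajectory."""
    levels = [c0]
    for _ in range(rounds):
        levels.append(step(levels[-1]))
    return levels

CEILING = 10.0  # hypothetical capability ceiling for the saturating regime
RATE = 0.2      # hypothetical per-round improvement rate

saturating = run_loop(lambda c: c + RATE * (CEILING - c))  # gains shrink near the ceiling
non_saturating = run_loop(lambda c: c * (1 + RATE))        # gains compound with capability

print(f"saturating loop after 30 rounds:     {saturating[-1]:.1f}")      # ~10.0, stuck at the ceiling
print(f"non-saturating loop after 30 rounds: {non_saturating[-1]:.1f}")  # ~237.4, still accelerating
```

The worry above is that the compounding, non-saturating regime may be reachable by systems well short of “AGI”, which is what makes it such a fast-moving target for control.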
In 2023 Ilya was sounding like he had a good grasp of these complexities, and he was clearly way above par in the quality of his thinking about AI existential safety: https://www.lesswrong.com/posts/TpKktHS8GszgmMw4B/ilya-sutskever-s-thoughts-on-ai-safety-july-2023-a
Of course, it might just be the stress of this very adversarial situation, talking to hostile lawyers, with his own lawyer pushing him hard to say as little as possible, so I would hope this is not a reflection of any genuine evolution in his thinking. But we don’t know...
I also think it’s possible that the U.S. and China might already be talking behind the scenes about a superintelligence ban.
Even if they are talking about this, too many countries and orgs are likely to have a feasible route to superintelligence. For example, Japan is one of those countries (they have Sakana AI, for one), and their views on superintelligence are very different from our Western views, so it would be difficult to convince them to join a ban; e.g. quoting from https://www.lesswrong.com/posts/Yc6cpGmBieS7ADxcS/japan-ai-alignment-conference-postmortem:
A second difficulty in communicating alignment ideas was based on differing ontologies. A surface-level explanation is that Japan is quite techno-optimistic compared to the west, and has strong intuitions that AI will operate harmoniously with humans. A more nuanced explanation is that Buddhist- and Shinto-inspired axioms in Japanese thinking lead to the conclusion that superintelligence will be conscious and aligned by default. One senior researcher from RIKEN noted during the conference that “it is obviously impossible to control a superintelligence, but living alongside one seems possible.” Some visible consequences of this are that machine consciousness research in Japan is taken quite seriously, whereas in the West there is little discussion of it.
Other countries which are contenders include the UK, a number of European countries including Switzerland, Israel, Saudi Arabia, the UAE, Singapore, South Korea, and, of course, Brazil and Russia, and I doubt this is a complete list.
We are already seeing recursive self-improvement efforts taking longer to saturate, compared to their behavior a couple of years ago. I doubt they’ll keep saturating for long.
Those are all good points. Well, I hope these things are nice.
Same here :-)
I do see feasible scenarios where these things are sustainably nice.
But whether we end up reaching those scenarios… who knows...
Another reply; sorry, I just think what you said is super interesting. The insight you shared about Eastern spirituality affecting attitudes towards AI is beautiful. I do wonder if our own Western attitudes towards AI are due to our flawed spiritual beliefs, particularly the idea of a wrathful, judgemental Abrahamic god. I’m not sure if it’s a coincidence that someone who was raised as an Orthodox Jew (Eliezer) came to fear AI so much.
On another note, the Old Testament is horrible (I was raised Reform/Californian Jewish; I guess I’m just mentioning this because I don’t want to come across as antisemitic). It imbues what should be the greatest source of beauty with our weakest, most immature impulses. The New Testament’s emphasis on mercy is a big improvement and beautiful, but even then I don’t like the Book of Revelation talking about casting the sinners into a lake of fire.
I think we do tend to underestimate differences between people.
We know theoretically that people differ a lot, but we usually don’t viscerally feel how strong those differences are. One of the most remarkable examples of that is described here:
https://www.lesswrong.com/posts/NyiFLzSrkfkDW4S7o/why-it-s-so-hard-to-talk-about-consciousness
With AI existential safety, I think our progress is so slow because people mostly pursue anthropocentric approaches. Just like with astronomy, one needs a more invariant point of view to make progress.
I’ve done a bit of scribbling along those lines: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential
But that’s just a starting point, a seed of what needs to be done in order to make progress…