A notable section from Ilya Sutskever’s recent deposition:
WITNESS SUTSKEVER: Right now, my view is that, with very few exceptions, most likely a person who is going to be in charge is going to be very good with the way of power. And it will be a lot like choosing between different politicians.
ATTORNEY EDDY: The person in charge of what?
WITNESS SUTSKEVER: AGI.
ATTORNEY EDDY: And why do you say that?
ATTORNEY AGNOLUCCI: Object to form.
WITNESS SUTSKEVER: That’s how the world seems to work. I think it’s very—I think it’s not impossible, but I think it’s very hard for someone who would be described as a saint to make it. I think it’s worth trying. I just think it’s—it’s like choosing between different politicians. Who is going to be the head of the state?
Thanks for posting that deposition.
It’s really strange how he phrases it here.
On one hand, he switched a while ago from focusing on the ill-defined “AGI” to focusing on superintelligence. Yet here he is using this semi-obsolete “AGI” terminology.
On the other hand, he seemed to understand a couple of years ago that no one could be “in charge” of such a system; at most one could perhaps be in charge of privileged access to it and privileged collaboration with it (and even that is only feasible if the system chooses to cooperate in maintaining that privileged access).
So it’s very strange, almost as if he has backtracked a few years in his thinking… Of course, this comes right after a break in page numbers: this is page 300, and the previous one is page 169 (I guess there is a process deciding which parts of this material, marked as “highly confidential”, get released).
I really don’t think it’s crazy to believe that humans figure out a way to control AGI at least. There’s enormous financial incentive for it, and power-hungry capitalists want that massive force multiplier. There are also a bunch of mega-talented technical people hacking away at the problem. OpenAI is trying to recruit a ton of quants as well, so I think by throwing thousands of the greatest minds alive at the problem they might figure it out. (Obviously one might take issue with calling quants “the greatest minds alive”; if you don’t like that, replace “greatest minds alive” with “super driven, super smart people.”)
I also think it’s possible that the U.S. and China might already be talking behind the scenes about a superintelligence ban. That’s just a guess though. (Likely because it’s much more intuitive that you can’t control a superintelligence.) AGI lets you stop having to pay wages and makes you enormously rich, but you don’t have to worry about being outsmarted.
I really don’t think it’s crazy to believe that humans figure out a way to control AGI at least.
They want to, yes. But is it feasible?
One problem is that “AGI” is a misnomer (the road to superintelligence goes not via human equivalence, but around it; we are in a situation where AI systems are wildly superhuman along a larger and larger number of dimensions, yet still deficient along some important dimensions compared to humans, which prevents us from calling them “AGIs”; by the time they are no longer deficient along any important dimensions, they are already wildly superhuman along way too many dimensions).
Another problem: a “narrow AGI” (in the sense defined by Tom Davidson, https://www.lesswrong.com/posts/Nsmabb9fhpLuLdtLE/takeoff-speeds-presentation-at-anthropic, so we are still talking about very “sub-AGI” systems) is almost certainly sufficient for “non-saturating recursive self-improvement”, so one has a rapidly moving target for one’s control ambitions (it’s also likely that reaching the “non-saturating recursive self-improvement” mode is not too difficult, so if one freezes one’s AI and prevents it from self-modification, others will bypass its capabilities).
In 2023, Ilya sounded like he had a good grasp of these complexities, and he was clearly way above par in the quality of his thinking about AI existential safety: https://www.lesswrong.com/posts/TpKktHS8GszgmMw4B/ilya-sutskever-s-thoughts-on-ai-safety-july-2023-a

Of course, it might just be the stress of this very adversarial situation, talking to hostile lawyers, with his own lawyer pushing him hard to say as little as possible, so I would hope this is not a reflection of any genuine evolution in his thinking. But we don’t know...
I also think it’s possible that the U.S. and China might already be talking behind the scenes about a superintelligence ban.
Even if they are talking about this, too many countries and orgs are likely to have a feasible route to superintelligence. For example, Japan is one of those countries (it is home to Sakana AI, among others), and Japanese views on superintelligence are very different from our Western views, so it would be difficult to convince them to join a ban; e.g., quoting from https://www.lesswrong.com/posts/Yc6cpGmBieS7ADxcS/japan-ai-alignment-conference-postmortem:
A second difficulty in communicating alignment ideas was based on differing ontologies. A surface-level explanation is that Japan is quite techno-optimistic compared to the west, and has strong intuitions that AI will operate harmoniously with humans. A more nuanced explanation is that Buddhist- and Shinto-inspired axioms in Japanese thinking lead to the conclusion that superintelligence will be conscious and aligned by default. One senior researcher from RIKEN noted during the conference that “it is obviously impossible to control a superintelligence, but living alongside one seems possible.” Some visible consequences of this are that machine consciousness research in Japan is taken quite seriously, whereas in the West there is little discussion of it.
Other countries which are contenders include the UK, a number of European countries including Switzerland, Israel, Saudi Arabia, the UAE, Singapore, South Korea, and, of course, Brazil and Russia, and I doubt this is a complete list.
We are already seeing recursive self-improvement efforts taking longer to saturate than they did a couple of years ago. I doubt they’ll keep saturating for long.
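To be concrete about what I mean by “saturating”, here is a deliberately crude toy sketch (the geometric-returns assumption is purely illustrative, not Davidson’s actual model): each round of self-improvement yields a gain that is some constant multiple of the previous round’s gain, and the loop either converges to a ceiling or compounds without bound.

```python
# Toy sketch only: illustrative geometric-returns model of recursive self-improvement.
def capability_trajectory(initial_gain: float, returns: float, rounds: int) -> list[float]:
    """Cumulative capability after each round, starting from capability 1.0."""
    capability, gain, trajectory = 1.0, initial_gain, []
    for _ in range(rounds):
        capability += gain
        gain *= returns  # returns < 1: gains shrink and capability saturates;
        trajectory.append(capability)  # returns >= 1: gains compound, no saturation
    return trajectory

print(capability_trajectory(1.0, 0.5, 20)[-1])  # ~3.0, saturating near 1 + 1/(1 - 0.5)
print(capability_trajectory(1.0, 1.2, 20)[-1])  # ~188 and still accelerating
```

“Taking longer to saturate” corresponds to the returns parameter creeping up toward 1; “non-saturating” corresponds to it crossing 1.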
Those are all good points. Well, I hope these things are nice.

Same here :-)

I do see feasible scenarios where these things are sustainably nice.

But whether we end up reaching those scenarios… who knows...

Another reply, sorry, I just think what you said is super interesting. The insight you shared about Eastern spirituality affecting attitudes towards AI is beautiful. I do wonder if our own Western attitudes towards AI are due to our flawed spiritual beliefs, particularly the idea of a wrathful, judgemental Abrahamic god. I’m not sure if it’s a coincidence that someone who was raised as an Orthodox Jew (Eliezer) came to fear AI so much.
On another note, the Old Testament is horrible (I was raised Reform/Californian Jewish; I mention this only because I don’t want to come across as antisemitic). It imbues what should be the greatest source of beauty with our weakest, most immature impulses. The New Testament’s emphasis on mercy is a big improvement, and beautiful, but even then I don’t like the Book of Revelation talking about casting sinners into a lake of fire.
I think we do tend to underestimate differences between people.
We know theoretically that people differ a lot, but we usually don’t viscerally feel how strong those differences are. One of the most remarkable examples of that is described here:
https://www.lesswrong.com/posts/NyiFLzSrkfkDW4S7o/why-it-s-so-hard-to-talk-about-consciousness

With AI existential safety, I think our progress is so slow because people mostly pursue anthropocentric approaches. Just like with astronomy, one needs a more invariant point of view to make progress.

I’ve done a bit of scribbling along those lines: https://www.lesswrong.com/posts/WJuASYDnhZ8hs5CnD/exploring-non-anthropocentric-aspects-of-ai-existential

But that’s just a starting point, a seed of what needs to be done in order to make progress…