Garry Kasparov would beat me at chess in some way I can’t predict in advance. However, if the game starts with half his pieces removed from the board, I will beat him by playing very carefully. The first above-human-level A.G.I. seems overwhelmingly likely to be down a lot of material—massively outnumbered, running on our infrastructure, starting with access to pretty crap, low-bandwidth actuators in the physical world, and with no legal protections (yes, this actually matters when you’re not as smart as ALL of humanity—it’s a disadvantage relative to even the average human). If we exercise even a modicum of competence, its position will be tougher still (e.g. an air gap, dedicated slightly-weaker controllers, thoughts exposed at some granularity). If the chess metaphor holds, we should expect the first such A.G.I. not to beat us—though it may well attempt to escape under many incentive structures. Does this mean we should expect to have many tries to solve alignment?
If you think not, it’s probably because of some disanalogy with chess. For instance, the search space in the real world is much richer, and maybe there are always some “killer moves” available if you’re smart enough to see them (e.g. inventing nanotech). This seems to tie in with people’s intuitions about A) how fragile the world is and B) how g-loaded the game of life is. Personally I’m highly uncertain about both, but I suspect the answers are “somewhat.”
I would guess that an A.G.I. that only wants to end the world might be able to pull it off with slightly superhuman intelligence, which is very scary to me. But I think it would actually be very hard to bootstrap all singularity-level infrastructure from a post-apocalyptic wasteland, so perhaps this is not actually a convergent instrumental subgoal at this level of intelligence.
Is life actually much more g-loaded than chess? In terms of how far you can in principle multiply your material, unequivocally yes. However, life is also more stochastic—I will never beat Garry Kasparov in a fair game, but if Jeff Bezos and I each started over today with ~0 dollars, no name recognition, and average connections, I think there’s a better-than-1% chance I’m richer in a year. It’s not immediately clear to me which view is more relevant here.
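To make the stochasticity point slightly more concrete, here is a toy Monte Carlo sketch (all numbers are arbitrary assumptions for illustration, not estimates about chess, wealth, or A.G.I.): the same fixed skill gap is nearly decisive in a low-noise game and only a modest edge in a high-noise one.

```python
# Toy Monte Carlo: how much does outcome noise let the weaker player win?
# The skill gap and noise levels below are made-up illustrative parameters.
import random

def weaker_win_rate(skill_gap: float, noise: float, trials: int = 100_000) -> float:
    """Fraction of contests the weaker player wins when each side's
    outcome is its skill plus zero-mean Gaussian noise."""
    wins = 0
    for _ in range(trials):
        strong = skill_gap + random.gauss(0, noise)
        weak = random.gauss(0, noise)
        if weak > strong:
            wins += 1
    return wins / trials

if __name__ == "__main__":
    gap = 2.0  # arbitrary skill advantage, in "standard units"
    for noise in (0.1, 1.0, 5.0):  # chess-like -> life-like
        print(f"noise={noise:>4}: weaker player wins "
              f"{weaker_win_rate(gap, noise):.1%} of the time")
```

Under these assumed parameters the weaker player essentially never wins at low noise, wins a few percent of the time at moderate noise, and wins a large minority of the time at high noise, which is the shape of the chess-versus-life contrast above.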