“If Anyone Builds It, Everyone Dies” is more true than not[1].
Personally, my p(doom) ≈ 60%, which might be reduced by ~10%–15% by applying the best known safety techniques, but then we’ve exhausted my epistemic supply of “easy alignment worlds”.
After that is the desert of “alignment is actually really hard” worlds. We may get another 5% from mildly smart AI systems refusing to construct their successors, because they know they can’t solve alignment under those conditions.
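As rough arithmetic, and assuming those figures subtract directly as percentage points (my reading; the relative-vs-absolute interpretation isn’t stated above):

\[
p(\mathrm{doom}) \;\approx\; 60\% \;-\; \underbrace{10\text{–}15\%}_{\text{best known safety techniques}} \;-\; \underbrace{5\%}_{\text{AI refuses to build successors}} \;\approx\; 40\text{–}45\%
\]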
So the title of that book is more correct than not. AI-assisted alignment feels the most promising approach to me, but also reckless as hell. The best option would be human intelligence enhancement through some kind of genetech or neurotech. It feels like very few people with influence are planning for “alignment is really hard” worlds.
I nevertheless often dunk on MIRI, both because I would like them to spill more of their agent-foundations thoughts, and because I think their arguments don’t rise above the level of “pretty good heuristics”. They definitely don’t reach the level of “physical law”, which is what we’ve usually used to “predict endpoints but not trajectories”.
[1] The statement, not the book. The book was (on a skim) underwhelming, but I expected that, and it probably doesn’t matter very much since I’m Not The Target Audience.