Research Coordinator of area “Do Not Build Uncontrollable AI” for AI Safety Camp.
See explainer on why AGI could not be controlled enough to stay safe:
https://www.lesswrong.com/posts/xp6n2MG5vQkPpFEBH/the-control-problem-unsolved-or-unsolvable
This answer will sound unsatisfying:
If a mathematician or analytical philosopher wrote a bunch of squiggles on a whiteboard, and said it was a proof, would you recognise it as a proof?
Say that unfamiliar new analytical language and means of derivation are used (which is not uncommon for impossibility proofs by contradiction, see Gödel’s incompleteness theorems and Bell’s theorem).
Say that it directly challenges technologists’ beliefs about their capacity to control technology, particularly their capacity to constrain a supposedly “dumb local optimiser”: evolutionary selection.
Say that the reasoning is not only about a formal axiomatic system, but needs to make empirically sound correspondences with how real physical systems work.
Say that the reasoning is not only about an interesting theoretical puzzle, but has serious implications for how we can and cannot prevent human extinction.
This is high stakes.
We were looking for careful thinkers who had the patience to spend time on understanding the shape of the argument, and how the premises correspond with how things work in reality. Linda and Anders turned out to be two of these people, and we have had three long calls so far (the first call has an edited transcript).
I wish we could short-cut that process. But if we cannot manage to convey the overall shape of the argument and the premises, then there is no point in moving on to how the reasoning is formalised.
I get that people are busy with their own projects, and want to give their own opinions about what they initially think the argument entails. But if the time they commit to understanding the argument is not at least 1⁄5 of the time I spend on conveying the argument specifically to them, then in my experience we usually lack the shared bandwidth needed to work through it.
Saying, “guys, big inferential distance here” did not help. People will expect it to be a short inferential distance anyway.
Saying it’s a complicated argument that takes time to understand did not help. A smart, busy researcher did some light reading, tracked down a claim that seemed “obviously” untrue within their mental framework, and thereby confidently dismissed the entire argument. BTW, they’re a famous research insider, and we’re just outsiders whose response got downvoted – so we must be wrong, right?
Saying everything in this comment does not help either. It’s just a long-winded plea for your patience.
If I’m so confident about the conclusion, why am I not passing you the proof clean and clear now?!
Feel free to downvote this comment and move on.
Here is my best attempt at summarising the argument intuitively and precisely, which still prompted some misinterpretations by well-meaning commenters. I appreciate the people who realised what is at stake, and were therefore willing to continue syncing up on the premises and reasoning, as Will did: