This comment reads as kind of unhelpfully intellectual to me? “What would it take to fix the door?” “Well it would involve replacing the broken hinge.” “Ah, that’s framing the door being fixed as a binary, really you should claim that replacing the broken hinge has a certain probability of fixing the door.”
I get that in life we don’t have certainties, and many good & relevant policies mitigate different parts of the risk. But I think the point is to move toward a mechanistic model of what the problem is and what its causes are, and often talking about probabilities isn’t appropriate for that and essentially just adds cognitive overhead.
I just have no idea what Eliezer’s question even means. What do you think it means?
“What is a policy that could be implemented by a particular government, or an international treaty signed by many governments, such that you would no longer think that working on preventing extinction-or-similar from AGI was the top priority (or close to the top priority) for our civilization?”
Another phrasing: “such that you would no longer be concerned that, this century, humanity would essentially lose control over the future by getting outcompeted by an AGI with totally different values?”
I think I could understand you not getting what the question means if, in your model of the future, all routes are crazy and pass through AGI takeover of some sort, and our full-time job regardless of what happens is to navigate that. Like, there isn’t cleanly a ‘safe’ world; in all the worlds, our full-time job is dealing with the alignment problems of AGIs.
(Edited down to cut extraneous text.)
I’m surprised that’s the question. I would guess that’s not what Eliezer means, because he says Dath Ilan is responding sufficiently to AI risk but also hints that Dath Ilan still spends a significant fraction of its resources on AI safety (I’ve only read a fraction of the work here, so maybe I’m wrong). I have a background belief that the largest problems don’t change that much: it’s rare for a problem to go from the #1 problem to not even in the top 10, and most things have diminishing returns such that it isn’t worthwhile to solve them that thoroughly.

An alternative definition that’s spiritually similar, and that I like more, is: “What policy could governments implement such that improving AI x-risk policy would no longer be the #1 priority, if the governments were wise?” This isolates AI / puts it in context of other global problems, so that the AI solution doesn’t need to prevent governments from changing their minds over the next 100 years, or do whatever else needs to happen for the next 100 years to go well.
Fixing doors is so vastly easier than predicting the future that analogies and intuitions don’t transfer.
Compare someone asking in 1875, 1920, 1945, or 2025, “What is the minimum necessary and sufficient policy that you think would prevent Germany invading France in the next 50 years?” The problem is non-binary, there are no guarantees, and even the definitions are treacherous. I wouldn’t ask the question that way.
Instead I might ask, “What policies best support peace between France and Germany, and how?” That way we can talk mechanistically without the distraction of “minimum”, “necessary”, “sufficient”, and “prevent”.
Separately, I do not want anyone to be thinking of minimum policies here. There is no virtue in doing the minimum necessary to prevent extinction.