This is great. I recognize that this is almost certainly related to the book “If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All”, which I have preordered, but as a standalone piece I feel estimating p(doom) > 90 and dismissing alignment-by-default without an argument is too aggressive.
> as a standalone piece … estimating p(doom) > 90 … without an argument is too aggressive
The alternative of claiming some other estimate (by people who actually estimate p(doom) > 90) would be dishonest, and the alternative of only ever giving book-length arguments until the position is more popular opposes lightness (as a matter of social epistemic norms) and makes it more difficult for others to notice that someone is actually making such estimates.
Thanks! It was hard to make; glad to finally show it off, and double glad people appreciate it.
Noting that the piece does not contain the words ‘p(doom) > 90’, in case someone just reads the comments and not the post, and then decides to credit MIRI with those exact words.
I would also note that most proponents of alignment by default either:
1. Give it pretty slim odds.
2. Would have disagreements with points in this piece that are upstream of their belief in alignment by default (this is my guess for, e.g., Alex, Nora, Quintin).
I think quibbling over the details with people in camp [1] is kind of a waste of time, since our positions are so, so close together that we should really focus on accomplishing our shared goal (avoiding literally everyone literally dying) rather than on reaching 100 percent consensus over a thing we both agree is, at best, very unlikely. If an ally of mine wants to spend a sliver of their time thinking about the alignment-by-default worlds, I won't begrudge them that (but I also won't be participating).
To the extent someone is in camp [2], I'd like it if they pointed out their disagreements with the contents of the post, or said why they take the arguments to be weak and what they think the better argument is, rather than asking 'but what about alignment by default?' My answer to the question 'but what about alignment by default?' is 'The Problem'.
[other contributors to the work certainly disagree with me in various places]
> If we were to put a number on how likely extinction is in the absence of an aggressive near-term policy response, MIRI’s research leadership would give one upward of 90%.
This is what I interpreted as implying p(doom) > 90%, but the quoted estimate is conditional on the absence of a policy response; reading it as an unconditional p(doom) would require assuming that someone advocating for "an aggressive near-term policy response" believes it has a ~0% chance of happening, which is clearly a misreading.
I am in camp 2, but will try to refine my argument more before writing it down.
I was pushing back on ‘p(doom)’ as an ambiguous construction that different people bake different conditionals into, and attempting to protect against people ripping things out of context if they hadn’t even seen the line you were referencing.
Oh yeah, I also find that annoying.