I don’t really see why this is a crux. I’m currently at like ~5% on this claim (given my understanding of what you mean), but moving to 15% or even 50% (while keeping the rest of the distribution the same) wouldn’t really change my strategic orientation. Maybe you’re focused on getting to a world with a more acceptable level of risk (e.g., <5%), but I think going from 40% risk to 20% risk is better to focus on.
I think you kinda convinced me here that this reasoning isn’t (as stated) very persuasive.
I think my reasoning had some additional steps like:
when I’m 15% on ‘alignment might be philosophically hard’, I still expect to maybe learn more and update to 90%+, and it seems better to pursue strategies that don’t actively throw that world under the bus. (and, while I don’t fully understand the Realpolitik, it seems to me that Anthropic could totally be pursuing strategies that achieve a lot of its goals without Policy Comms that IMO actively torch the “long pause” worlds)
you are probably right that I was more oriented around “getting to like 5% risk” than reducing risk on the margin.
I’m probably partly just not really visualizing what it’d be like to be a 15%-er and bringing some bias in.
Not sure how interesting this is to discuss, but I don’t think I agree with this. Stuff they’re doing does seem harmful to worlds where you need a long pause, but it feels like, at the very least, Anthropic is a small fraction of the torching, right? Like if you think Anthropic is making this less likely, surely they are a small fraction of the people pushing in this direction, such that they aren’t making this that much worse (and can probably still pivot later given what they’ve said so far).