I particularly like your “Logical vs. physical risk aversion” distinction, and agree that we should prioritize reducing logical risk. I think acausal trade makes this especially concrete. If we make a misaligned superintelligence that “plays nice” in the acausal bargaining community, I’d think that’s better than making an aligned superintelligence that doesn’t, because overall it matters far more that the community is nice than that it has a high population of people with our values.
I also really like your point about how providing evidence that AI safety is difficult may be one of the most important reasons to do AI safety research. I guess I’d like to see some empirically grounded analysis of how likely it is that the relevant policymakers and other decision-makers will be swayed by such evidence. So far it seems like they’ve been swayed by direct arguments that the problem is hard, and not so much by our failures to make progress. If anything, the failure of AI safety researchers to make progress seems to encourage their critics.