we should see our odds of alignment as being close to the knife's edge, because those are the situations whose outcomes require the most computation-heavy simulations to determine
No, because “successfully aligned” is a value-laden category. We could be worth simulating if our success probability is close to zero, but there’s a lot of uncertainty over which unaligned-with-us superintelligence we create.
oh, you’re absolutely right — thanks for pointing this out.