Even if shut down in particular isn’t something we want it to be indifferent to, I think being able to make an agent indifferent to something is very plausibly useful for designing it to be corrigible?
Even if shut down in particular isn’t something we want it to be indifferent to, I think being able to make an agent indifferent to something is very plausibly useful for designing it to be corrigible?