That just pushes the problem back a step. IF the Last Judge can't be mistaken about the results of the AI running AND the Last Judge is willing to sacrifice the utility of the mass of humanity (including hirself) to protect one or more people from being tortured, then it's safe. That's very far from saying the probability of disaster is zero.
IF … the Last Judge is willing to sacrifice the utility of the mass of humanity (including hirself) to protect one or more people from being tortured, then it’s safe.
If the Last Judge peeks at the output and finds that it's going to decide to torture people, that doesn't imply abandoning FAI; it just requires fixing the bug and trying again.