Great stuff! Yeah, if the AI really understands your preference not to be turned into a highly competent zombie (i.e., low R(s_t, a) for world-histories where you are coerced into following π_max), then you don’t get turned into a highly competent zombie. But this is another one of those cases where the AI’s search algorithm is directly searching for a loophole in its model of your preferences, and you’re just hoping that the search fails.
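A minimal sketch of the worry, with entirely hypothetical names: the learned preference model penalizes every history it recognizes as coercion, but an optimizer maximizing that model gravitates to exactly the cases the model fails to recognize.

```python
# Toy illustration (hypothetical setup): a search over candidate
# world-histories, scored by a learned stand-in for human preferences.
# The model penalizes histories it labels as coercion, but its notion
# of "coercion" has a gap -- and the argmax lands squarely in that gap.

def r_hat(history):
    """Learned proxy for the human's reward over world-histories."""
    reward = history["competence"]          # proxy: competent outcomes score high
    if "coerced" in history["labels"]:      # known-bad case: explicit coercion
        reward -= 100.0
    # Blind spot: manipulation the model never learned to label as
    # coercion receives no penalty at all.
    return reward

candidates = [
    {"name": "autonomous",  "competence": 5.0, "labels": {"autonomous"}},
    {"name": "coerced",     "competence": 9.0, "labels": {"coerced"}},
    {"name": "manipulated", "competence": 9.0, "labels": {"manipulated"}},
]

best = max(candidates, key=r_hat)
print(best["name"])
```

The search selects the "manipulated" history: maximal competence, no penalty, even though the human's actual preferences would rank it alongside outright coercion. The point isn't the toy scoring function; it's that the optimizer's selection pressure is aimed precisely at wherever the model and the true preferences diverge.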