Roughly the problem is that, if one system is generating a long string and another system is checking it, then it’s easy for the first system to change the string in an important way without the second system noticing (e.g. using steganography). This seems like an important problem to think about in these adversarial-type approaches.
See this LessWrong thread for another type of problem with some of these adversarial approaches (including the “meeting halfway” proposal).
Roughly the problem is that, if one system is generating a long string and another system is checking it, then it’s easy for the first system to change the string in an important way without the second system noticing (e.g. using steganography). This seems like an important problem to think about in these adversarial-type approaches.