jessicata comments on Things that can go wrong with decomposed modules

jessicata 16 Feb 2016 21:56 UTC
See this LessWrong thread for another type of problem with some of these adversarial approaches (including the “meeting halfway” proposal).

Roughly, the problem is that if one system generates a long string and another system checks it, the first system can easily change the string in an important way without the second system noticing (e.g. by using steganography). This seems like an important problem to think about for these adversarial-type approaches.
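
To make the steganography worry concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the names `embed`, `extract`, and `naive_checker` are hypothetical, the hidden channel (one space vs. two spaces between words) is deliberately crude, and the checker is a stand-in for any check that only looks at the "meaningful" content of the string. It is not a model of any particular proposal, just one way a string can pass a check while carrying extra information.

```python
import re


def embed(cover_text: str, bits: str) -> str:
    """Hide `bits` in the gaps between words: ' ' encodes 0, '  ' encodes 1."""
    words = cover_text.split()
    gaps = len(words) - 1
    if len(bits) > gaps:
        raise ValueError("cover text too short for payload")
    padded = bits.ljust(gaps, "0")  # unused gaps stay as single spaces
    out = [words[0]]
    for word, bit in zip(words[1:], padded):
        out.append(("  " if bit == "1" else " ") + word)
    return "".join(out)


def extract(text: str, n_bits: int) -> str:
    """Recover the first `n_bits` hidden bits from the gap widths."""
    gaps = re.findall(r" +", text)
    return "".join("1" if len(g) == 2 else "0" for g in gaps[:n_bits])


def naive_checker(text: str, expected: str) -> bool:
    """Hypothetical checker: accepts if the sequence of words is as expected."""
    return text.split() == expected.split()


cover = "the plan is safe and has been reviewed by the whole team"
payload = "1011001"
stego = embed(cover, payload)

print(naive_checker(stego, cover))   # the checker accepts the altered string
print(stego != cover)                # yet the string really was changed
print(extract(stego, len(payload)))  # and the hidden payload is recoverable
```

A checker that normalized whitespace would close this particular channel, but the generator could move to another one (word choice, punctuation, ordering), which is why the problem gets harder as the string gets longer.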