gwern comments on Meta learning to gradient hack