# paulfchristiano comments on A possible training procedure for human-imitators

• This looks a lot like a variational autoencoder, unless I’m missing some distinction. (Note that once you have a bit-by-bit predictor, you can assume WLOG that the distribution on A is uniform.) Thinking about how variational autoencoders work in the superintelligent case seems worthwhile if we want to scale up something like imitation. I also wouldn’t be that surprised if it produced some practically useful insight.

(Incidentally, I think that variational autoencoders and generative adversarial nets are the leading approaches to generative modeling right now.)

I agree that the steganography problem looks kind of bad for adversarial methods in the limit. For coping with the analogous problem with approval-maximization, I think the best bet is to try to make the generative model state transparent to the discriminative model. But this is obviously not going to work for generative adversarial models, since access to the generator state would make distinguishing trivial.

• Actually I’m not sure exactly what you mean by importance sampling here.

The variational lower bound would be to draw samples $a$ from $Q(\cdot \mid b)$ and compute $\log \frac{P(a, b)}{Q(a \mid b)}$. The log probability of the output $b$ under $P$ is bounded below by the expectation of this quantity (with equality iff $Q(\cdot \mid b)$ is the correct conditional distribution over $a$).

I’m just going to work with this in my other comments; I assume it amounts to the same thing.
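For reference, here is a sketch of why that bound holds. The notation is an assumption, since the thread's own symbols are not reproduced here: $P$ is the generative model, $a$ the latent random bits, $b$ the observed output, and $Q(\cdot \mid b)$ the distribution the samples are drawn from.

```latex
\begin{aligned}
\mathbb{E}_{a \sim Q(\cdot \mid b)}\left[\log \frac{P(a, b)}{Q(a \mid b)}\right]
&= \mathbb{E}_{a \sim Q(\cdot \mid b)}\left[\log P(b) + \log \frac{P(a \mid b)}{Q(a \mid b)}\right] \\
&= \log P(b) - \mathrm{KL}\left(Q(\cdot \mid b) \,\|\, P(\cdot \mid b)\right) \\
&\le \log P(b).
\end{aligned}
```

The gap is exactly the KL divergence from $Q(\cdot \mid b)$ to the true conditional $P(\cdot \mid b)$, which is zero iff the two agree.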

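• What I mean is: compute $\log \left( \frac{1}{n} \sum_{i=1}^{n} \frac{P(a_i, b)}{Q(a_i \mid b)} \right)$ with samples $a_i$ drawn from $Q(\cdot \mid b)$, which is a probabilistic lower bound on $\log P(b)$.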

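The variational score gives you a somewhat worse lower bound if the proposal $Q(\cdot \mid b)$ is different from the true conditional $P(\cdot \mid b)$. Due to Jensen’s inequality,

$$\mathbb{E}_{a \sim Q(\cdot \mid b)}\left[\log \frac{P(a, b)}{Q(a \mid b)}\right] \le \log \mathbb{E}_{a \sim Q(\cdot \mid b)}\left[\frac{P(a, b)}{Q(a \mid b)}\right] = \log P(b).$$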

It probably doesn’t make a huge difference either way.
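To make the comparison concrete, here is a minimal numeric sketch on a toy discrete model. Everything in it is an assumption for illustration: the joint probabilities are made up, the names `joint`, `proposal`, `variational_score`, and `importance_score` are hypothetical, and the latent space is kept tiny so that $\log P(b)$ can be computed exactly for reference.

```python
import math
import random

# Hypothetical toy model: latent a in {0, 1, 2, 3}, one fixed output b.
# joint[a] = P(a, b); the numbers are made up for illustration.
joint = [0.10, 0.05, 0.02, 0.03]
log_p_b = math.log(sum(joint))  # exact log P(b), tractable only because the toy is tiny


def variational_score(q):
    """Exact E_{a~q}[log(P(a, b) / q(a))], i.e. the variational lower bound."""
    return sum(qa * math.log(pa / qa) for pa, qa in zip(joint, q) if qa > 0)


def importance_score(q, n, rng):
    """Log of the mean importance weight P(a, b) / q(a) over n samples a ~ q."""
    samples = rng.choices(range(len(joint)), weights=q, k=n)
    mean_weight = sum(joint[a] / q[a] for a in samples) / n
    return math.log(mean_weight)


posterior = [pa / sum(joint) for pa in joint]  # the correct conditional P(a | b)
proposal = [0.25, 0.25, 0.25, 0.25]            # a deliberately mismatched Q(a | b)

rng = random.Random(0)
print("log P(b):", log_p_b)
print("variational, mismatched Q:", variational_score(proposal))
print("variational, exact posterior:", variational_score(posterior))
print("importance (n=1000), mismatched Q:", importance_score(proposal, 1000, rng))
```

With the mismatched proposal, the variational score sits strictly below $\log P(b)$, while the importance-sampling score approaches $\log P(b)$ as the number of samples grows. With the exact posterior as proposal, every importance weight equals $P(b)$ and the two scores coincide.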