paulfchristiano comments on A possible training procedure for human-imitators

paulfchristiano 27 Feb 2016 0:57 UTC
This looks a lot like a variational autoencoder, unless I’m missing some distinction. (Note that once you have a bit-by-bit predictor, you can assume WLOG that the distribution on A is uniform.) Thinking about how variational autoencoders work in the superintelligent case seems worthwhile if we want to scale up something like imitation. I also wouldn’t be that surprised if it produced some practically useful insight.

(Incidentally, I think that variational autoencoders and generative adversarial nets are the leading approaches to generative modeling right now.)

I agree that the steganography problem looks kind of bad for adversarial methods in the limit. For coping with the analogous problem with approval-maximization, I think the best bet is to try to make the generative model state transparent to the discriminative model. But this is obviously not going to work for generative adversarial models, since access to the generator state would make distinguishing trivial.
- paulfchristiano 18 Mar 2016 4:03 UTC
  Actually I’m not sure exactly what you mean by importance sampling here.
  
  The variational lower bound would be to draw samples from $q$ and compute $\log (p / q)$. The log probability of the output under $p$ is bounded below by the expectation of this quantity (with equality iff $q$ is the correct conditional distribution over $A$).
  
  I’m just going to work with this in my other comments; I assume it amounts to the same thing.
  - jessicata 19 Mar 2016 19:47 UTC
    What I mean is: compute $E_{q} [[f (A) = x] \, p (A) / q (A)]$, which is a probabilistic lower bound on $P_{p} (f (A) = x)$.
    
    The variational score gives you a somewhat worse lower bound if $q$ is different from $p (A | f (A) = x)$. Due to Jensen’s inequality, $E_{q} [\log ([f (A) = x] \, p (A) / q (A))] \leq \log E_{q} [[f (A) = x] \, p (A) / q (A)] \leq \log P_{p} (f (A) = x)$.
    
    It probably doesn’t make a huge difference either way.
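
The chain of inequalities above can be checked numerically. The following is only a minimal sketch under a toy setup that does not appear in the thread: $A$ is uniform over the eight 3-bit values, $f$ is parity, $x = 1$, and the three proposal distributions `q_exact`, `q_skewed`, and `q_narrow` are hypothetical examples. Expectations over $q$ are computed exactly by summation rather than by sampling, so the comparison is deterministic.

```python
import math

# Toy setup (all names and values here are illustrative assumptions).
p = {a: 1 / 8 for a in range(8)}   # uniform prior over 3-bit values of A
x = 1                              # observed output


def f(a):
    """Hypothetical deterministic map from A to the output: parity of a."""
    return a % 2


def importance_estimate(q):
    """Exact value of E_q[[f(A) = x] p(A) / q(A)]."""
    return sum(qa * (p[a] / qa) for a, qa in q.items() if qa > 0 and f(a) == x)


def variational_bound(q):
    """Exact value of E_q[log([f(A) = x] p(A) / q(A))].

    Returns -inf if q puts any mass where f(A) != x, since log(0) = -inf.
    """
    total = 0.0
    for a, qa in q.items():
        if qa == 0:
            continue
        if f(a) != x:
            return -math.inf
        total += qa * math.log(p[a] / qa)
    return total


# log P_p(f(A) = x), the quantity both scores bound from below.
true_log_prob = math.log(sum(pa for a, pa in p.items() if f(a) == x))

q_exact = {a: 0.25 for a in (1, 3, 5, 7)}    # equals p(A | f(A) = x)
q_skewed = {1: 0.7, 3: 0.1, 5: 0.1, 7: 0.1}  # right support, wrong weights
q_narrow = {1: 0.5, 3: 0.5}                  # misses part of the support

for name, q in [("exact", q_exact), ("skewed", q_skewed), ("narrow", q_narrow)]:
    v = variational_bound(q)
    i = math.log(importance_estimate(q))
    print(f"{name}: variational={v:.4f}  log importance={i:.4f}  "
          f"log P={true_log_prob:.4f}")
```

In this toy case the importance-weighted quantity matches $P_{p} (f (A) = x)$ whenever $q$ covers every $a$ with $f (a) = x$, and falls below it only when $q$ misses part of that set, while the variational score is tight only when $q$ equals the conditional $p (A | f (A) = x)$.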