Additional note: I think you can implement the approach I sketched (which I’m calling “reversible probabilistic programming”) using autoencoders. Represent each layer of the autoencoder by a single-layer neural net `f : a -> Distr b` and an approximate inverse `g : b -> Distr a`. Given a distribution for the previous layer, `x : Distr a`, get the distribution for the next layer by taking `Unsamp g (InvFmap (\(x, y) -> (y, x)) (Samp f x)) :: Distr b`. Compose many of these layers to get a multi-layer generative model. This seems simple enough that there is probably a more direct way to estimate the entropy of a generative model represented by an autoencoder.
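
The layer construction above can be sketched in (hypothetical) Haskell. `Distr`, `Samp`, `InvFmap`, and `Unsamp` are assumed constructors from the reversible-probabilistic-programming sketch, not a real library; the point is just that the types line up: `Samp` keeps the input alongside the sample, `InvFmap` applies an invertible map (so the swap is entropy-preserving), and `Unsamp` discards the input coordinate using the approximate inverse `g`.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical encoding of the reversible-probabilistic-programming terms.
-- Only the types matter here; a real interpreter would also track entropy
-- (Samp adds it, Unsamp subtracts it, InvFmap leaves it unchanged).
data Distr a where
  Ret     :: a -> Distr a                           -- placeholder primitive
  Samp    :: (a -> Distr b) -> Distr a -> Distr (a, b)
  InvFmap :: (a -> b) -> Distr a -> Distr b         -- argument must be invertible
  Unsamp  :: (b -> Distr a) -> Distr (a, b) -> Distr b

-- One autoencoder layer: sample the next layer from f, swap the pair,
-- then remove the previous-layer coordinate via the approximate inverse g.
layer :: (a -> Distr b) -> (b -> Distr a) -> Distr a -> Distr b
layer f g x = Unsamp g (InvFmap (\(a, b) -> (b, a)) (Samp f x))
```

A multi-layer model is then just `layer f2 g2 (layer f1 g1 x0)`, and so on.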

(actually, I think this might not work, because the inverses won’t be very accurate, but maybe something like this works?)
