Actually I’m not sure exactly what you mean by importance sampling here.

The variational lower bound would be to draw samples from q and compute log(p(A)/q(A)). The log probability of the output under p is bounded from below by the expectation of this quantity (with equality iff q is the correct conditional distribution over A).
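A minimal sketch of that bound, under toy assumptions (the discrete p, the function f, the target x, and the proposal q below are all illustrative, not from the discussion; q is supported only on the event f(A) = x so the log terms stay finite):

```python
import math
import random

# Toy model (assumed for illustration): A ~ p over {0,1,2,3},
# f(A) = A mod 2, and the observed output is x = 0.
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def f(a):
    return a % 2

x = 0

# Proposal q over A, supported only where f(a) == x.
q = {0: 0.5, 2: 0.5}

random.seed(0)
draws = random.choices(list(q), weights=list(q.values()), k=50_000)

# Variational lower bound: E_q[log(p(A)/q(A))] <= log P_p(f(A) = x),
# with equality iff q(a) = p(a | f(a) = x).
elbo = sum(math.log(p[a] / q[a]) for a in draws) / len(draws)

# Exact value for comparison: log P_p(f(A) = 0) = log(p[0] + p[2]).
log_p_true = math.log(p[0] + p[2])
```

With this q the bound is strictly loose, since q is not the true conditional p(A | f(A) = 0).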

I’m just going to work with this in my other comments; I assume it amounts to the same thing.

What I mean is: compute the empirical average Ê_q[1[f(A)=x] · p(A)/q(A)], which is a probabilistic lower bound on P_p(f(A)=x).
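A sketch of that importance-sampling estimator under the same kind of toy assumptions (p, f, x, and q here are illustrative; note q need not be supported only on the event, since samples outside it just get weight zero):

```python
import random

# Toy model (assumed for illustration): A ~ p over {0,1,2,3},
# f(A) = A mod 2, observed output x = 0, so P_p(f(A)=x) = 0.1 + 0.3.
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def f(a):
    return a % 2

x = 0

# A broad proposal q over all of A; off-event samples contribute 0.
q = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

random.seed(0)
draws = random.choices(list(q), weights=list(q.values()), k=100_000)

# Ê_q[1[f(A)=x] * p(A)/q(A)] -- an unbiased estimate of P_p(f(A)=x).
est = sum((f(a) == x) * p[a] / q[a] for a in draws) / len(draws)
```

The estimate is unbiased for any q that covers the event, but its variance blows up as q diverges from p(A | f(A)=x).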

The variational score gives you a somewhat worse lower bound if q is different from p(A | f(A)=x). Due to Jensen’s inequality,
E_q[log(1[f(A)=x] · p(A)/q(A))] ≤ log E_q[1[f(A)=x] · p(A)/q(A)] ≤ log P_p(f(A)=x)
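A quick numeric check of this chain, again with toy assumptions (p, f, x, and q are illustrative; q is supported on the event but deliberately different from the true conditional, so the gap is visible):

```python
import math
import random

# Toy model (assumed): A ~ p over {0,1,2,3}, f(A) = A mod 2, x = 0.
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def f(a):
    return a % 2

x = 0

# q covers the event {0, 2} but is skewed away from p(A | f(A)=x).
q = {0: 0.8, 2: 0.2}

random.seed(0)
draws = random.choices(list(q), weights=list(q.values()), k=100_000)
w = [p[a] / q[a] for a in draws]  # indicator is 1 on q's support here

elbo = sum(math.log(wi) for wi in w) / len(w)  # variational score
log_is = math.log(sum(w) / len(w))             # log of the IS estimate
log_pt = math.log(p[0] + p[2])                 # exact log P_p(f(A)=x)
# elbo < log_is ≈ log_pt, matching the chain of inequalities above.
```

The middle quantity is the log of an unbiased estimate, so it concentrates on log P_p(f(A)=x), while the variational score stays strictly below it by the KL divergence between q and the true conditional.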

It probably doesn’t make a huge difference either way.