Actually I’m not sure exactly what you mean by importance sampling here.
The variational lower bound would be to draw samples from q and compute log(p/q). The log probability of the output under p is bounded below by the expectation of this quantity (with equality iff q is the correct conditional distribution over A).
I’m just going to work with this in my other comments; I assume it amounts to the same thing.
What I mean is: compute $\hat{E}_q\big[[f(A)=x]\,p(A)/q(A)\big]$, which is a probabilistic lower bound on $P_p(f(A)=x)$.
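The variational score gives you a somewhat worse lower bound if q is different from p(A|f(A)=x). Due to Jensen’s inequality, $E_q\big[\log([f(A)=x]\,p(A)/q(A))\big] \le \log E_q\big[[f(A)=x]\,p(A)/q(A)\big] = \log P_p(f(A)=x)$.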
It probably doesn’t make a huge difference either way.
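To make the comparison concrete, here is a minimal sketch on a made-up toy model (A uniform on {0,1,2,3}, f(A) = A mod 2, x = 0, so the true probability is 0.5). The names `importance_estimate`, `variational_bound`, `q_skewed` and `q_exact` are all hypothetical, chosen only for illustration. The variational bound is computed exactly rather than sampled, to keep the gap easy to see.

```python
import math
import random

# Hypothetical toy model (an assumption for illustration, not from the thread):
# A is uniform on {0, 1, 2, 3}, f(A) = A mod 2, and we observe x = 0.
p = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
x = 0


def f(a):
    return a % 2


def importance_estimate(q, n, rng):
    """Sample mean of [f(A)=x] * p(A)/q(A) with A drawn from q."""
    support = list(q)
    weights = [q[a] for a in support]
    samples = rng.choices(support, weights=weights, k=n)
    return sum((f(a) == x) * p[a] / q[a] for a in samples) / n


def variational_bound(q):
    """Exact E_q[log([f(A)=x] * p(A)/q(A))]; -inf if q leaks mass onto f(A) != x."""
    total = 0.0
    for a, qa in q.items():
        if qa == 0:
            continue
        if f(a) != x:
            return -math.inf
        total += qa * math.log(p[a] / qa)
    return total


true_log_prob = math.log(sum(pa for a, pa in p.items() if f(a) == x))

# q_exact is the true conditional p(A | f(A)=x); q_skewed is a mismatched proposal.
q_exact = {0: 0.5, 2: 0.5}
q_skewed = {0: 0.8, 2: 0.2}

rng = random.Random(0)
is_estimate = importance_estimate(q_skewed, 20000, rng)

print("log P_p(f(A)=x):            ", true_log_prob)
print("log importance estimate:    ", math.log(is_estimate))
print("variational bound, q_skewed:", variational_bound(q_skewed))
print("variational bound, q_exact: ", variational_bound(q_exact))
```

In this toy case the importance-weighted estimate stays close to 0.5 even with the skewed q, while the variational bound falls short of the true log probability by KL(q_skewed || q_exact), which is about 0.19 nats, and is tight only for q_exact.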