Quoting from the paper: "Fidelity is one of the most natural measures of the closeness between quantum states and has found countless applications in quantum information theory."
I agree that this sort of quantum relative entropy should also be doable. It’s possible that the result would be the same. I guess an easy check would be to perturb the posterior and see whether this measure also has a minimum around the same point.
Yeah, that was about the only sentence I read in the paper. I was wondering if you’d seen a theoretical justification (logos) rather than just an ethical appeal (ethos), but didn’t want to comb through the maths myself. By the way, fidelity won’t give the same posterior. I haven’t worked through the maths whatsoever, but I’d still put >95% probability on this claim.
So to add on this: they may have chosen this way because it turns out that taking the derivative of a matrix logarithm, without some guarantee that the matrix commutes with its own differential, is really, really hard. Which, to be fair, isn’t a good reason per se, but yeah.
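To make the difficulty with the matrix-logarithm derivative concrete, here is the standard integral form of the differential, written as a sketch. The symbols A (a positive definite matrix), dA (its differential) and I (the identity) are my own notation, not the paper's.

```latex
% Differential of the matrix logarithm for positive definite A:
\mathrm{d}\log A \;=\; \int_0^\infty (A + sI)^{-1}\,\mathrm{d}A\,(A + sI)^{-1}\,\mathrm{d}s .

% Only when [A, \mathrm{d}A] = 0 can the two resolvents be merged:
\mathrm{d}\log A \;=\; \left(\int_0^\infty (A + sI)^{-2}\,\mathrm{d}s\right)\mathrm{d}A \;=\; A^{-1}\,\mathrm{d}A .
```

Without commutativity, dA stays sandwiched between the two resolvents and the familiar "one over A" rule no longer applies, which is presumably what makes the variational calculation painful.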
Also, the paper mentions that
"the Kullback–Leibler divergence [7, 10], other f-divergences including Pearson divergence and Hellinger distance [34], zero-one loss [35], or the mean-square error of an estimation [36, 37]"
and looking at it, in the classical (commuting) case the quantum fidelity reduces to one minus the Hellinger distance squared (https://en.wikipedia.org/wiki/Hellinger_distance).
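Spelled out, the reduction looks roughly like this. I am assuming the root-fidelity convention (no outer square) and diagonal states ρ = diag(p), σ = diag(q); with the squared convention the right-hand side would be squared as well.

```latex
% Quantum fidelity (root convention) and its value on commuting states:
F(\rho,\sigma) \;=\; \operatorname{tr}\sqrt{\sqrt{\rho}\,\sigma\,\sqrt{\rho}}
\;\;\xrightarrow{\;\rho=\mathrm{diag}(p),\;\sigma=\mathrm{diag}(q)\;}\;\;
\sum_i \sqrt{p_i q_i} .

% Squared Hellinger distance between the same distributions:
H^2(p,q) \;=\; \tfrac{1}{2}\sum_i \bigl(\sqrt{p_i}-\sqrt{q_i}\bigr)^2
\;=\; 1 - \sum_i \sqrt{p_i q_i} .

% Hence, for commuting states:
F(\rho,\sigma) \;=\; 1 - H^2(p,q) .
```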
So in theory it’s no worse or better than picking the K-L divergence, since they all seem like valid starting points; however, it makes sense that this might be worth some further questioning.
EDIT: in addition, due to the nature of the matrix logarithm, the quantum K-L divergence has some serious drawbacks. It’s basically the equivalent of the classical one, actually: if Q(x,y) (the distribution in the denominator) is ever zero where P(x,y) (the one in the numerator) isn’t, the divergence goes to infinity. In quantum terms, that’s if σ has a zero eigenvalue in a direction where ρ has support. So I think it’s possible that they saw this as simply not well-behaved enough to be worth using.
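A quick numerical sketch of that blow-up, restricted to the commuting case so that plain probability vectors stand in for the eigenvalues of ρ and σ. The function names `relative_entropy` and `root_fidelity` are my own, not from the paper, and the fidelity uses the root convention (no outer square).

```python
import math


def relative_entropy(p, q):
    """D(p || q) = sum_i p_i (log p_i - log q_i), with 0 log 0 taken as 0.

    For diagonal states rho = diag(p), sigma = diag(q) this equals the
    quantum relative entropy tr[rho (log rho - log sigma)].
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 log 0 convention
        if qi == 0.0:
            return math.inf  # p has weight where q has none
        total += pi * (math.log(pi) - math.log(qi))
    return total


def root_fidelity(p, q):
    """Fidelity of diagonal states in the root convention: sum_i sqrt(p_i q_i)."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))


p = [0.5, 0.5]
for eps in (1e-1, 1e-3, 1e-6, 0.0):
    q = [1.0 - eps, eps]  # one eigenvalue of sigma shrinking to zero
    print(eps, relative_entropy(p, q), root_fidelity(p, q))
```

As eps shrinks, the divergence grows without bound and becomes infinite at eps = 0, while the fidelity settles at a finite value. This only illustrates the commuting case; it says nothing about which measure yields the nicer posterior.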
No, I don’t think there’s anything like that. I do wonder about deriving the same result for the divergence. I have no idea how hard that would be; it might even be quite easy. Possibly it even reduces to something more Bayes-like in the case of commuting operators. I’ll try.