they may have gone this way because it turns out that taking the derivative of a matrix logarithm, without certain guarantees that the matrix commutes with its own differential, is really, really hard. Which, to be fair, isn’t a good reason per se, but yeah.
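(For context, and this is a standard identity rather than anything from the paper: for positive definite $A$, the derivative of the matrix logarithm in the direction $H$ is

$$
D\log(A)[H] \;=\; \int_0^{\infty} (A + sI)^{-1}\, H\, (A + sI)^{-1}\, \mathrm{d}s ,
$$

which collapses to the familiar $A^{-1}H$ only when $[A,H]=0$. Otherwise you're stuck with the integral, or an eigenbasis expansion, which is what makes differentiating terms like $\mathrm{Tr}[\rho\log\sigma]$ with respect to $\sigma$ so painful.)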
Also, the paper mentions that
the Kullback–Leibler divergence [7, 10], other f-divergences including Pearson divergence and Hellinger distance [34], zero-one loss [35], or the mean-square error of an estimation [36, 37]
and looking at it, the quantum fidelity reduces to one minus the Hellinger distance squared (https://en.wikipedia.org/wiki/Hellinger_distance):
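Spelling that out in the commuting (i.e. classical) case, where $\rho$ and $\sigma$ share an eigenbasis with eigenvalues $p_i$ and $q_i$, and using the square-root convention $F(\rho,\sigma) = \mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}$ (some authors use the square of this):

$$
F(\rho,\sigma) \;=\; \sum_i \sqrt{p_i q_i} \;=\; 1 - H^2(P,Q),
\qquad\text{since}\qquad
H^2(P,Q) \;=\; 1 - \sum_i \sqrt{p_i q_i}.
$$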
So in theory it’s no worse or better than picking the K-L divergence, since all of these seem like valid starting points; however, it makes sense that this might be worth some further questioning.
EDIT: in addition, due to the nature of the matrix logarithm, the quantum K-L divergence has some serious drawbacks. It basically inherits the same issue as the classical one: if Q(x,y) (the distribution in the denominator) is ever zero where P isn’t, the divergence goes to infinity. In quantum terms, that happens whenever σ has a zero eigenvalue in a direction where ρ has support (i.e. the support of ρ is not contained in the support of σ). So I think it’s possible that they saw this as simply not well-behaved enough to be worth using.
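Just as a quick numerical sanity check (my own sketch, not from the paper), here's the quantum relative entropy $S(\rho\|\sigma) = \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$ blowing up as an eigenvalue of σ that ρ overlaps with goes to zero:

```python
import numpy as np
from scipy.linalg import logm

# Quantum relative entropy S(rho || sigma) = Tr[rho (log rho - log sigma)].
# rho has support on both basis states, so pushing one eigenvalue of sigma
# towards zero breaks supp(rho) <= supp(sigma) and S diverges.
rho = np.diag([0.5, 0.5])

for eps in [1e-1, 1e-3, 1e-6, 1e-9]:
    sigma = np.diag([1.0 - eps, eps])          # second eigenvalue -> 0
    S = np.trace(rho @ (logm(rho) - logm(sigma))).real
    print(f"eps = {eps:.0e}   S(rho||sigma) = {S:.3f}")

# S grows like -(1/2) * log(eps), i.e. it diverges as eps -> 0,
# mirroring the classical blow-up of p*log(p/q) when q -> 0 with p > 0.
```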