Yes. When we take convex combinations of finitely many point-mass measures, the integral reduces to a finite sum. I use finite sums for ease of calculation, but to prove theorems I should use measures for full generality.
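As a concrete sketch (my own toy example; the helper name is hypothetical), the integral against a convex combination of point masses is just the corresponding weighted sum:

```python
# Integrating f against mu = sum_i w_i * delta_{x_i}: the integral is the
# weighted sum sum_i w_i * f(x_i), where the weights w_i form a convex
# combination (nonnegative and summing to 1).
def integrate_discrete(f, points, weights):
    assert abs(sum(weights) - 1.0) < 1e-12 and min(weights) >= 0
    return sum(w * f(x) for w, x in zip(weights, points))

# Example: f(x) = x^2 against mu = 0.5*delta_1 + 0.5*delta_3.
val = integrate_discrete(lambda x: x ** 2, [1.0, 3.0], [0.5, 0.5])
print(val)  # 0.5*1 + 0.5*9 = 5.0
```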
The idea of finding an object A together with distinct local optima GA(x1),…,GA(xn), with n maximized, looks like an interesting problem to work on. I have not worked with this kind of objective before, but I can certainly try it, as I have a few ideas for how to proceed. This may work better for discrete optimization problems, though, since I cannot think of a good way to use gradient updates to produce new local optima; in that case I would need to use evolutionary computation or hill climbing instead. I do not expect this to yield natural-looking objects A, however, so I doubt I can learn much from this endeavor.
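A minimal hill-climbing loop of the sort I have in mind for the discrete case might look like the following. The fitness here is a hypothetical stand-in of my own (counting distinct local maxima of an integer sequence encoding A); in the real problem the fitness would count distinct local optima GA(xi).

```python
import random

random.seed(0)

def num_local_maxima(A):
    """Count the distinct values A[i] that are strict local maxima."""
    return len({A[i] for i in range(1, len(A) - 1)
                if A[i - 1] < A[i] > A[i + 1]})

def hill_climb(length=30, steps=2000):
    # Start from a random integer sequence and repeatedly mutate one entry,
    # keeping the mutation whenever it does not decrease the fitness.
    A = [random.randrange(10) for _ in range(length)]
    best = num_local_maxima(A)
    for _ in range(steps):
        B = A[:]
        B[random.randrange(length)] = random.randrange(10)
        fb = num_local_maxima(B)
        if fb >= best:
            A, best = B, fb
    return A, best

A, best = hill_climb()
print(best)
```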
I have not thought much about finding measures μ where Fμ,n,∥∗∥ has many local maxima because I have many higher priorities. These days, people are focused on more complicated machine learning systems such as large language models, and in order to catch up, I also need to increase the performance, capabilities, and efficiency of my pseudodeterministic machine learning models. For the more complicated multi-layered models, it seems more difficult to obtain and retain pseudodeterminism. Pseudodeterminism is a robust property for simple objective functions, such as when we are training a linear model or performing convex optimization, but it becomes increasingly fragile as we increase the sophistication of our objective functions. This means that for the sophisticated models I want to focus on, pseudodeterminism is trivial to violate and difficult to retain.
I am not at all worried about encountering a strange case of non-pseudodeterminism when optimizing Fμ,n,∥∗∥ for measures I have not yet considered, since this problem is not even close to being non-pseudodeterministic. For example, suppose E0,F0 are norm-1 completely positive superoperators of the same Choi rank ≤N, that (xn,yn)∈(U∖{0})2 for all n, and that (En)n,(Fn)n are sequences where
En+1 is obtained by moving from En in the direction of the gradient (possibly with momentum) of log(|⟨Enxn,yn⟩|2)−log(∥En∥), and Fn+1 is obtained from Fn in the same way with the same learning rate. Then my experiments show that limn→∞∥En−Fn∥=0 regardless of the choice of (xn,yn). In other words, even if (En)n does not converge, the sequences (En)n and (Fn)n uniformly approximate each other as n→∞. This is a much stronger form of pseudodeterminism that is hard to violate, so it is not a high priority to hunt for particular instances of non-pseudodeterminism, especially if those instances do not coincide with real-world data.
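A simplified sketch of this experiment follows. The simplifications are mine, not from the setup above: real d×d matrices stand in for completely positive superoperators, ∥∗∥ is the Frobenius norm, the ascent uses no momentum, and the data are entrywise positive so that ⟨Enxn,yn⟩ stays bounded away from zero.

```python
import numpy as np

rng = np.random.default_rng(0)
d, steps, lr = 4, 500, 0.01

def step(E, x, y):
    """One gradient-ascent step on log(|<E x, y>|^2) - log(||E||_F)."""
    s = y @ E @ x                        # <E x, y> (real inner product)
    grad = (2.0 / s) * np.outer(y, x)    # gradient of log(s^2)
    grad -= E / np.linalg.norm(E) ** 2   # gradient of -log(||E||_F)
    return E + lr * grad

# Two different entrywise-positive initializations of Frobenius norm 1.
E = rng.uniform(0.5, 1.5, (d, d)); E /= np.linalg.norm(E)
F = rng.uniform(0.5, 1.5, (d, d)); F /= np.linalg.norm(F)

gaps = []
for _ in range(steps):
    x = rng.uniform(0.0, 1.0, d)         # shared data stream (x_n, y_n)
    y = rng.uniform(0.0, 1.0, d)
    E, F = step(E, x, y), step(F, x, y)  # same update, same rate
    gaps.append(np.linalg.norm(E - F))

print(gaps[0], gaps[-1])  # the gap ||E_n - F_n|| should shrink
```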
I expect the fitness function Fμ,n,∥∗∥ to have just one or a few local maxima, for several reasons. Its closest relatives are the linear models, and those are obtained by optimizing an objective function with a single local optimum. Fμ,n,∥∗∥ is also similar to many other objective functions that I have constructed, each with one or a few local optima, and since Fμ,n,∥∗∥ is simpler than those, it should have at least as few. Moreover, the function Fμ is concave, so it has only one local maximum value on any convex set Q (convex sets of interest include the set of all quantum channels and the set of all unital channels); if E1,E2∈Q were local maxima with Fμ(E1)<Fμ(E2), then concavity along the segment from E1 to E2 would contradict the local maximality of E1. Restricting attention to completely positive operators of low Choi rank in the boundary of the unit ball means that when we maximize Fμ,n,∥∗∥ we can no longer use convexity to prove that there is only one local maximum, but convexity still suggests that there should be just one, especially when n is large. When n is small, we cannot draw conclusions from convexity: I did a Hessian calculation, and the Hessian of F(μ,Φ(A1,…,Ar)) with respect to (A1,…,Ar) generally has plenty of both positive and negative eigenvalues. I do not consider it a major problem if Fμ,n,∥∗∥ has multiple local maxima, since that probably just means we need to increase n until these local maxima merge.
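The concavity argument can be checked numerically on a toy example (my own, not Fμ itself): projected gradient ascent on a concave function over a convex set reaches the same maximum value from every starting point.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
v = rng.normal(size=d)
A = 2.0 * v / np.linalg.norm(v)    # target of norm 2, outside the unit ball

def ascend(x, steps=2000, lr=0.1):
    """Maximize the concave f(x) = -||x - A||^2 over the unit ball."""
    for _ in range(steps):
        x = x + lr * 2.0 * (A - x)  # gradient step
        n = np.linalg.norm(x)
        if n > 1.0:                 # project back onto the unit ball
            x = x / n
    return -np.linalg.norm(x - A) ** 2

# Different starting points all reach the unique maximum value
# -(||A|| - 1)^2 = -1, attained at the boundary point A / ||A||.
vals = [ascend(rng.normal(size=d)) for _ in range(3)]
print(vals)
```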