Yes. When we take convex combinations of finitely many point mass measures, the integral is just a sum. I use the sum of finitely many elements for ease of calculations, but to prove theorems, I should use measures for full generality.
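Concretely, writing $\delta_{x}$ for the point mass at $x$ (notation mine here):

$$\int f\,d\Big(\sum_{i=1}^{k}c_i\,\delta_{x_i}\Big)=\sum_{i=1}^{k}c_i\,f(x_i),\qquad c_i\ge 0,\quad\sum_{i=1}^{k}c_i=1.$$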
The idea of finding an object along with as many distinct local optima as possible looks like an interesting problem to work on. I have not worked on this kind of objective before, but I can certainly try it, as I have a few ideas for how to do so. This might work better for discrete optimization problems, though, since I cannot think of a good way to use gradient updates to produce new local optima; in that case, I would need to use either evolutionary computation or hill climbing instead, as in the sketch below. I do not think that this would result in natural-looking objects, though, so I doubt I would learn much from the endeavor.
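For instance, a minimal hill-climbing-with-restarts sketch; `fitness`, `neighbors`, and `random_point` are problem-specific stand-ins, and the bit-string example at the end is only a toy with two distinct optima:

```python
import random

def hill_climb(x, fitness, neighbors, max_steps=10_000):
    """Greedy hill climbing: move to the best neighbor until none improves."""
    for _ in range(max_steps):
        candidates = list(neighbors(x))
        if not candidates:
            return x
        best = max(candidates, key=fitness)
        if fitness(best) <= fitness(x):
            return x  # x is a local optimum
        x = best
    return x

def distinct_local_optima(fitness, neighbors, random_point, restarts=200):
    """Collect distinct local optima from random restarts (states must be hashable)."""
    return {hill_climb(random_point(), fitness, neighbors) for _ in range(restarts)}

# Toy usage: a bumpy function on bit strings of length 8 with two global optima.
f = lambda s: sum(s) + 3 * (s[0] != s[-1])
nbrs = lambda s: [s[:i] + (1 - s[i],) + s[i + 1:] for i in range(len(s))]
rand = lambda: tuple(random.randrange(2) for _ in range(8))
print(distinct_local_optima(f, nbrs, rand))
```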
I have not thought much about finding measures where the fitness function has many local maxima because I have many higher priorities. These days, people are focused on more complicated machine learning systems such as large language models, and in order to catch up, I also need to increase the performance, capabilities, and efficiency of my pseudodeterministic machine learning models. For the more complicated multi-layered models, it seems more difficult to obtain and retain pseudodeterminism. Pseudodeterminism is a robust property for simple objective functions, such as when we are training a linear model or performing convex optimization, but pseudodeterminism becomes increasingly fragile as we increase the sophistication of our objective functions. This means that it is trivial to violate pseudodeterminism for the sophisticated models that I want to work more on, but difficult to retain it.
I am not at all worried about any strange case of non-pseudodeterminism when optimizing for measures I have not thought about yet, since this problem is not even close to being non-pseudodeterministic. For example, suppose that we are optimizing over norm-$1$ completely positive superoperators of the same Choi rank, and that $(x_n)_n,(y_n)_n$ are sequences where $x_{n+1}$ is obtained by moving from $x_n$ in the direction of the gradient (with possible momentum) of the fitness function and $y_{n+1}$ is obtained from $y_n$ the same way with the same rate. Then my experiments show that $\lim_{n\to\infty}\|x_n-y_n\|=0$ regardless of what each initialization is. In other words, even if $(x_n)_n$ does not converge, the sequences uniformly approximate each other as $n\to\infty$. This is a much stronger form of pseudodeterminism that is hard to violate, so it is not a high priority to find particular instances of non-pseudodeterminism, especially if those instances do not coincide with real-world data.
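To make the claim testable, here is a minimal numpy sketch of the kind of experiment I mean; the concave quadratic $f(x)=-\|Ax-b\|^2$ is only a stand-in for the actual fitness function, and the rate and momentum values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
b = rng.normal(size=5)

# Toy concave stand-in for the fitness function: f(x) = -||Ax - b||^2.
grad = lambda x: -2 * A.T @ (A @ x - b)

def ascend(x, rate=0.01, momentum=0.9, steps=2000):
    """Gradient ascent with momentum; yields the iterate at every step."""
    v = np.zeros_like(x)
    for _ in range(steps):
        v = momentum * v + rate * grad(x)
        x = x + v
        yield x

# Two runs from independent initializations with the same rate and momentum.
for n, (x, y) in enumerate(zip(ascend(rng.normal(size=5)),
                               ascend(rng.normal(size=5)))):
    if n % 500 == 0:
        print(n, np.linalg.norm(x - y))  # the gap ||x_n - y_n|| shrinks with n
```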
I kind of expect the fitness function to have just one or a few local maxima because its closest relatives are the linear models, and those linear models are obtained by optimizing an objective function with one local optimum. I also expect it to have one or very few local maxima because it is similar to many other objective functions that I have constructed, each with one or a few local optima. And since this fitness function is simpler than the other objective functions I have looked at with few local optima, it should also have very few local optima. Moreover, the fitness function is concave, so there is only one local maximum value whenever the domain is a convex set (possible convex sets of interest include the set of all quantum channels and the set of all unital channels). The restriction of our attention to completely positive operators of low Choi rank in the boundary of the unit ball means that, when we maximize the fitness function, we cannot use convexity to prove that there is only one local maximum, but convexity still suggests that there should be just one, especially when the Choi rank is large. When the Choi rank is small, we cannot use convexity to draw conclusions, since I did a Hessian calculation, and the Hessian of the fitness function generally has plenty of both positive and negative eigenvalues. I do not consider it a major problem if the fitness function has multiple local maxima, since that probably just means that we need to increase the Choi rank until these local maxima merge.
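For completeness, the convexity argument I am gesturing at is the usual one-liner: if $f$ is concave, $K$ is convex, $x\in K$ is a local maximum, and $f(y)>f(x)$ for some $y\in K$, then for all $t\in(0,1)$,

$$f((1-t)x+ty)\ \ge\ (1-t)f(x)+t\,f(y)\ >\ f(x),$$

and the points $(1-t)x+ty$ lie in $K$ arbitrarily close to $x$ as $t\to 0$, contradicting local maximality. Hence every local maximum of a concave function on a convex set attains the global maximum value.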
I am going to apply my own dimensionality reduction algorithm to a quantum channel (or matrices) obtained from the Okubo algebra in order to demonstrate the compatibility between my dimensionality reduction and the Okubo algebra.
TL;DR version: I trained my own machine learning algorithms on Okubo algebras, and the squares of the fitness levels of the local maxima were usually either rational numbers or quadratic algebraic numbers. This suggests that my machine learning algorithm behaves mathematically.
Origin of algorithm: I originally created this dimensionality reduction algorithm to analyze the cryptographic security of block ciphers for the cryptocurrency that I created. If you want to discuss cryptocurrency technologies, please contact me privately off this site, since I really do not feel comfortable talking about that here.
After obtaining the dimensionality reduction algorithm, I noticed that such algorithms behaved mathematically for reasons that I still cannot explain, and I have concluded that such mathematical behavior is needed to construct inherently interpretable and safe machine learning algorithms. Of course, if we want inherently interpretable and safe AI, we need machine learning algorithms that can train models with many layers to solve sophisticated tasks, but I am well on my way towards creating those algorithms too, despite a complete and total lack of support.
Mathematics: The Okubo algebra[1] is a close cousin of the octonions and satisfies many properties similar to those of the octonions.
The underlying set of the Okubo algebra is the set of all $3\times 3$-complex Hermitian matrices with trace $0$. Observe that the set of all $3\times 3$-complex Hermitian matrices forms a real vector space of dimension $9$. Therefore, the Okubo algebra's underlying set has dimension $8$. Let $\mu$ be a complex number with $\mu+\overline{\mu}=1$ and $\mu\cdot\overline{\mu}=\frac{1}{3}$. Then $\mu$ is uniquely determined up to complex conjugation. The Okubo algebra is endowed with a bilinear operation $*$ defined by $x*y=\sqrt{6}\,\big(\mu\,xy+\overline{\mu}\,yx-\frac{1}{3}\operatorname{tr}(xy)I\big)$ (I scaled the operation by a constant factor so that the norm on the Okubo algebra is just the Frobenius norm). The operation $*$ satisfies the property $\|x*y\|=\|x\|\cdot\|y\|$ where $\|\cdot\|$ refers to the Frobenius norm and $x,y$ are arbitrary elements of the Okubo algebra.
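For concreteness, here is a short numpy check of the norm identity $\|x*y\|=\|x\|\cdot\|y\|$; the helper names are mine, and the $\sqrt{6}$ factor is the constant that makes the Frobenius norm multiplicative for this normalization of the product:

```python
import numpy as np

MU = (3 + 1j * np.sqrt(3)) / 6  # mu + conj(mu) = 1 and mu * conj(mu) = 1/3

def okubo(x, y):
    """Scaled Okubo product on trace-zero 3x3 Hermitian matrices."""
    return np.sqrt(6) * (MU * x @ y + np.conj(MU) * y @ x
                         - np.trace(x @ y) / 3 * np.eye(3))

def random_okubo_element(rng):
    """Random trace-zero Hermitian 3x3 matrix."""
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    h = (z + z.conj().T) / 2
    return h - np.trace(h) / 3 * np.eye(3)

rng = np.random.default_rng(0)
x, y = random_okubo_element(rng), random_okubo_element(rng)
fro = np.linalg.norm  # defaults to the Frobenius norm for matrices
print(abs(fro(okubo(x, y)) - fro(x) * fro(y)))  # ~0, i.e. ||x*y|| = ||x||.||y||
```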
Let $\iota:\mathbb{R}^8\rightarrow O$ be an isomorphism between inner product spaces, where $O$ denotes the Okubo algebra. Then define an operation $\bullet$ on $\mathbb{R}^8$ by setting $x\bullet y=\iota^{-1}(\iota(x)*\iota(y))$. Then define orthogonal matrices $A_1,\dots,A_8$ by $A_j x=e_j\bullet x$ where $(e_1,\dots,e_8)$ is the standard basis for real Euclidean space (each $A_j$ is orthogonal since $\|e_j\bullet x\|=\|e_j\|\cdot\|x\|=\|x\|$).
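Continuing the sketch above, one hypothetical way to realize $\iota$ is to Gram-Schmidt a random orthonormal basis of the trace-zero Hermitian matrices and read off the matrices of left multiplication; this reuses `okubo`, `random_okubo_element`, and `rng` from the previous block:

```python
def orthonormal_basis(rng, dim=8):
    """Gram-Schmidt an orthonormal basis of the trace-zero 3x3 Hermitian
    matrices under the real inner product <x, y> = tr(xy)."""
    basis = []
    while len(basis) < dim:
        v = random_okubo_element(rng)
        for b in basis:
            v = v - np.real(np.trace(v @ b)) * b
        n = np.linalg.norm(v)
        if n > 1e-6:  # keep only directions not already spanned
            basis.append(v / n)
    return basis

basis = orthonormal_basis(rng)
# (A_j)_{k,l} = <b_k, b_j * b_l>: the matrix of left multiplication by b_j.
A = [np.array([[np.real(np.trace(basis[k] @ okubo(basis[j], basis[l])))
                for l in range(8)] for k in range(8)]) for j in range(8)]
print(np.linalg.norm(A[0] @ A[0].T - np.eye(8)))  # ~0: A_0 is orthogonal
```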
If $A_1,\dots,A_r$ are $n\times n$-complex matrices and $X_1,\dots,X_r$ are $d\times d$-complex matrices, then define the $L_2$-spectral radius similarity between $(A_1,\dots,A_r)$ and $(X_1,\dots,X_r)$ by

$$\frac{\rho(A_1\otimes\overline{X_1}+\dots+A_r\otimes\overline{X_r})}{\rho(A_1\otimes\overline{A_1}+\dots+A_r\otimes\overline{A_r})^{1/2}\cdot\rho(X_1\otimes\overline{X_1}+\dots+X_r\otimes\overline{X_r})^{1/2}}$$

where $\rho$ denotes the spectral radius and $\overline{X}$ denotes the entrywise complex conjugate of $X$.
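In numpy, the similarity can be computed directly from this formula (the function names are mine):

```python
import numpy as np

def spectral_radius(m):
    """Largest absolute value of an eigenvalue of m."""
    return np.max(np.abs(np.linalg.eigvals(m)))

def l2_similarity(As, Xs):
    """L2-spectral radius similarity between (A_1,...,A_r) and (X_1,...,X_r)."""
    num = spectral_radius(sum(np.kron(a, np.conj(x)) for a, x in zip(As, Xs)))
    den_a = spectral_radius(sum(np.kron(a, np.conj(a)) for a in As))
    den_x = spectral_radius(sum(np.kron(x, np.conj(x)) for x in Xs))
    return num / np.sqrt(den_a * den_x)
```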
Computational results: The following facts are suggested by computer experiments but have not been rigorously proven. To run the computer experiments, I used gradient ascent to locally maximize the $L_2$-spectral radius similarity. By maximizing the $L_2$-spectral radius similarity, we reduce the dimension of a tuple of matrices, and I call this dimensionality reduction an $L_{2,d}$-spectral radius dimensionality reduction (LSRDR).
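Here is a deliberately crude sketch of the ascent loop; my actual experiments use analytic gradients rather than finite differences, so treat this only as an illustration. For the Okubo experiment, `As` would be the eight matrices $A_1,\dots,A_8$ constructed above and `d` the reduced dimension; it reuses `l2_similarity` from the previous sketch:

```python
def lsrdr(As, d, steps=300, rate=0.5, eps=1e-6, seed=0):
    """Toy LSRDR: finite-difference gradient ascent on the L2-spectral radius
    similarity over real d x d matrices X_1,...,X_r (slow, numerically crude)."""
    rng = np.random.default_rng(seed)
    Xs = rng.normal(size=(len(As), d, d))
    for _ in range(steps):
        base = l2_similarity(As, Xs)
        grad = np.zeros_like(Xs)
        for idx in np.ndindex(*Xs.shape):  # one-sided difference per entry
            bumped = Xs.copy()
            bumped[idx] += eps
            grad[idx] = (l2_similarity(As, bumped) - base) / eps
        Xs = Xs + rate * grad
    return Xs, l2_similarity(As, Xs)
```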
The maximum value of the $L_2$-spectral radius similarity among the real $d\times d$-matrices is . Let be the maximum value of the similarity among the $d\times d$-complex, real symmetric, complex symmetric, complex anti-symmetric, and complex Hermitian matrices respectively. Then
Similar facts seem to hold for the other values of $d$ (but I have not completely performed the calculations due to numerical instabilities that I do not want to fix). For example, and for .
The fitness levels that I obtain are simple, but they are not too simple. This indicates that LSRDRs of Okubo algebras are mathematically interesting.
[1] Alberto Elduque, Okubo algebras: automorphisms, derivations and idempotents, 2013. https://api.semanticscholar.org/CorpusID:119713330