Joseph Van Name comments on Joseph Van Name’s Shortform

Joseph Van Name 24 Apr 2025 10:08 UTC
This post gives an example of some calculations that I did using my own machine learning algorithm. These calculations work out nicely, which indicates that the machine learning algorithm I am using is interpretable (and much more interpretable than any neural network would be). These calculations show that one can begin with old mathematical structures and produce new mathematical structures, and it seems feasible to completely automate this process in order to keep producing more mathematical structures. The machine learning models that I use are linear, but it seems that we can get highly non-trivial results simply by iterating the procedure of obtaining new structures from old ones using machine learning.
I made a similar post to this one about 7 months ago, but I decided to revisit this experiment with more general algorithms and I have obtained experimental results which I think look nice.
To illustrate how this works, we start off with the octonions. The octonions consist of an 8-dimensional inner product space $V$ together with a bilinear operation $*$ and a unit $1 \in V$ where $1 * v = v * 1 = v$ for all $v \in V$ and where $∥ u * v ∥ = ∥ u ∥ \cdot ∥ v ∥$ for all $u, v \in V$. The octonions are uniquely determined up to isomorphism by these properties. The operation $*$ is non-associative, but it is closely related to the quaternions and complex numbers. If we take a single element $v \in V ∖ \operatorname{Span}(1)$, then $\{1, v\}$ generates a subalgebra of $(V, *)$ isomorphic to the field of complex numbers, and if $u, v \in V$ and $\{1, u, v\}$ is linearly independent, then $\{1, u, v, u * v\}$ spans a subalgebra of $V$ isomorphic to the division ring of quaternions. For this reason, one commonly thinks of the octonions as the best way to extend the division ring of quaternions to a larger algebraic structure, in the same way that the quaternions extend the field of complex numbers. But since the octonions are non-associative, they cannot be used to construct matrices, so they are not as well-known as the quaternions (and the construction of the octonions is more complicated too).
Suppose now that $\{e_{0}, e_{1}, \dots, e_{7}\}$ is an orthonormal basis for the division algebra of octonions with $e_{0} = 1$. Then define matrices $A_{0}, \dots, A_{7} : V \to V$ by setting $A_{j} v = e_{j} * v$ for all $j$ and all $v \in V$. Our goal is to transform $(A_{0}, \dots, A_{7})$ into other tuples of matrices that satisfy similar properties.
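The matrices $A_{j}$ can be built explicitly. The following is a minimal sketch, assuming the octonions are realized by the Cayley-Dickson doubling construction with the product rule $(a, b)(c, d) = (ac - \overline{d} b, da + b \overline{c})$ and the standard basis of $\mathbb{R}^{8}$ as $e_{0}, \dots, e_{7}$. The function names `conj`, `mul`, and `left_mult_matrices` are illustrative choices and are not taken from the original calculations.

```python
import numpy as np

def conj(x):
    """Cayley-Dickson conjugate: (a, b)* = (a*, -b)."""
    if len(x) == 1:
        return x.copy()
    n = len(x) // 2
    return np.concatenate([conj(x[:n]), -x[n:]])

def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d* b, da + b c*)."""
    if len(x) == 1:
        return x * y
    n = len(x) // 2
    a, b = x[:n], x[n:]
    c, d = y[:n], y[n:]
    return np.concatenate([mul(a, c) - mul(conj(d), b),
                           mul(d, a) + mul(b, conj(c))])

def left_mult_matrices(dim=8):
    """Return [A_0, ..., A_{dim-1}] where A_j v = e_j * v."""
    E = np.eye(dim)
    # Column k of A_j is the product e_j * e_k.
    return [np.column_stack([mul(E[j], E[k]) for k in range(dim)])
            for j in range(dim)]

A = left_mult_matrices()
rng = np.random.default_rng(0)
u, v = rng.standard_normal(8), rng.standard_normal(8)

# A_0 is the identity because e_0 = 1 is the unit.
unit_ok = np.allclose(A[0], np.eye(8))
# Each A_j is orthogonal because the norm is multiplicative.
orthogonal_ok = all(np.allclose(M.T @ M, np.eye(8)) for M in A)
# ||u * v|| = ||u|| ||v|| for arbitrary u, v.
norm_ok = np.isclose(np.linalg.norm(mul(u, v)),
                     np.linalg.norm(u) * np.linalg.norm(v))
# The product is not associative: (e_1 e_2) e_4 differs from e_1 (e_2 e_4).
E = np.eye(8)
nonassoc = not np.allclose(mul(mul(E[1], E[2]), E[4]),
                           mul(E[1], mul(E[2], E[4])))
print(unit_ok, orthogonal_ok, norm_ok, nonassoc)
```

Any other orthonormal basis with $e_{0} = 1$ would serve equally well; the standard basis is used here only for concreteness.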
If $(A_{1}, \dots, A_{r})$ and $(B_{1}, \dots, B_{r})$ are tuples of matrices, then define the $L_{2}$-spectral radius similarity between $(A_{1}, \dots, A_{r})$ and $(B_{1}, \dots, B_{r})$ as
$∥ (A_{1}, \dots, A_{r}) ≃ (B_{1}, \dots, B_{r}) ∥_{2} = \frac{ρ (A_{1} \otimes \overline{B_{1}} + \dots + A_{r} \otimes \overline{B_{r}})}{ρ (A_{1} \otimes \overline{A_{1}} + \dots + A_{r} \otimes \overline{A_{r}})^{1 / 2} \cdot ρ (B_{1} \otimes \overline{B_{1}} + \dots + B_{r} \otimes \overline{B_{r}})^{1 / 2}}$
where $ρ$ denotes the spectral radius, $\otimes$ is the tensor product, and $\overline{X}$ is the complex conjugate of $X$ applied elementwise.
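The definition translates almost directly into NumPy. The sketch below is one possible implementation, assuming dense matrices small enough that the Kronecker products can be formed explicitly. The names `spectral_radius` and `l2_similarity` are hypothetical helpers, not part of any library. The $A_{j}$ and $B_{j}$ may have different sizes, since only the Kronecker products need to be square.

```python
import numpy as np

def spectral_radius(M):
    """Largest absolute value of an eigenvalue of the square matrix M."""
    return np.max(np.abs(np.linalg.eigvals(M)))

def l2_similarity(As, Bs):
    """L_2-spectral radius similarity between two tuples of equal length."""
    cross = sum(np.kron(A, np.conj(B)) for A, B in zip(As, Bs))
    aa = sum(np.kron(A, np.conj(A)) for A in As)
    bb = sum(np.kron(B, np.conj(B)) for B in Bs)
    return spectral_radius(cross) / np.sqrt(spectral_radius(aa) * spectral_radius(bb))

rng = np.random.default_rng(0)

def random_complex(n):
    return rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

As = [random_complex(3) for _ in range(4)]
Bs = [random_complex(2) for _ in range(4)]

# A tuple compared with itself has similarity 1.
self_sim = l2_similarity(As, As)

# Rescaling and simultaneous conjugation leave the similarity at 1,
# because they change the Kronecker sums only by a scalar and a similarity transform.
U = random_complex(3)
U_inv = np.linalg.inv(U)
conj_sim = l2_similarity(As, [3.0 * U @ A @ U_inv for A in As])

# Tuples of different matrix sizes can also be compared.
mixed_sim = l2_similarity(As, Bs)
print(self_sim, conj_sim, mixed_sim)
```

Forming the Kronecker products costs memory quadratic in the product of the two matrix sizes, which is fine at the sizes used in this post but would need a matrix-free approach for larger matrices.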
Let $d \in \{1, \dots, 8\}$, and let $F_{d}, G_{d}, H_{d}$ denote the maximum value of the fitness level $8 \cdot ∥ (A_{0}, \dots, A_{7}) ≃ (X_{0}, \dots, X_{7}) ∥^{2}$ such that each $X_{j}$ is a complex $d \times d$ anti-symmetric matrix ($X = - X^{T}$), a complex $d \times d$ symmetric matrix ($X = X^{T}$), and a complex $d \times d$ Hermitian matrix ($X = X^{*}$), respectively.
The following calculations were obtained through gradient ascent, so I have no mathematical proof that the values obtained are actually correct.
$G_{1} = 2$ , $H_{1} = 1$
$G_{2} = 3$ , $H_{2} = 3$
$F_{3} = 1 + \sqrt{3}$ , $G_{3} = 3.5$ , $H_{3} = 1 + 2 \sqrt{2}$
$F_{4} = 4$ , $G_{4} = 4$ , $H_{4} = 1 + 3 \sqrt{2}$
$F_{5} = (5 + \sqrt{13}) / 2$ , $G_{5} = 4.5$ , $H_{5} \approx 5.27155841$
$F_{6} = 5$ , $G_{6} = 5$ , $H_{6} = 3 + 2 \sqrt{2}$
$F_{7} = 6$ , $G_{7} = 2 + 2 \sqrt{3} \approx 5.4641$ , $H_{7} = 1 + 2 \sqrt{7}$
$F_{8} = 7$ , $G_{8} = 6$ , $H_{8} = 7$
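A few of the values above can be reproduced without any optimization by evaluating the fitness at hand-picked candidates. The sketch below does this for $H_{1}$, $G_{1}$, $F_{8}$, and $H_{8}$. The candidates are my own illustrative choices, not output from the gradient ascent, and the Cayley-Dickson realization of the octonions is an assumption. Attaining a value only shows that the maximum is at least that large; it does not show that the value is the maximum.

```python
import numpy as np

def conj(x):
    """Cayley-Dickson conjugate: (a, b)* = (a*, -b)."""
    if len(x) == 1:
        return x.copy()
    n = len(x) // 2
    return np.concatenate([conj(x[:n]), -x[n:]])

def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d* b, da + b c*)."""
    if len(x) == 1:
        return x * y
    n = len(x) // 2
    a, b, c, d = x[:n], x[n:], y[:n], y[n:]
    return np.concatenate([mul(a, c) - mul(conj(d), b),
                           mul(d, a) + mul(b, conj(c))])

def left_mult_matrices(dim=8):
    """Return [A_0, ..., A_7] where A_j v = e_j * v."""
    E = np.eye(dim)
    return [np.column_stack([mul(E[j], E[k]) for k in range(dim)])
            for j in range(dim)]

def rho(M):
    """Spectral radius of a square matrix."""
    return np.max(np.abs(np.linalg.eigvals(M)))

def fitness(As, Xs):
    """8 times the squared L_2-spectral radius similarity."""
    cross = sum(np.kron(A, np.conj(X)) for A, X in zip(As, Xs))
    aa = sum(np.kron(A, np.conj(A)) for A in As)
    xx = sum(np.kron(X, np.conj(X)) for X in Xs)
    return 8 * rho(cross) ** 2 / (rho(aa) * rho(xx))

A = left_mult_matrices()

# d = 1, Hermitian: real scalars x_j. Any nonzero real vector gives 1.
x = np.random.default_rng(0).standard_normal(8)
h1 = fitness(A, [np.array([[t]]) for t in x])

# d = 1, symmetric: complex scalars. The choice (1, i, 0, ..., 0) gives 2.
z = np.zeros(8, dtype=complex)
z[0], z[1] = 1, 1j
g1 = fitness(A, [np.array([[t]]) for t in z])

# d = 8, anti-symmetric: X_0 = 0 and X_j = A_j for j >= 1 gives 7.
zero = np.zeros((8, 8))
f8 = fitness(A, [zero] + A[1:])

# d = 8, Hermitian: X_0 = 0 and X_j = i A_j for j >= 1 gives 7.
h8 = fitness(A, [zero] + [1j * M for M in A[1:]])

print(h1, g1, f8, h8)
```

The $d = 8$ candidates rely on $A_{1}, \dots, A_{7}$ being real anti-symmetric, so that $i A_{j}$ is Hermitian.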
Observe that with at most one exception (namely $H_{5}$, for which I only have a numerical value), all of these values $F_{d}, G_{d}, H_{d}$ are algebraic half integers. This indicates that the fitness function that we maximize to produce $F_{d}, G_{d}, H_{d}$ behaves mathematically and can be used to produce new tuples $(X_{1}, \dots, X_{r})$ from old ones $(A_{1}, \dots, A_{r})$. Furthermore, an AI can determine whether something notable is going on with the new tuple $(X_{1}, \dots, X_{r})$ in several ways. For example, if $∥ (A_{1}, \dots, A_{r}) ≃ (X_{1}, \dots, X_{r}) ∥^{2}$ has low algebraic degree at the local maximum, then $(X_{1}, \dots, X_{r})$ is likely notable and likely behaves mathematically (and is probably quite interpretable too).
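The half-integer observation can be checked mechanically for the closed-form values. The sketch below assumes that "algebraic half integer" means a number $x$ such that $2x$ is an algebraic integer, which is my reading of the term. Each value is written as $a + b \sqrt{n}$ with rational $a, b$, so $2x$ is a root of the monic quadratic $t^{2} - 4a t + (4a^{2} - 4b^{2} n)$, and the check is whether its coefficients are integers. $H_{5}$ is left out because only a numerical approximation is given. The encoding and function names are illustrative.

```python
from fractions import Fraction as Fr

# Each closed-form value a + b*sqrt(n) is stored as (a, b, n); b = 0 means rational.
values = {
    "G1": (Fr(2), Fr(0), 1), "H1": (Fr(1), Fr(0), 1),
    "G2": (Fr(3), Fr(0), 1), "H2": (Fr(3), Fr(0), 1),
    "F3": (Fr(1), Fr(1), 3), "G3": (Fr(7, 2), Fr(0), 1), "H3": (Fr(1), Fr(2), 2),
    "F4": (Fr(4), Fr(0), 1), "G4": (Fr(4), Fr(0), 1), "H4": (Fr(1), Fr(3), 2),
    "F5": (Fr(5, 2), Fr(1, 2), 13), "G5": (Fr(9, 2), Fr(0), 1),
    "F6": (Fr(5), Fr(0), 1), "G6": (Fr(5), Fr(0), 1), "H6": (Fr(3), Fr(2), 2),
    "F7": (Fr(6), Fr(0), 1), "G7": (Fr(2), Fr(2), 3), "H7": (Fr(1), Fr(2), 7),
    "F8": (Fr(7), Fr(0), 1), "G8": (Fr(6), Fr(0), 1), "H8": (Fr(7), Fr(0), 1),
}

def doubled_quadratic(a, b, n):
    """Coefficients (p, q) of t^2 + p t + q, which has 2(a + b sqrt n) as a root."""
    return -4 * a, 4 * a * a - 4 * b * b * n

def is_half_integer(a, b, n):
    """True if 2(a + b sqrt n) is an algebraic integer (n assumed not a perfect square when b != 0)."""
    if b == 0:
        return (2 * a).denominator == 1
    p, q = doubled_quadratic(a, b, n)
    return p.denominator == 1 and q.denominator == 1

results = {name: is_half_integer(*v) for name, v in values.items()}
for name, ok in results.items():
    print(name, ok)
```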
The good behavior of $F_{d}, G_{d}, H_{d}$ demonstrates that the octonions are compatible with the $L_{2}$-spectral radius similarity. The operators $(A_{0}, \dots, A_{7})$ are all orthogonal, and one can regard the tuple $(A_{0}, \dots, A_{7})$ as a mixed unitary quantum channel that is very similar to the completely depolarizing channel. The completely depolarizing channel completely mixes every quantum state, while the mixture of orthogonal mappings $(A_{0}, \dots, A_{7})$ completely mixes every real state. The $L_{2}$-spectral radius similarity works very well with the completely depolarizing channel, so one should expect the $L_{2}$-spectral radius similarity to also behave well with the octonions.