Suppose that $q, r, d$ are natural numbers. Let $1 < p < \infty$. Let $z_{i,j}$ be a complex number whenever $1 \le i \le q$, $1 \le j \le r$. Let $L : M_d(\mathbb{C})^r \setminus \{0\} \to [-\infty, \infty)$ be the fitness function defined by
$$L(X_1, \dots, X_r) = \frac{1}{q} \sum_{i=1}^{q} \log\Bigl(\rho\Bigl(\sum_{j=1}^{r} z_{i,j} X_j\Bigr)\Bigr) - \frac{1}{2} \log\Bigl(\Bigl\|\sum_{j=1}^{r} X_j X_j^*\Bigr\|_p\Bigr).$$
Here, $\rho(X)$ denotes the spectral radius of a matrix $X$ while $\|X\|_p$ denotes the Schatten $p$-norm of $X$.
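To make the definition concrete, here is a minimal NumPy sketch of $L$ (the function names `schatten_norm`, `spectral_radius`, and `fitness_L` are my own, and no optimizer is included; this only evaluates the objective):

```python
import numpy as np

def schatten_norm(X, p):
    """Schatten p-norm of X: the l^p norm of the vector of singular values of X."""
    return np.linalg.norm(np.linalg.svd(X, compute_uv=False), ord=p)

def spectral_radius(X):
    """Spectral radius of X: the largest modulus of an eigenvalue of X."""
    return np.max(np.abs(np.linalg.eigvals(X)))

def fitness_L(Xs, z, p):
    """Evaluate L at the tuple Xs (array of shape (r, d, d)) on data z (array of shape (q, r))."""
    # Average of the log spectral radii of the data-weighted combinations sum_j z[i, j] X_j.
    mean_log_rho = np.mean([np.log(spectral_radius(np.tensordot(z[i], Xs, axes=1)))
                            for i in range(z.shape[0])])
    # Normalizing term: half the log of the Schatten p-norm of sum_j X_j X_j^*.
    gram = sum(X @ X.conj().T for X in Xs)
    return mean_log_rho - 0.5 * np.log(schatten_norm(gram, p))
```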
Now suppose that $(A_1, \dots, A_r) \in M_d(\mathbb{C})^r \setminus \{0\}$ is a tuple that maximizes $L$. Let $M : \mathbb{C}^r \setminus \{0\} \to [-\infty, \infty)$ be the fitness function defined by
$$M(w_1, \dots, w_r) = \log(\rho(w_1 A_1 + \dots + w_r A_r)) - \log(\|(w_1, \dots, w_r)\|_2).$$
Now suppose that $(v_1, \dots, v_r) \in \mathbb{C}^r \setminus \{0\}$ is a tuple that maximizes $M$. Then we will likely be able to find an $\ell \in \{1, \dots, q\}$ and a non-zero complex number $\alpha$ where $(v_1, \dots, v_r) = \alpha \cdot (z_{\ell,1}, \dots, z_{\ell,r})$.
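Continuing the sketch, $M$ and the proportionality check can be evaluated directly (the helper `matching_row` and its tolerance are hypothetical choices of mine; actually maximizing $L$ and $M$ still requires an optimizer, which I omit):

```python
def fitness_M(w, As):
    """Evaluate M at the weight vector w (shape (r,)) given the learned tuple As (shape (r, d, d))."""
    combo = np.tensordot(w, As, axes=1)  # w_1 A_1 + ... + w_r A_r
    return np.log(spectral_radius(combo)) - np.log(np.linalg.norm(w))

def matching_row(v, z, tol=1e-8):
    """Return an index l such that v is a scalar multiple of the data row z[l], or None."""
    for ell in range(z.shape[0]):
        # v = alpha * z[ell] for some nonzero alpha iff the stacked 2 x r matrix has rank 1.
        if np.linalg.matrix_rank(np.vstack([v, z[ell]]), tol=tol) == 1:
            return ell
    return None
```

Note that $M(\alpha w_1, \dots, \alpha w_r) = M(w_1, \dots, w_r)$ for every non-zero scalar $\alpha$, so a maximizer of $M$ is only determined up to a scalar multiple; this is why the conclusion above is stated up to the factor $\alpha$.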
Here, $(z_{i,j})_{i,j}$ represents the training data while the matrices $A_1, \dots, A_r$ are our learned machine learning model. In other words, we are able to recover some of the original data values from the learned model $A_1, \dots, A_r$ without any distortion to those values.
I have just made this observation, so I am still exploring its implications. But this is an example of how mathematical spectral machine learning algorithms can behave: more mathematical machine learning models are more likely to be interpretable and more likely to have a robust mathematical and empirical theory behind them.
I suspect that all that happened here is that the matrices $A_1, \dots, A_r$ simply ended up being diagonal. If so, this is probably an uninteresting observation in this particular case, but I need to run more tests before commenting any further.
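One quick test of this hypothesis is to measure how much of each learned matrix lies off the diagonal; values near zero would support it (again a minimal sketch with my own function name):

```python
def off_diagonal_fraction(As):
    """Fraction of the Frobenius norm of each A_j carried by its off-diagonal entries."""
    return [np.linalg.norm(A - np.diag(np.diag(A))) / np.linalg.norm(A) for A in As]
```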