In my research into machine learning algorithms that I can use to evaluate small block ciphers for cryptocurrency technologies, I have stumbled upon a dimensionality reduction for tensors in tensor products of inner product spaces which, according to my computer experiments, exists, is unique, and reduces a real tensor to another real tensor even when the underlying field is the field of complex numbers. I would not be too surprised if someone else has come up with this tensor dimensionality reduction before, since it has a rather simple description and it is, in a sense, a canonical dimensionality reduction when we consider tensors as homogeneous non-commutative polynomials. But even if this tensor dimensionality reduction is not new, it belongs to a broader class of new algorithms that I have been studying recently, such as LSRDRs.
Suppose that $K$ is either the field of real numbers or the field of complex numbers. Let $V_1,\dots,V_n$ be finite dimensional inner product spaces over $K$ with dimensions $d_1,\dots,d_n$ respectively, and suppose that $V_i$ has orthonormal basis $e_{i,1},\dots,e_{i,d_i}$. Given $v\in V_1\otimes\cdots\otimes V_n$, we sometimes want to approximate the tensor $v$ with a tensor that has fewer parameters. Suppose that $(m_0,\dots,m_n)$ is a sequence of natural numbers with $m_0=m_n=1$, and that $X_{i,j}$ is an $m_{i-1}\times m_i$ matrix over the field $K$ for $1\le i\le n$ and $1\le j\le d_i$. From the system of matrices $(X_{i,j})_{i,j}$, we obtain a tensor $T((X_{i,j})_{i,j})=\sum_{i_1,\dots,i_n} e_{1,i_1}\otimes\cdots\otimes e_{n,i_n}\cdot X_{1,i_1}\cdots X_{n,i_n}$ (the product $X_{1,i_1}\cdots X_{n,i_n}$ is a $1\times 1$ matrix, which we identify with a scalar). If the system of matrices $(X_{i,j})_{i,j}$ locally minimizes the distance $\|v-T((X_{i,j})_{i,j})\|$, then the tensor $T((X_{i,j})_{i,j})$ is a dimensionality reduction of $v$, which we shall denote by $u$.
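To make the definition concrete, here is a minimal NumPy sketch (mine, not from the post; the dimensions $d_i$, ranks $m_i$, and random data are arbitrary demo choices) that builds $T((X_{i,j})_{i,j})$ by chained contractions and checks each entry against a brute-force product of the matrices $X_{1,i_1}\cdots X_{n,i_n}$:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Demo dimensions d_1,...,d_n and ranks m_0,...,m_n with m_0 = m_n = 1.
d = [2, 3, 2]
m = [1, 2, 2, 1]

# Xs[i] stacks X_{i+1,1},...,X_{i+1,d_{i+1}}; each matrix is m_i x m_{i+1}.
Xs = [rng.standard_normal((d[i], m[i], m[i + 1])) for i in range(len(d))]

def build_T(Xs):
    """Entry (i_1,...,i_n) is the 1x1 matrix product X_{1,i_1} ... X_{n,i_n}."""
    T = Xs[0]                                  # shape (d_1, m_0, m_1)
    for X in Xs[1:]:
        # contract the running m_k axis against the next stack of matrices
        T = np.einsum('...ab,jbc->...jac', T, X)
    return T[..., 0, 0]                        # drop the trivial 1x1 matrix axes

T = build_T(Xs)

# Brute-force check: multiply the matrices out for every index tuple.
for idx in itertools.product(*(range(di) for di in d)):
    prod = np.eye(1)
    for i, j in enumerate(idx):
        prod = prod @ Xs[i][j]
    assert np.isclose(T[idx], prod[0, 0])
```

A locally minimizing system $(X_{i,j})_{i,j}$ for $\|v-T((X_{i,j})_{i,j})\|$ could then be found by running any gradient-based optimizer over the entries of the `Xs` arrays.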
Intuition: One can identify the tensor product $V_1\otimes\cdots\otimes V_n$ with the set of all degree $n$ homogeneous non-commutative polynomials that are linear combinations of the monomials of the form $x_{1,i_1}\cdots x_{n,i_n}$. Given our matrices $X_{i,j}$, we can define a linear functional $\phi:V_1\otimes\cdots\otimes V_n\to K$ by setting $\phi(p)=p((X_{i,j})_{i,j})$. By the Riesz representation theorem, the linear functional $\phi$ is dual to some tensor in $V_1\otimes\cdots\otimes V_n$; more specifically, $\phi$ is dual to $T((X_{i,j})_{i,j})$. The tensors of the form $T((X_{i,j})_{i,j})$ are therefore precisely the tensors dual to the functionals that evaluate a polynomial at a system of matrices.
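The duality is easy to verify numerically over the reals (over $\mathbb{C}$ the inner product would involve a conjugate). The sketch below (again with arbitrary demo dimensions) substitutes the matrices $X_{i,j}$ into a polynomial with coefficient tensor $v$ and checks that $\phi(p)=\langle v, T((X_{i,j})_{i,j})\rangle$:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

d = [2, 2, 3]           # demo dimensions d_1, d_2, d_3
m = [1, 2, 2, 1]        # m_0 = m_3 = 1
Xs = [rng.standard_normal((d[i], m[i], m[i + 1])) for i in range(len(d))]
# coefficients of p in the monomial basis x_{1,i_1} x_{2,i_2} x_{3,i_3}
v = rng.standard_normal(d)

# phi(p) = p((X_{i,j})_{i,j}): substitute the matrices for the variables.
phi = 0.0
T = np.zeros(d)
for idx in itertools.product(*(range(di) for di in d)):
    prod = np.eye(1)
    for i, j in enumerate(idx):
        prod = prod @ Xs[i][j]
    T[idx] = prod[0, 0]
    phi += v[idx] * prod[0, 0]

# Over the reals, the Riesz representer of phi is exactly T((X_{i,j})_{i,j}).
assert np.isclose(phi, np.sum(v * T))
```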
Advantages:
In my computer experiments, the reduced dimension tensor $u$ is often (but not always) unique, in the sense that if we compute the tensor $u$ twice, we get the same tensor; at the very least, the distribution of reduced dimension tensors $u$ has low Rényi entropy. I personally consider this partial uniqueness to be advantageous over total uniqueness, since it signals whether one should use this tensor dimensionality reduction in the first place: if the reduced tensor is far from unique, then one should not use this dimensionality reduction algorithm, while if it is unique, or at least has low Rényi entropy, then the dimensionality reduction works well for the tensor $v$.
This dimensionality reduction does not depend on the choice of orthonormal basis $e_{i,1},\dots,e_{i,d_i}$. If we choose a different basis for each $V_i$, then the resulting tensor $u$ of reduced dimensionality will remain the same (a proof is given below).
If $K$ is the field of complex numbers, but all the entries in the tensor $v$ happen to be real numbers, then all the entries in the tensor $u$ will also be real numbers.
This dimensionality reduction algorithm is intuitive when tensors are considered as homogeneous non-commutative polynomials.
Disadvantages:
This dimensionality reduction depends on a canonical cyclic ordering of the inner product spaces $V_1,\dots,V_n$.
Other notions of tensor dimensionality reduction, such as the CP decomposition and the Tucker decomposition, are more well-established, and they are natural attempts to generalize the singular value decomposition to higher dimensions, so they may be more intuitive to some.
The tensors of reduced dimensionality $T((X_{i,j})_{i,j})$ have a more complicated description than the tensors in the CP decomposition.
Proposition: The set of tensors of the form $\sum_{i_1,\dots,i_n} e_{1,i_1}\otimes\cdots\otimes e_{n,i_n}\, X_{1,i_1}\cdots X_{n,i_n}$ does not depend on the choice of bases $(e_{i,1},\dots,e_{i,d_i})_i$.
Proof: For each $i$, let $f_{i,1},\dots,f_{i,d_i}$ be an alternative basis for $V_i$, and write $e_{i,j}=\sum_k u_{i,j,k} f_{i,k}$ for each $i,j$. Then

$$\begin{aligned}
\sum_{i_1,\dots,i_n} e_{1,i_1}\otimes\cdots\otimes e_{n,i_n}\, X_{1,i_1}\cdots X_{n,i_n}
&=\sum_{i_1,\dots,i_n}\Bigl(\sum_{k_1} u_{1,i_1,k_1} f_{1,k_1}\Bigr)\otimes\cdots\otimes\Bigl(\sum_{k_n} u_{n,i_n,k_n} f_{n,k_n}\Bigr)\, X_{1,i_1}\cdots X_{n,i_n}\\
&=\sum_{k_1,\dots,k_n} f_{1,k_1}\otimes\cdots\otimes f_{n,k_n}\sum_{i_1,\dots,i_n} u_{1,i_1,k_1}\cdots u_{n,i_n,k_n}\, X_{1,i_1}\cdots X_{n,i_n}\\
&=\sum_{k_1,\dots,k_n} f_{1,k_1}\otimes\cdots\otimes f_{n,k_n}\Bigl(\sum_{i_1} u_{1,i_1,k_1} X_{1,i_1}\Bigr)\cdots\Bigl(\sum_{i_n} u_{n,i_n,k_n} X_{n,i_n}\Bigr),
\end{aligned}$$

which is a tensor of the same form with respect to the bases $(f_{i,1},\dots,f_{i,d_i})_i$ and the system of matrices $Y_{i,k}=\sum_j u_{i,j,k} X_{i,j}$. Q.E.D.
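The proof's substitution $Y_{i,k}=\sum_j u_{i,j,k} X_{i,j}$ can also be checked numerically. The sketch below (my own demo, with arbitrary dimensions and a random, almost surely invertible, change of basis) verifies that transforming the coordinates of $T((X_{i,j})_{i,j})$ by the $u_{i,j,k}$ gives the same array as building the tensor directly from the new system $(Y_{i,k})_{i,k}$:

```python
import numpy as np

rng = np.random.default_rng(2)

d = [2, 3, 2]
m = [1, 2, 2, 1]
Xs = [rng.standard_normal((d[i], m[i], m[i + 1])) for i in range(len(d))]

def build_T(Xs):
    """Entry (i_1,...,i_n) is the 1x1 matrix product X_{1,i_1} ... X_{n,i_n}."""
    T = Xs[0]
    for X in Xs[1:]:
        T = np.einsum('...ab,jbc->...jac', T, X)
    return T[..., 0, 0]

# Random change-of-basis coefficients u_{i,j,k} (almost surely invertible),
# i.e. e_{i,j} = sum_k u_{i,j,k} f_{i,k}.
Us = [rng.standard_normal((di, di)) for di in d]

# Coefficients of the same tensor in the f-bases: contract each axis with u_{i,j,k}.
# Each tensordot moves the contracted axis to the end, so after n contractions
# the axis order is restored.
T_f = build_T(Xs)
for U in Us:
    T_f = np.tensordot(T_f, U, axes=([0], [0]))

# The proof's new system of matrices: Y_{i,k} = sum_j u_{i,j,k} X_{i,j}.
Ys = [np.einsum('jk,jab->kab', U, X) for U, X in zip(Us, Xs)]
assert np.allclose(T_f, build_T(Ys))
```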
A failed generalization: An astute reader may have observed that if we replace the requirement $m_0=m_n=1$ with the weaker requirement $m_0=m_n$, then we still get a linear functional, defined by letting $\phi(p)=\operatorname{Tr}(p((X_{i,j})_{i,j}))$. This is indeed a linear functional, and we can try to approximate $v$ using the dual to $\phi$, but this approach does not work as well.
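For completeness, here is a sketch of the dual tensor in the trace variant (my own demo with arbitrary dimensions): with $m_0=m_n=m>1$, the representer of $\phi(p)=\operatorname{Tr}(p((X_{i,j})_{i,j}))$ over the reals has entries $\operatorname{Tr}(X_{1,i_1}\cdots X_{n,i_n})$.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)

d = [2, 2, 2]
m = [2, 3, 3, 2]        # m_0 = m_3 = 2 > 1 this time
Xs = [rng.standard_normal((d[i], m[i], m[i + 1])) for i in range(len(d))]

# Dual tensor of phi(p) = Tr(p((X_{i,j})_{i,j})):
# entry (i_1,...,i_n) is Tr(X_{1,i_1} ... X_{n,i_n}), an m_0 x m_n = 2 x 2 product.
T = np.zeros(d)
for idx in itertools.product(*(range(di) for di in d)):
    prod = np.eye(m[0])
    for i, j in enumerate(idx):
        prod = prod @ Xs[i][j]
    T[idx] = np.trace(prod)
```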