Take {0,1}n this set contains exponentially many points. Is there Any function f:{0,1}n→Rpolyn such that all exponentially many xor combos can be found by a linear probe?
This is a question of pure maths, it involves no neural networks. And I think it would be highly informative.
FWIW, I thought about this for an hour and didn’t really get anywhere. This is totally not my field though, so I might be missing something easy or well-known.
My thoughts on the subject are below in case someone is interested.
Problem statement:
“Is it possible to embed the 2n points {0,1}n to Rpoly(n) (i.e. construct a function f:{0,1}n→Rpoly(n)) such that, for any XOR-subset S of {0,1}n, the set f(S) and its complement f(Sc) are linearly separable (i.e. there is a hyperplane separating them)?”
Here XOR-subset refers to a set of form
S={x∈{0,1}n|xi1⊕xi2⊕…xik=0},
where k is a positive integer and ij∈{1,…,n} are indices.
Observations I made:
One can embed {0,1}2 to R² in the desired way: take a triangle and a point from within it.
One cannot embed {0,1}n to Rn−1.
(Proof: consider the n XOR-subsets where k=1. The corresponding hyperplanes divide Rn−1 to at most 2n−1 regions; hence some two points are embedded in the same region. This is a contradiction.)
If you have m points in Rd, only at most md+1 of the exponentially many subsets of the m points can be linearly separated from their complement.
(Proof: If a subset can be separated, it can also be separated by a hyperplane parallel to a hyperplane formed by some d of our points. This limits our hyperplanes to (md)≈md orientations. Any orientation gives you at most m subsets.)
This doesn’t really help: the number of XOR-subsets is linear in the number of points. In other words, we are looking at only very few specific subsets, and general observations about all subsets are too crude.
XOR-subsets have the property “if A and B are XOR-subsets, then (A∩B)∪(Ac∩Bc) is too” (the “XOR of XOR-subsets is a XOR-subset”)
This has a nice geometric interpretation: if you draw the separating hyperplanes, which divide the space into 4 regions, the points landing in the “diagonally opposite” regions form a XOR-subset.
This feels like, informally, that you need to use a new dimension for the XOR, meaning that you need a lot of dimensions.
I currently lean towards the answer to the problem being “no”, though I’m quite uncertain. (The connection to XORs of features is also a bit unclear to me, as I have the same confusion as DanielVarga in another comment.)
The connection to features is that if the answer is no, there is no possible way the network could have arbitrary X-or combos of features that are linearly represented. It must be only representing some small subset of them. (probably the xor’s of 2 or 3 features, but not 100 features.)
Also, your maths description of the question matches what I was trying to express.
Take {0,1}n this set contains exponentially many points. Is there Any function f:{0,1}n→Rpoly n such that all exponentially many xor combos can be found by a linear probe?
This is a question of pure maths, it involves no neural networks. And I think it would be highly informative.
FWIW, I thought about this for an hour and didn’t really get anywhere. This is totally not my field though, so I might be missing something easy or well-known.
My thoughts on the subject are below in case someone is interested.
Problem statement:
“Is it possible to embed the 2n points {0,1}n to Rpoly(n) (i.e. construct a function f:{0,1}n→Rpoly(n)) such that, for any XOR-subset S of {0,1}n, the set f(S) and its complement f(Sc) are linearly separable (i.e. there is a hyperplane separating them)?”
Here XOR-subset refers to a set of form
S={x∈{0,1}n|xi1⊕xi2⊕…xik=0},
where k is a positive integer and ij∈{1,…,n} are indices.
Observations I made:
One can embed {0,1}2 to R² in the desired way: take a triangle and a point from within it.
One cannot embed {0,1}n to Rn−1.
(Proof: consider the n XOR-subsets where k=1. The corresponding hyperplanes divide Rn−1 to at most 2n−1 regions; hence some two points are embedded in the same region. This is a contradiction.)
If you have m points in Rd, only at most md+1 of the exponentially many subsets of the m points can be linearly separated from their complement.
(Proof: If a subset can be separated, it can also be separated by a hyperplane parallel to a hyperplane formed by some d of our points. This limits our hyperplanes to (md)≈md orientations. Any orientation gives you at most m subsets.)
This doesn’t really help: the number of XOR-subsets is linear in the number of points. In other words, we are looking at only very few specific subsets, and general observations about all subsets are too crude.
XOR-subsets have the property “if A and B are XOR-subsets, then (A∩B)∪(Ac∩Bc) is too” (the “XOR of XOR-subsets is a XOR-subset”)
This has a nice geometric interpretation: if you draw the separating hyperplanes, which divide the space into 4 regions, the points landing in the “diagonally opposite” regions form a XOR-subset.
This feels like, informally, that you need to use a new dimension for the XOR, meaning that you need a lot of dimensions.
I currently lean towards the answer to the problem being “no”, though I’m quite uncertain. (The connection to XORs of features is also a bit unclear to me, as I have the same confusion as DanielVarga in another comment.)
The connection to features is that if the answer is no, there is no possible way the network could have arbitrary X-or combos of features that are linearly represented. It must be only representing some small subset of them. (probably the xor’s of 2 or 3 features, but not 100 features.)
Also, your maths description of the question matches what I was trying to express.