I’ll use the term “rainbow edge”/“rainbow triangle” instead of bi/tri-chromatic.
Some details are glossed over, but only when I am confident I could fill them in. Sorry if you are reading this and want more detail.
Also, thanks again for these posts. They are really quite helpful. I mean, I’m kinda skeptical about usefulness towards alignment, but regardless I’m very glad you wrote them. I had heard legends of proofs of the Brouwer fixed point theorem from Sperner’s lemma, but didn’t realize how nice it would be—I prefer this to the usual topological proof.
1: I proved this theorem in like 3 related ways until one of the approaches managed to give traction on the generalizations.
Suppose we change a node ‘in the bulk’, that is, a node surrounded by two nodes. Then if we change the color of this node, we’ll either change the number of bichromatic edges by 0 (in the case where the nodes on either side have different colors) or 2 (where they have the same color). Thus we won’t affect the parity, so we can willy nilly change the bulk.
Let’s change the bulk to start with a blue on the leftmost, and green on the rightmost. Then we can by induction apply the lemma to this smaller segment. There are no extra bichromatic edges, so we are done.
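As a quick sanity check of the parity argument, here's a brute-force sketch (the encoding 0 = blue, 1 = green and all names are mine): fix the endpoints to differ, fill the bulk arbitrarily, and the bichromatic-edge count is always odd.

```python
import random

def bichromatic_edges(colors):
    # Count edges whose two endpoints differ in color.
    return sum(1 for a, b in zip(colors, colors[1:]) if a != b)

# Endpoints fixed to differing colors (0 = blue, 1 = green);
# the bulk in between is arbitrary.
for seed in range(200):
    rng = random.Random(seed)
    colors = [0] + [rng.choice([0, 1]) for _ in range(10)] + [1]
    assert bichromatic_edges(colors) % 2 == 1  # always odd
```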
2:
Imagine subdividing the interval into 1/N size bits, with the endpoints colored green if the function is positive there, and blue if it’s negative. If any are 0 we are already done, so let’s suppose none are zero.
For each N, pick a rainbow edge, then take the sequence of green endpoints from picked edges. This must have a convergent subsequence by Bolzano–Weierstrass. Now take the corresponding blue endpoints from this sequence (connected by the chosen edges). We then also have a sequence of blue endpoints, which must also converge (to the same value) because the edge lengths go to 0. The value of the function there is equal to the limit of function values, because it’s continuous—so it must be both at least 0 and at most 0, and thus is equal to 0.
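Numerically, picking a rainbow edge at a given subdivision level is just a scan for a sign change, and as N grows the chosen edge pins down a root. A sketch (the example function and tolerance are my choices):

```python
def rainbow_edge(f, N):
    # Scan the 1/N subdivision for an edge whose endpoints have
    # opposite signs (green = positive, blue = negative).
    for k in range(N):
        a, b = k / N, (k + 1) / N
        if (f(a) > 0) != (f(b) > 0):
            return a, b
    return None

f = lambda x: x * x - 0.5        # f(0) < 0 < f(1), root at sqrt(1/2)
a, b = rainbow_edge(f, 10**5)
assert abs(a - 0.5**0.5) < 1e-4  # the chosen edge converges on the root
```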
3: Rewrite that’s more in the ‘spirit’ of problem 5 and 6, but not the first I came up with.
Color each point by whether the function sends it somewhere more “right” or “left” (that is, greater or lesser) than it. First suppose that there’s at least one point of each color. Then we can use the argument in the previous to get two converging sequences, one of “right” elements and one of “left” elements, that converge to the same value. When fed to the function, continuity means that we must be no bigger and no smaller than that value, and so must be equal.
If every point has the same color, then we send either 1 or 0 to something above/below itself, which is impossible because the function maps everything to [0,1].
It fails for (0,1) because you never need to plug in 0 or 1. For example, f(x) = x + 1⁄2 (1 − x) is a counterexample—the fixed point is 1, which is outside (0,1). The function takes a point and then moves it halfway to 1. So everything is colored “right”, which is why we fail.
You could more directly use the previous problem:
Apply IVT to e(x) = f(x) - x
but I prefer the other way.
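For what it's worth, the e(x) = f(x) − x route is easy to check numerically; here's a sketch with f = cos (my choice of example), scanning for a sign-change edge as in problem 2:

```python
import math

f = math.cos            # maps [0, 1] into [cos 1, 1], a subset of [0, 1]
e = lambda x: f(x) - x  # e(0) > 0 > e(1), so IVT gives a root of e,
                        # i.e. a fixed point of f

N = 10**5
a = next(k / N for k in range(N) if (e(k / N) > 0) != (e((k + 1) / N) > 0))
assert abs(f(a) - a) < 1e-4  # a is (approximately) a fixed point
```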
4: Huh, this got progressively less messy as I tweaked my proof, and in the process bugs were fixed. And then made even simpler for problem 9. Simplicity is king, remember that.
What group should we use for the 2D case?
Each color will be a group element.
Color an edge by the ‘missing’ color. This is the same as comparing our edge to the sum of all three colors. That is, (R + G + B) − (sum of endpoints). This also tells us what an edge between 2 nodes of the same color should be—it should be R + G + B. Having the same color should clearly give us 0, so we want R + G + B = 0. Then, the color of an edge is just the sum of endpoint colors (at least, we’ll see that in a second when we show that −X = X).
Now, if we add the edges of a path, we should get the thing corresponding to the start and end endpoints. So we want e.g. G-B-G to be the same as G-G. That is, R + R = 0. Likewise, G + G = 0, B + B = 0, and G B R gives us that R + G = B. This is what we in the biz call the Klein four group, but you don’t have to worry about that if you don’t want to.
Any loop sums to 0. You can also think of this as counting each node twice, thus cancelling them.
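A minimal sketch of this edge arithmetic, with R, G, B as Klein-four elements written as bit-pairs (the encoding is my choice): edge labels are sums of endpoints, and summing around any loop counts each node twice, giving 0.

```python
import random

def add(u, v):
    # Klein four group: componentwise addition mod 2.
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

R, G, B, ZERO = (1, 0), (0, 1), (1, 1), (0, 0)
assert add(R, R) == ZERO and add(R, G) == B  # X + X = 0, R + G = B

# Edge color = sum of its endpoints; summing the edges around any loop
# counts each node twice, so the total is 0.
for seed in range(50):
    rng = random.Random(seed)
    loop = [rng.choice([R, G, B]) for _ in range(7)]
    total = ZERO
    for i in range(len(loop)):
        total = add(total, add(loop[i], loop[(i + 1) % len(loop)]))
    assert total == ZERO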
Suppose we have a node in the “bulk” of the triangle, that is, one that’s fully surrounded (like a hexagon). We’ll show that we can change the color of the node without affecting the parity of the number of rainbow triangles. I’ll call this the “bulk lemma”.
WLOG imagine the center starts red and becomes blue, so our hexagon looks like (the Xs are arbitrary color values, not necessarily equal to other Xs)

X X
X R X
X X

and becomes

X X
X B X
X X
Now, how many rainbow triangles are at the beginning? Well, it’s actually just the number of B-G aka red edges in the loop around the center—that is, the number of bordering edges that are missing a red. Likewise, at the end it’s the number of R-B aka blue edges in the loop. But neither of these numbers depends on the center! So as long as the numbers of red and blue edges in the loop have the same parity, we can freely change the center node.
Now, to prove the bulk lemma, we’ll show that parity differences must be 0 for any loop. This is kinda like saying that not only do we know that the sum/center of mass of the vectors are 0, but we also know something about the angles (don’t take it too literally though).
Red is like the vector (1,0), Green is like (0,1), and Blue is like (1,1) (addition is mod 2). When we add in a loop, the total sum is (0,0). Note that Blues don’t screw up parity differences between Red and Green, cuz they add 1 to both.
So the difference/sum of the two coordinates is equal to the parity difference of reds with greens, because the blues don’t affect it. Since the vector is (0,0), the sum of its coordinates is 0. So we’re good.
We could’ve just as easily swapped our basis to use a different color as the all-ones vector, and so we get that all parity differences are 0.
Since for hexagons the parity of the number of rainbow triangles changes according to the parity difference of a loop, and since those differences are always 0, we can freely change the center node.
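The hexagon case of the bulk lemma is small enough to verify exhaustively; a sketch (colors encoded as 0, 1, 2, names mine):

```python
from itertools import product

def rainbow_count(center, ring):
    # The six triangles around the center are (center, ring[i], ring[i+1]).
    return sum(
        1 for i in range(6)
        if {center, ring[i], ring[(i + 1) % 6]} == {0, 1, 2}
    )

# Over all 3^6 ring colorings, the parity of the rainbow-triangle count
# does not depend on the center's color.
for ring in product(range(3), repeat=6):
    parities = {rainbow_count(c, ring) % 2 for c in range(3)}
    assert len(parities) == 1
```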
Note that it’s quite important that we have a loop that doesn’t go through the center—as you can see by looking at the following half-hexagon pattern from about the middle of the left side of the image for the problem:
R
B B
B G
Changing the center B to a G increases the number of rainbow triangles by 1.
Alright, now for the theorem. We can change the “bulk” nodes in such a way that we can do an induction argument. That is, we can change them so that the left side of the inner triangle has no green, the right side of the inner triangle has no blue, and the bottom side of the inner triangle has no red. Now there are an odd number of rainbow triangles inside the inner triangle. Because of our coloring of the sides, the only possible way to have extra rainbow triangles are the “outermost” ones near an inner corner, e.g.
R
X X
O R O
Y B G Z
B Y O Z G
Where the X, Y, Z’s mark the two extra nodes of each of the 6 triangles being referenced here, while the O’s mark extraneous nodes that don’t matter.
Now, due to our coloring choice, we see that to get extra rainbow triangles, two of the X, Y, Z nodes with the same letter should have different colors. But this gives us another rainbow triangle on the other side of that shared edge. This is because we picked the corners of our inner triangle to match those of the overall big one.
Therefore, we only have an even number of extra rainbow triangles. So we have an odd number total.
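Here's a brute-force check of the 2D statement (my own encoding: labels 0, 1, 2 on nodes (i, j) with i + j ≤ N): color the boundary legally and the interior at random, and the rainbow-triangle count comes out odd every time.

```python
import random

def sperner_parity_check(N, seed):
    rng = random.Random(seed)
    col = {}
    for i in range(N + 1):
        for j in range(N + 1 - i):
            if (i, j) == (0, 0):   col[i, j] = 0  # corner colors are fixed
            elif (i, j) == (N, 0): col[i, j] = 1
            elif (i, j) == (0, N): col[i, j] = 2
            elif j == 0:           col[i, j] = rng.choice([0, 1])  # bottom side
            elif i == 0:           col[i, j] = rng.choice([0, 2])  # left side
            elif i + j == N:       col[i, j] = rng.choice([1, 2])  # hypotenuse
            else:                  col[i, j] = rng.choice([0, 1, 2])  # interior
    count = 0
    for i in range(N):
        for j in range(N - i):
            # upward-pointing small triangle
            if {col[i, j], col[i + 1, j], col[i, j + 1]} == {0, 1, 2}:
                count += 1
            # downward-pointing small triangle
            if i + j <= N - 2 and \
               {col[i + 1, j], col[i, j + 1], col[i + 1, j + 1]} == {0, 1, 2}:
                count += 1
    return count

for seed in range(50):
    assert sperner_parity_check(6, seed) % 2 == 1
```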
5:
Let’s color points according to where the function sends them. We know the vertices go to R G B as expected. We can subdivide the triangle, giving us something that will satisfy the conditions of Sperner’s lemma no matter how finely we subdivide. For each subdivision, there’ll be a rainbow triangle, giving us a sequence of rainbow triangles.
If we then take the sequence of the red vertex of this sequence of rainbow triangles, there’ll be a convergent subsequence of them. We can then take the associated blue and green vertices, to get three convergent subsequences, and the limiting values of the function must be equal. This value, however, must be a limit point of the green, blue, and red subsets of the disk. The only such point is the center.
6:
Suppose every point of the triangle moves. Then it must be moving ‘closer to’ some side. More precisely: Take a point x. Look at f(x). Draw the ray from x to f(x), and at some point it’ll hit the boundary of the triangle. It will either hit a side, or hit a vertex (aka hit two sides). For the latter case, we can just fix at the outset some choice of colors for the vertices that agrees with the sides. For example, suppose we color the sides R G B, and the RG vertex R, the GB vertex G, and the BR vertex B. Think of this coloring as being on the codomain/second triangle, and then giving rise to a coloring on the inputs. The vertices must move towards the opposing face.
Now, no point on an edge in the domain can get the color that edge has in the codomain coloring, as that would mean you moved off the triangle.
This lets us subdivide the triangle successively. We can then get a sequence of red, green, and blue points whose outputs converge to the same value, which then means we must not move at all.
If you aren’t convinced of that last fact: you can express “move closer to” via linear functions (just negative dot product of (f(x) - x) onto the vector from the center to a vertex), and thereby get a generalization of the fact that f(x_i) > 0 for all i implies that f(limit of x_i) >= 0. Alternatively you can directly use problem 5 like in my second proof of problem 3.
7: Sidenote: I think the reason why the projection is well defined is:
You have a convex optimization problem that has a unique solution. More intuitively, you can just do classical geometry:
Since S is closed, you know that there is at least one closest point. (Otherwise, take the infimum of distances and a sequence of points whose distances approach it; this sequence is bounded, so it has a convergent subsequence, whose limit attains the infimum since distance is continuous—and since S is closed, that limit is actually in the set.) (Alternatively just use the extreme value theorem.)
If two points A, B in S are both closest to P then they lie on a sphere around P. Then the interior of the spherical sector APB must not be in S because all of it is closer to P. But the chord AB must be contained in S by convexity, which is a contradiction.
The closest point is continuous: Suppose we have some P. Let’s say that X is the closest point in S. The set S must lie to one side of the perpendicular hyperplane to PX (based at X), because otherwise there would be a point that’s closer to P (the segment from a point on the wrong side to X would intersect the inside of the sphere of radius PX centered at P, and that segment lies in S by convexity).
Suppose we have some other point Q, with closest point Y. If Q is on the line PX, then Y=X. So we can assume Q isn’t on PX. Now, hyperplane cuts need to ‘agree’. Let’s say that the perpendicular hyperplane that PX gives us is H. There is a closest point Q’ on H to Q, given by projecting.
If the closest point in S to Q is ‘further away’ from X than Q’ is, then the hyperplane cut corresponding to QY will exclude X. Precisely: Y must lie between the two hyperplanes perpendicular to XQ’ that contain X and that contain Q’ respectively. That is, it lies in a ‘channel’. XQ’ is at most as long as PQ, so it’s a channel at most as wide as PQ is (call it delta). We then have a plane geometry problem on the plane XYQ’. Our ‘agreement’ condition may as well use dot products of projected vectors on this plane (that is, Q’Y and Q’X instead of QY and QX). For the cuts to agree, the perpendicular to YQ’ at Y must intersect XQ’ between the two points, say at Z. Then Q’YZ is a right triangle with hypotenuse Q’Z, so YZ has length less than Q’Z. Finally XY has length at most XZ + YZ ≤ XZ + ZQ’ = XQ’ ≤ PQ ≤ delta.
Thus for any epsilon, we can choose some delta < epsilon to ensure that XY is within epsilon whenever PQ is within delta.
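One way to see the continuity concretely: projection onto a convex set is in fact 1-Lipschitz. A numerical sketch for the unit disk (the closed-form projection here is specific to the disk, and the names are mine):

```python
import math, random

def proj_disk(p):
    # Closest point in the closed unit disk: scale down if outside.
    n = math.hypot(p[0], p[1])
    return p if n <= 1 else (p[0] / n, p[1] / n)

rng = random.Random(0)
for _ in range(1000):
    p = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    q = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    # Nonexpansiveness: moving the input by d moves the projection by at most d.
    assert math.dist(proj_disk(p), proj_disk(q)) <= math.dist(p, q) + 1e-12
```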
Okay, now for the actual problem.
The intuition: Let’s imagine picking a center point of our set S, and then classifying the directions like we would on the triangle.
Here’s what we actually do, which is close enough: find a triangle that is a superset of the set S (we can do this because it’s compact and so bounded). Then we can compose with projection to get a continuous function from the triangle to (the subset S of) itself. Since we know this has a fixed point, yet moves every point that is outside S (but still in the triangle), we know it has a fixed point in S.
8:
Reflecting: Imagine someone gave you the colors of a path like in 1. To find a bichromatic edge, you might have to look at all (but one) edge first! Likewise, with the intermediate value theorem/1D Brouwer, you’d have to check ‘infinitely’ many points to pin down the intermediate value/fixed point (or to get epsilon = 1/N accuracy, you’d have to check about N edges). 2D Sperner requires you to check all but one tiny triangle as well—as we can for example force it to be any of the triangles. Etc.
For functions: let’s look at it in the e(x) = f(x) − x formulation, where fixed points are roots and the problem forces −x ≤ e(x) ≤ 1 − x. The idea is to give a choice between two roots, force a commitment to one of them by destroying the other, recreate it, and then destroy the committed one.
Let’s force a commitment by ‘stomping’. We start as a ‘hump’ that’s 0 at both x=0,1. Then, we lift our left leg up. We can do this easily by linear interpolation. Then, we turn our hump into a valley (what a poor camel). For an instant, half the points will be fixed points—but because this will only be true for an instant, our adversary won’t be able to take advantage of this to jump to the other root. Then, we pull our right ‘arm’ down, thus leaving our adversary stranded.
That is, for 0 ≤ t ≤ 1⁄3, let e_t(0) = 3⁄2 t (goes from 0 to 1⁄2), e_t(1/2) = 1⁄2, and e_t(1) = 0.
In between, we’ll do linear interpolation. The left leg stops being on the ground at t=0. So any continuous choice of root has to commit to choosing x=1 for this period.
Now, we’ll turn the hump into a valley: for 1⁄3 ≤ t ≤ 2/3, let e_t(0) = 1⁄2, e_t(1/2) = 3 (1/2 − t) (goes from 1⁄2 to −1/2), and e_t(1) = 0,
and we’ll interpolate in x the same way as before. The continuous choice of root has to keep its commitment—it can’t use the instant where [1/2,1] are all roots to jump to the left, as it is only an instant.
Now, we’ll pull our right arm down: for 2⁄3 ≤ t ≤ 1, let e_t(0) = 1⁄2, e_t(1/2) = −1/2, and e_t(1) = 1 − 3⁄2 t (goes from 0 to −1/2),
and we’ll again interpolate in x. The right arm stops being at the ‘ground’ at t=2/3, so now the choice of root is screwed.
To make our fixed point function: observe that the function is always bounded between u(x) = 1 − x (which is always at least 1⁄2 for x in [0, 1⁄2]) and l(x) = −x (which is always at most −1/2 for x in [1/2, 1]). So f_t(x) = e_t(x) + x is a legitimate function from the interval [0,1] to itself.
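The three stages glue into one homotopy e_t, piecewise linear in x between the three listed values. A sketch (my encoding of the construction above) that checks both the bounds and the final stranding:

```python
def e(t, x):
    # Values at x = 0, 1/2, 1 for each of the three stages.
    if t <= 1 / 3:
        v0, vh, v1 = 1.5 * t, 0.5, 0.0        # lift the left leg
    elif t <= 2 / 3:
        v0, vh, v1 = 0.5, 3 * (0.5 - t), 0.0  # turn the hump into a valley
    else:
        v0, vh, v1 = 0.5, -0.5, 1 - 1.5 * t   # pull the right arm down
    # Linear interpolation in x on each half.
    if x <= 0.5:
        return v0 + (vh - v0) * (x / 0.5)
    return vh + (v1 - vh) * ((x - 0.5) / 0.5)

for i in range(101):
    t = i / 100
    for j in range(101):
        x = j / 100
        # -x <= e_t(x) <= 1 - x, so f_t(x) = e_t(x) + x stays in [0, 1].
        assert -x - 1e-9 <= e(t, x) <= 1 - x + 1e-9

assert e(0, 0) == 0 and e(0, 1) == 0  # both roots exist at the start
assert e(1, 0) > 0 and e(1, 1) < 0    # the committed root x = 1 is gone
```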
9:
Suppose each face of an nD simplex that’s subdivided like in the 2D case is known to only use all but 1 color, going through the sequence—e.g. for n=3 there’s an ABC face, an ABD face, an ACD face, and a BCD face. We should have n+1 colors, as the simplex has n+1 vertices.
Now, what will be the analog of paths, what group do we use, and what’s our version of a loop? Well, I claim we should be using faces. We can put faces together to make a closed volume. Another way to look at the sum over a loop of edges is to notice that we double count every node when we add the edges together. Likewise we’ll double count each (n-2)D face (‘edge’ in the n=3 case) of the volume when we add the (n-1)D faces together.
We want this to be 0, so the group elements that we assign to faces should satisfy g + g = 0. For the tiniest face, we have n nodes in an (n−1)D face, and so we can just label by the missing color/label by the sum of the nodes. The group is (Z/2Z)^n.
Now, if we take some tiny faces, and put them together in a way that closes a volume, then we will indeed have double counted every node and thus get 0 in our sum.
Onto the bulk lemma. Let’s show that if we change a node ‘in the bulk’, the parity of the rainbow simplices is unchanged. We’re looking at a honeycomb-like arrangement around a center, the analog of the hexagon. WLOG let’s suppose we change the center color from R to G. The number of rainbow simplices at the start is the number of R faces surrounding the center; and at the end it’s the number of G faces. Therefore, we’ll want to show the parity difference between the number of R faces and G faces is 0.
Imagine an enclosed volume. We can think of our group this way: assign the ith of the first n colors to the one-hot nD basis vector with a 1 in the ith coordinate, while the (n+1)th color goes to the all-ones vector (and addition is mod 2). Pick the basis so that R and G are the first two colors. The parity difference between the number of R faces and the number of G faces is the sum of the first two coordinates of the total (group) sum—that sum won’t get mucked up by adding the all-ones vector, because that adds 1 to both coordinates at once; and any other color besides R or G just doesn’t change those two coordinates. The group sum must be the 0 vector; so the sum of the first two coordinates must be 0.
No matter what two colors we want, we can choose a basis where those are the first two. Therefore the parity difference between the number of any two colors in a face sum is unchanged when we add a closed volume.
Thus, the parity of rainbow nD simplices is unchanged when we change the color of a node in the bulk.
Now, we can proceed like in problem 4, by coloring the overall faces of the inner simplex like that of the outer one, and use induction to say that there are an odd number of rainbow simplices in there. Due to how the inner bottom face is missing the same color that the outer bottom is, and etc., the only possible simplices to worry about are the (n+1) that correspond to each corner vertex, or the paired ones that share a tiny face with those. But since the corner vertices of the inner simplex are the same as the outer, we’ll only pick up a rainbow simplex if we pick up a paired one. Thus we get an even number of extra ones, and so the total parity is still odd.
10:
First, the IVT analogue (2 and 5):
We can color the n-ball like we did the 2D disk. We can do this by taking a coloring of the (n−1)-ball, ‘inflating’ it, and then adding a new color below it, taking up volume only outside the ball (without taking up an internal face, like how blue doesn’t in the image for 5). In the image for 5, you can imagine a green point that inflates to a green line and is then connected to a red line (where the intersection point is green), which inflates to the two sectors of the image for 5 and is then connected to the blue sector. You can repeat this process.
Then given a continuous function from a closed simplex to that ball that sends each ith face to non-i-colored points, we can subdivide the simplex, take sequences of rainbow simplices and thus points—yadayada, it’s the same argument as before.
Then, the version of 3 and 6:
Suppose we have a continuous function from the closed nD simplex to itself. If every point ‘moves’, then it must be moving closer to some ‘face’, which we can determine by casting a ray from x to f(x) and seeing which side it hits, and then coloring appropriately. The points on each face cannot move outside the simplex, and so must move towards some face other than itself—and so that face’s colors in the domain must not include the color we gave to it in the codomain (aka the face’s colors in the domain must not be the “missing color” group element we gave to it). Vertices must move towards the opposing face, which also works.
We then have a simplex satisfying the conditions of Sperner’s lemma, so we can get the sequences we want.
Then, the version of 7:
Just take a bounding simplex of the compact subset; then projection is a retraction map (a continuous map onto a subspace that fixes the subspace) because the subset is convex, and so if we compose with our continuous function f then we’ll get a function from the simplex to itself, which must have a fixed point, which must then be in the subset.
11:
First, just thinking about continuous Hausdorff limits. Let’s say we had f(x) = [-1,1] for x in [0,1], that is, a graph that’s a rectangle. To get a limit, we could try sine waves of increasing frequency. Since they are in that rectangle, max_a d(a,B) is 0; while max_b d(b,A) depends on how well the sine wave fills the box. If we fill the box well enough, that max will be small (cuz it’s 2D distance—in general, product distance)—so we just need to fill the box.
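A quick numerical check of this picture (sampled sets, so the distances are only approximate, and the grid sizes are my choices): higher-frequency sine graphs get Hausdorff-closer to the filled rectangle.

```python
import math

def directed(P, Q):
    # sup over p of the distance from p to the (sampled) set Q.
    return max(min(math.dist(p, q) for q in Q) for p in P)

def hausdorff(P, Q):
    return max(directed(P, Q), directed(Q, P))

def sine_graph(k, m=400):
    # Sampled graph of sin(2 pi k x) on [0, 1].
    return [(i / m, math.sin(2 * math.pi * k * i / m)) for i in range(m + 1)]

# Sample the rectangle [0,1] x [-1,1] on a grid.
box = [(i / 20, j / 10 - 1) for i in range(21) for j in range(21)]

h2 = hausdorff(sine_graph(2), box)
h8 = hausdorff(sine_graph(8), box)
assert h8 < h2  # faster oscillation fills the box better
```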
I believe the statement also implicitly assumes that the graph of f is a (nonempty) compact set—as otherwise you couldn’t use the Hausdorff distance when stating that it’s a continuous Hausdorff limit.
A continuous Hausdorff limit f: T → 2^T where T is a compact convex subset will be given by a sequence of continuous functions f_n: T → T. Each function in this sequence has a fixed point by the Brouwer fixed point theorem—and the question is whether f does too.
Let B be the graph of f, and A_n the graph of f_n. Note that B is compact (by assumption), and A_n is too (by the extreme value theorem and compactness of T).
We know sup_{x in A_n} d((x, f_n(x)), B) converges to 0 as n goes to infinity. If we take the sequence (x_n, f_n(x_n)) where x_n is a fixed point of f_n, then d((x_n, x_n), B) = d((x_n, f_n(x_n)), B) ≤ the sup → 0. Since B is compact, we know that the sequence is bounded. We can then take a convergent subsequence, giving us some limit I’ll call (x’, x’). This limit must be of distance 0 from B, but then it is in B since B is closed.
But if (x’,x’) is in B, then x’ is in f(x’), and we are done.
12: TODO. Current progress.
First let’s note that S \times T is compact, and so if the graph of f is closed then the graph is also compact. Note also that for any x, f(x) is closed, and thus also compact.
Let’s do the given example. In my head, I’m thinking of approximating an infinitely steep ramp by making increasingly faster ramps. If instead we tried to make x ≤ 1⁄2 go to {0} while x > 1⁄2 goes to {1}, we’d fail to be closed. If instead x = 1⁄2 goes to {0,1}, then we’d fail convexity. If we try f(x) = {1} if x is rational else {0}, then we fail to be closed.
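For the steep-ramp example, the approximating sequence and its Hausdorff convergence are easy to check numerically. A sketch (sampled graphs, my discretization):

```python
import math

def ramp(n, x):
    # Continuous approximation: ramp from 0 to 1 over [1/2 - 1/n, 1/2 + 1/n].
    lo, hi = 0.5 - 1 / n, 0.5 + 1 / n
    if x <= lo: return 0.0
    if x >= hi: return 1.0
    return (x - lo) / (hi - lo)

def directed(P, Q):
    return max(min(math.dist(p, q) for q in Q) for p in P)

def hausdorff(P, Q):
    return max(directed(P, Q), directed(Q, P))

# Sampled graph of the set-valued limit: {0} left of 1/2, {1} right of 1/2,
# and the whole segment {1/2} x [0, 1].
B = ([(i / 200, 0.0) for i in range(100)]
     + [(0.5, j / 100) for j in range(101)]
     + [(i / 200, 1.0) for i in range(101, 201)])

def graph(n, m=400):
    return [(i / m, ramp(n, i / m)) for i in range(m + 1)]

h4, h32 = hausdorff(graph(4), B), hausdorff(graph(32), B)
assert h32 < h4  # the ramps Hausdorff-converge to the set-valued graph
```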
Suppose f only assigned one element to each input. We’d like to show that the function assigning each x to the only element of f(x) is continuous.
Consider the projection of the graph to the y-axis. This is a continuous function, so the preimage of a closed set of y-values is a closed subset of the graph—hence compact, since the graph is compact. Its projection to the x-axis is then the continuous image of a compact set, so it’s closed. But because f(x) has only one element, that x-axis projection is exactly the preimage of our closed set of y-values under the function assigning each x to the only element of f(x). Thus that function is continuous, as the preimage of a closed set under it is closed.
More generally: Any closed subset of the graph that passes the vertical line test corresponds to a continuous function. Since each vertical slice of the graph is convex, we know that we must have an entire line segment anywhere the vertical line test fails.
Suppose now that there’s exactly one point c where f(c) has more than one element. The left and right limits of f(x) as x goes to this point must be contained in f(c). If we take the maximum and the minimum height points of f(c), then f(c) is just a line segment. We can make our limit with a “heartbeat” pulse that follows the curve of f(x) to the left of f(c), then near it dips down to the minimum, goes back up to the maximum, and then rejoins the curve. That is, for each N, we’ll follow f until f(c − 1/2N), then dip down to min_height f(c) at c − 1/4N, then go up to max_height f(c) at c + 1/4N, then fall back down to f at f(c + 1/2N). We’ll do our movements via linear interpolation. We can always choose big enough N such that f(c − 1/2N) is always ‘almost’ (at most epsilon from not being) above min f(c), because we know the limiting value must be. Likewise, the right side will always be ‘almost’ above the max for large enough N. We can do this because the limiting values are contained in f(c), and we can pick N big enough so that the function is within epsilon of the limiting values. Thus the ‘directional’ words used are accurate.
We now need to check that this converges in Hausdorff distance. We’ll need to check that both of the sups converge. First let’s do sup_b d(b,A_n). We don’t have to worry about the b’s that are outside of our little pulse interval, because those are distance 0 away from A_n. For values (x,y) that are to the left of c, we know that (c − 1/2N, f(c − 1/2N)) is in A_n. If we choose N big enough, we can ensure that f(c − 1/2N) is within epsilon of our value f(x). Since we also know that the x values are close enough, we have a good bound. Likewise this works for the values that are to the right of c. For the values at c, we know that each is at most 1/4N away from some point on our ramp at the same height. Thus we have sufficient bounds on each d(b,A_n).
For sup_{a in A_n} d(a,B): if a is outside our pulse interval, then we are fine. For big enough N, we know that f(c) has y values at most epsilon away from every y value between the smallest and largest value of f in our pulse region. We thus know that every point a on our pulse graph is at most 1/2N away from an equal height point at c. This gives us our other bound.
Therefore, when there’s exactly one ‘thick’ point, we can still find a limit.
Clearly, this generalizes to finitely many thick points, by just stitching functions together. Now we need to be able to deal with countably infinitely many thick points (think of putting an interval at every point 1/2^n, including 0) and ‘thick widths’ like a square. Let’s try the thick widths.
We can try ‘bouncing’ a line up and down through a region, progressively filling it. To do this, we might try to take the Upper curve U(x) = max_height f(x) and the Lower curve L(x) = min_height f(x). These always exist, by compactness of f(x). Note that these curves might not be continuous. For example, if there’s a straight-line rise, then U has a jump discontinuity, and at the discontinuity it returns the higher point.
Now, for every N, let’s subdivide the interval into N parts, and then linearly bounce up and down between the values of U and L. Note that we can actually cross U and L sometimes, while en route to a U or L value.
There’s however a problem when we try our limits. e.g. for the sup over b, what if our b value has a taller interval than the U and L values near it?
So instead, let’s bounce our line like our heartbeat from earlier: in an interval [a,b], we can try bouncing a line from L(a) to the tallest point in the whole interval, and then down to the lowest point, and then back to U(b). If instead the tallest comes after the lowest, we can bounce …
I’ll use the term “rainbox edge”/”rainbow triangle” instead of bi/tri-chromatic.
Some details are glossed over, but only when I am am confident I could fill them in. Sorry if you are reading this and want more detail.
Also, thanks again for these posts. They are really quite helpful. I mean, I’m kinda skeptical about usefulness towards alignment, but regardless I’m very glad you wrote them. I had heard legends of proofs of the Brouwer fixed point theorem from Sperner’s lemma, but didn’t realize how nice it would be—I prefer this to the usual topological proof.
1: I proved this theorem in like 3 related ways until one of the approaches managed to give traction on the generalizations
Suppose we change a node ‘in the bulk’, that is, a node surrounded by two nodes. Then if we change the color of this node, we’ll either change the number of bichromatic edges by 0 (in the case where the nodes on either side have different colors) or 2 (where they have the same color). Thus we won’t affect the parity, so we can willy nilly change the bulk.
Let’s change the bulk to start with a blue on the leftmost, and green on the rightmost. Then we can by induction apply the lemma to this smaller segment. There are no extra bichromatic edges, so we are done.
2:
Imagine subdividing the interval into 1/N size bits, with the endpoints colored green if the function is positive there, and blue if it’s negative. If any are 0 we are already done, so let’s suppose none are zero.
For each N, pick a rainbow edge, then take the sequence of green endpoints from picked edges. This must have a convergent subsequence by Bolzano-Wierestrass. Now take the corresponding blue endpoints from this sequence (connected by the chosen edges). We then also have a sequence of blue endpoints, which must also converge (to the same value) because the edge lengths go to 0. The value of the function there is equal to the limit of function values, because it’s continuous—so therefore it must be both at least 0 and at most 0, and thus is equal to 0
3: Rewrite that’s more in the ‘spirit’ of problem 5 and 6, but not the first I came up with.
Color each point by whether the function sends it somewhere more “right” or “left” (that is, greater or lesser) than it. First suppose that there’s at least one point of each color. Then we can use the argument in the previous to get two converging sequences, one of “right” elements and one of “left” elements, that converge to the same value. When fed to the function, continuity means that we must be no bigger and no smaller than that value, and so must be equal.
If every point has the same color, then we send either 1 or 0 to something above/below itself, which is impossible because the function maps everything to [0,1].
It fails for (0,1) because you never need to plug in 0 or 1. For example, f(x) = 1⁄2 (1 - x) + 1⁄2 x is a counterexample—the fixed point is 1, which is outside (0,1). The function takes a point and then moves it halfway to 1. So everything is colored “right’, which is why we fail.
You could more directly use the previous problem:
Apply IVT to e(x) = f(x) - x
but I prefer the other way.
4: Huh, this got progressively less messy as I tweaked my proof, and in the process bugs were fixed. And then made even simpler for problem 9. Simplicity is king, remember that.
What group should we use for the 2D case?
Each color will be a group element.
Color an edge by the ‘missing’ color. This is the same as comparing our edge to the sum of all three colors. That is, (R + G + B) - (difference of endpoints). This also tells us what 2 of the same color should be—it should be R + G + B. Having the same color should clearly give us 0, so we want R + G + B = 0. Then, the color of an edge is just the sum of endpoint colors (at least, we’ll see that in a second when we show that -X = X).
Now, if we add the edges of a path, we should get the thing corresponding to the start and end endpoints. So we want e.g. G-B-G to be the same as G-G. That is, R + R = 0. Likewise, G + G = 0, B + B = 0, and G B R gives us that R + G = B. This is what we in the biz call the Klein four group, but you don’t have to worry about that if you don’t want to.
Any loop sums to 0. You can also think of this as counting each node twice. thus cancelling them.
Suppose we have a node in the “bulk” of the triangle, that is, one that’s fully surrounded (like a hexagon). We’ll show that we can change the color of the node without affecting the parity of the number of rainbow triangles. I’ll call this the “bulk lemma”.
WLOG imagine the center starts red and becomes blue, so our hexagon looks like (the Xs are arbitrary color values, not necessarily equal to other Xs)
X X
X R X
X X
and becomes
X X
X B X
X X
Now, how many rainbow triangles are there at the beginning? Well, it’s actually just the number of B-G (aka red) edges in the loop around the center—that is, the number of bordering edges that are missing a red. Likewise, at the end it’s the number of R-G (aka blue) edges in the loop. But neither of these numbers depends on the center! So as long as the number of red and blue edges in the loop has the same parity, we can freely change the center node.
Now, to prove the bulk lemma, we’ll show that parity differences must be 0 for any loop. This is kinda like saying that not only do we know that the sum/center of mass of the vectors are 0, but we also know something about the angles (don’t take it too literally though).
Red is like the vector (1,0), Green is like (0,1), and Blue is like (1,1) (addition is mod 2). When we add in a loop, the total sum is (0,0). Note that Blues don’t screw up parity differences between Red and Green, cuz they add 1 to both.
So the sum of the two coordinates (which mod 2 is the same as their difference) is equal to the parity difference of reds with greens, because the blues don’t affect it. Since the total vector is (0,0), that sum is 0. So we’re good.
We could’ve just as easily swapped our basis to use a different color as the all-ones vector, and so we get that all parity differences are 0.
Since for hexagons the parity of the number of rainbow triangles changes according to the parity difference of a loop, and since those differences are always 0, we can freely change the center node.
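The hexagon case is small enough to verify exhaustively. A brute-force sketch of the bulk lemma, with the hexagon triangulated into six center-sharing triangles as in the diagrams above:

```python
from itertools import product

def rainbow_count(ring, center):
    """Count rainbow triangles among the six (ring[i], ring[i+1], center)."""
    n = len(ring)
    return sum(len({ring[i], ring[(i + 1) % n], center}) == 3 for i in range(n))

# Over all 3^6 colorings of the surrounding loop, the parity of the rainbow
# count is the same no matter how the center is colored.
for ring in product("RGB", repeat=6):
    assert len({rainbow_count(ring, c) % 2 for c in "RGB"}) == 1
```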
Note that it’s quite important that we have a loop that doesn’t go through the center—as you can see by looking at the following half-hexagon pattern from about the middle of the left side of the image for the problem:
R
B B
B G
Changing the center B to a G increases the number of rainbow triangles by 1.
Alright, now for the theorem. We can change the “bulk” nodes in such a way that we can do an induction argument. That is, we can change them so that the left side of the inner triangle has no green, the right side of the inner triangle has no blue, and the bottom side of the inner triangle has no red. Now there are an odd number of rainbow triangles inside the inner triangle. Because of our coloring of the sides, the only possible extra rainbow triangles are the “outermost” ones near an inner corner, e.g.
R
X X
O R O
Y B G Z
B Y O Z G
Where the X, Y, Z’s mark the two extra nodes of each of the 6 triangles being referenced here, while the O’s mark extraneous nodes that don’t matter.
Now, due to our coloring choice, we see that to get an extra rainbow triangle, two of the X, Y, Z nodes with the same letter would have to have different colors. But this gives us another rainbow triangle sharing that same edge. This is because we picked the corners of our inner triangle to match those of the overall big one.
Therefore, we only have an even number of extra rainbow triangles. So we have an odd number total.
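Here's a brute-force check of the theorem itself on random admissible colorings (the triangular-lattice indexing and the barycentric encoding of the boundary condition are my own setup):

```python
import random

def sperner_rainbow_parity(N, seed):
    """Color a side-N triangulated triangle admissibly at random; return the
    parity of the number of rainbow triangles (Sperner says: always odd)."""
    rng = random.Random(seed)
    color = {}
    for i in range(N + 1):
        for j in range(N + 1 - i):
            k = N - i - j  # barycentric weights (i, j, k) of vertex (i, j)
            # Sperner condition: only colors whose corner weight is nonzero.
            color[(i, j)] = rng.choice([c for c, w in zip("RGB", (i, j, k)) if w > 0])
    count = 0
    for i in range(N):
        for j in range(N - i):
            count += len({color[(i, j)], color[(i + 1, j)], color[(i, j + 1)]}) == 3
            if i + j <= N - 2:  # downward-pointing triangles
                count += len({color[(i + 1, j)], color[(i, j + 1)],
                              color[(i + 1, j + 1)]}) == 3
    return count % 2

assert all(sperner_rainbow_parity(6, s) == 1 for s in range(50))
```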
5:
Let’s color points according to where the function sends them. We know the vertices go to R G B as expected. We can subdivide the triangle, giving us something that will satisfy the conditions of Sperner’s lemma no matter how finely we subdivide. For each subdivision, there’ll be a rainbow triangle, giving us a sequence of rainbow triangles.
If we then take the sequence of red vertices of these rainbow triangles, there’ll be a convergent subsequence. We can then take the associated blue and green vertices to get three convergent subsequences (with the same limit, since the triangles shrink), and the limiting values of the function must be equal. This value, however, must be a limit point of the green, blue, and red subsets of the disk. The only such point is the center.
6:
Suppose every point of the triangle moves. Then it must be moving ‘closer to’ some side. More precisely: Take a point x. Look at f(x). Draw the ray from x to f(x), and at some point it’ll hit the boundary of the triangle. It will either hit a side, or hit a vertex (aka hit two sides). For the latter case, we can just fix at the outset some choice of colors for the vertices that agrees with the sides. For example, suppose we color the sides R G B, and the RG vertex R, the GB vertex G, and the BR vertex B. Think of this coloring as being on the codomain/second triangle, and then giving rise to a coloring on the inputs. The vertices must move towards the opposing face.
Now, no point on an edge in the domain can be colored the same way the point in the codomain is, as that would mean you moved off the triangle.
This lets us subdivide the triangle successively. We can then get a sequence of red, green, and blue points whose outputs converge to the same value, which then means we must not move at all.
If you aren’t convinced of that last fact: you can express “move closer to” via linear functions (just negative dot product of (f(x) - x) onto the vector from the center to a vertex), and thereby get a generalization of the fact that f(x_i) > 0 for all i implies that f(limit of x_i) >= 0. Alternatively you can directly use problem 5 like in my second proof of problem 3.
7: Sidenote: I think the reason why the projection is well defined is:
You have a convex optimization problem that has a unique solution. More intuitively, you can just do classical geometry:
Since S is closed, you know that there is at least one closest point. (Otherwise, take the infimum of distances and a sequence of points whose distances approach it; a subsequence converges since the sequence is bounded, the limit achieves the infimum because distance is continuous, and since S is closed the limit is actually in the set.) (Alternatively, just use the extreme value theorem.)
If two points A, B in S are both closest to P, then they lie on a sphere around P. The open chord between A and B lies strictly inside that sphere, so all of it is closer to P. But the chord AB must be contained in S by convexity, which is a contradiction.
The closest-point map is continuous: Suppose we have some P, and say X is the closest point in S. The set S must lie to one side of the hyperplane through X perpendicular to PX, because otherwise there would be a point closer to P (the segment from a point on the wrong side to X would intersect the inside of the sphere of radius PX centered at P, and all points on that segment are in S by convexity).
Suppose we have some other point Q, with closest point Y. If Q is on the line PX, then Y=X. So we can assume Q isn’t on PX. Now, hyperplane cuts need to ‘agree’. Let’s say that the perpendicular hyperplane that PX gives us is H. There is a closest point Q’ on H to Q, given by projecting.
If the closest point in S to Q is ‘further away’ from X than Q’ is then the hyperplane cut corresponding to QY will exclude X.
Precisely: Y must lie between the two hyperplanes perpendicular to XQ’ that contain X and that contain Q’ respectively. That is, it lies in a ‘channel’. XQ’ is at most as long as PQ, so it’s a channel at most as wide as PQ is (call it delta). We then have a plane geometry problem on the plane XYQ’. Our ‘agreement’ condition may as well use dot products of projected vectors on this plane (that is, Q’Y and Q’X instead of QY and QX). For the cuts to agree, the perpendicular to YQ’ at Y must intersect XQ’ between the two points, say at Z. Then Q’YZ is a right triangle with hypotenuse Q’Z, so YZ has length less than Q’Z. Finally, XY ≤ XZ + ZY ≤ XZ + ZQ’ = XQ’ ≤ PQ ≤ delta.
Thus for any epsilon, we can choose some delta < epsilon to ensure that XY is within epsilon whenever PQ is within delta.
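Numerically, the projection is even nicer than continuous: it is 1-Lipschitz (nonexpansive). A small sketch with S the closed unit disk, where the projection has a closed form (the disk and the random sampling are my own illustrative choices):

```python
import math, random

def proj_disk(p):
    """Closest point of the closed unit disk to p."""
    r = math.hypot(p[0], p[1])
    return p if r <= 1 else (p[0] / r, p[1] / r)

dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])

rng = random.Random(0)
for _ in range(1000):
    p = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    q = (rng.uniform(-3, 3), rng.uniform(-3, 3))
    # Projecting never moves two points further apart than they started.
    assert dist(proj_disk(p), proj_disk(q)) <= dist(p, q) + 1e-12
```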
Okay, now for the actual problem.
The intuition: Let’s imagine picking a center point of our set S, and then classifying the directions like we would on the triangle.
Here’s what we actually do, which is close enough: find a triangle that is a superset of the set S (we can do this because it’s compact and so bounded). Then we can compose with projection to get a continuous function from the triangle to (the subset S of) itself. Since we know this has a fixed point, yet moves every point that is outside S (but still in the triangle), we know it has a fixed point in S.
8:
Reflecting: Imagine someone gave you the colors of a path like in 1. A rainbow edge can sit anywhere (an adversary can force it to be any particular edge), so no fixed small set of queries is guaranteed to find it, though with the endpoint colors pinned down, binary search locates one in about log N queries. Likewise, with the intermediate value theorem/1D Brouwer, you’d have to check ‘infinitely’ many points to pin down the intermediate value/fixed point exactly (bisection gets epsilon = 1/N accuracy in about log N evaluations, but never the exact point). 2D Sperner is similar, in that we can force the rainbow triangle to be any of the tiny triangles. Etc.
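In miniature: when the endpoint colors differ as in problem 1, binary search locates a rainbow edge by keeping two differently-colored nodes and halving the gap between them (hypothetical colorings):

```python
import random

def find_rainbow_edge(colors):
    """Given a path whose endpoint colors differ, binary-search a rainbow edge."""
    lo, hi = 0, len(colors) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Recurse into a half whose endpoints still have different colors.
        if colors[mid] != colors[lo]:
            hi = mid
        else:
            lo = mid
    return lo, hi  # adjacent nodes with different colors

rng = random.Random(1)
for _ in range(200):
    path = ["B"] + [rng.choice("RGB") for _ in range(rng.randint(1, 50))] + ["G"]
    i, j = find_rainbow_edge(path)
    assert j == i + 1 and path[i] != path[j]
```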
For functions: let’s look at it in the e(x) = f(x) - x formulation, where fixed points are roots and the problem forces -x ≤ e(x) ≤ 1 - x. The idea is to give a choice between two roots, force a commitment to one of them by destroying the other, recreate it, and then destroy the committed one.
Let’s force a commitment by ‘stomping’. We start as a ‘hump’ that’s 0 at both x=0,1. Then, we lift our left leg up. We can do this easily by linear interpolation. Then, we turn our hump into a valley (what a poor camel). For an instant, half the points will be fixed points—but because this will only be true for an instant, our adversary won’t be able to take advantage of this to jump to the other root. Then, we pull our right ‘arm’ down, thus leaving our adversary stranded.
That is, for 0 ≤ t ≤ 1⁄3, let
e_t(0) = 3⁄2 t -- goes from 0 to 1⁄2
e_t(1/2) = 1⁄2
e_t(1) = 0.
In between, we’ll do linear interpolation.
The left leg stops being on the ground at t=0. So any continuous choice of root has to commit to choosing x=1 for this period.
Now, we’ll turn the hump into a valley:
For 1⁄3 ≤ t ≤ 2/3:
e_t(0) = 1⁄2
e_t(1/2) = 3 (1/2 - t) -- goes from 1⁄2 to −1/2
e_t(1) = 0
and we’ll interpolate in x the same way as before. The continuous choice of root has to keep its commitment—it can’t use the instant where [1/2,1] are all roots to jump to the left, as it is only an instant.
Now, we’ll pull our right arm down:
For 2⁄3 ≤ t ≤ 1:
e_t(0) = 1⁄2
e_t(1/2) = −1/2
e_t(1) = 1 − 3⁄2 t -- goes from 0 to −1/2
and we’ll again interpolate in x.
The right arm stops being at the ‘ground’ at t=2/3, so now the choice of root is screwed.
To make our fixed point function: observe that e_t is always bounded above by u(x) = 1 - x (which is always at least 1⁄2 for x in [0, 1⁄2]) and below by l(x) = −x (which is always at most −1/2 for x in [1/2, 1]). So f_t(x) = e_t(x) + x is a legitimate function from the interval [0,1] to itself.
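Since the construction is piecewise linear in both t and x, the envelope claim is easy to sanity-check numerically (the grid resolution below is an arbitrary choice):

```python
def e(t, x):
    """The homotopy above: control values at x = 0, 1/2, 1, linear in between."""
    if t <= 1/3:
        v0, vh, v1 = 1.5 * t, 0.5, 0.0
    elif t <= 2/3:
        v0, vh, v1 = 0.5, 3 * (0.5 - t), 0.0
    else:
        v0, vh, v1 = 0.5, -0.5, 1 - 1.5 * t
    if x <= 0.5:
        return v0 + (vh - v0) * (x / 0.5)
    return vh + (v1 - vh) * ((x - 0.5) / 0.5)

# Stays in the envelope -x <= e_t(x) <= 1 - x, so f_t(x) = e_t(x) + x
# maps [0,1] into [0,1] for every t.
for ti in range(101):
    for xi in range(101):
        t, x = ti / 100, xi / 100
        assert -x - 1e-9 <= e(t, x) <= 1 - x + 1e-9

assert e(0.0, 0.0) == 0.0 and e(0.0, 1.0) == 0.0  # two roots at t = 0
assert e(1.0, 1.0) < 0                            # the committed root is gone at t = 1
```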
9:
Suppose each face of an nD simplex that’s subdivided like in the 2D case is known to use only all but 1 color, going through the sequence. E.g. for n=3 there’s an ABC face, an ABD face, an ACD face, and a BCD face. We should have n+1 colors, as the simplex has n+1 vertices.
Now, what will be the analog of paths, what group do we use, and what’s our version of a loop? Well, I claim we should be using faces. We can put faces together to make a closed volume. Another way to look at the sum over a loop of edges is to notice that we double count every node when we add the edges together. Likewise we’ll double count each (n-2)D face (‘edge’ in the n=3 case) of the volume when we add the (n-1)D faces together.
We want this to be 0, so the group elements that we assign to faces should satisfy g + g = 0. For the tiniest faces, we have n nodes in an (n-1)D face, and so we can just label by the missing color/label by the sum of the nodes. The group is (Z/2Z)^n.
Now, if we take some tiny faces, and put them together in a way that closes a volume, then we will indeed have double counted every node and thus get 0 in our sum.
Onto the bulk lemma. Let’s show that if we change a node ‘in the bulk’, the parity of the rainbow simplices is unchanged. We’re looking at a ‘honeycomb’-like thing around a center, the nD analogue of the hexagon. WLOG let’s suppose we change the center color from R to G. The number of rainbow simplices at the start is the number of R faces surrounding the center; at the end it’s the number of G faces. Therefore, we’ll want to show the parity difference between the number of R faces and G faces is 0.
Imagine an enclosed volume. We can think of our group this way: assign the first n colors to the one-hot nD basis vectors (the ith color gets a 1 in the ith coordinate), while the (n+1)th color gets the all-ones vector (and addition is mod 2). Pick the basis so that R and G are the first two colors. The parity difference between the number of R faces and the number of G faces is the sum of the first two coordinates of the total (group) sum—that sum won’t get mucked up by adding the all-ones vector, because that changes the parity of both coordinates by 1; and any other color besides R or G just doesn’t change those two coordinates.
The group sum must be the 0 vector; so the sum of the first two coordinates must be 0.
No matter what two colors we want, we can choose a basis where those are the first two. Therefore the parity difference between the number of any two colors in a face sum is unchanged when we add a closed volume.
Thus, the parity of rainbow nD simplices is unchanged when we change the color of a node in the bulk.
Now, we can proceed like in problem 4, by coloring the overall faces of the inner simplex like those of the outer one, and using induction to say that there are an odd number of rainbow simplices in there. Since the inner bottom face is missing the same color that the outer bottom is, and so on, the only possible simplices to worry about are the (n+1) that correspond to each corner vertex, or the paired ones that share a tiny face with those. But since the corner vertices of the inner simplex are the same as the outer, we’ll only pick up a rainbow simplex if we pick up a paired one. Thus we get an even number of extra ones, and so the total parity is still odd.
10:
First, the IVT analogue (2 and 5):
We can color the n-ball like we did the disk. We can do this by taking a coloring of the (n-1)-ball, ‘inflating’ it, and then adding an nth color below it that takes up volume only outside the ball (without taking up an internal face, like how blue doesn’t in the image for 5). In the image for 5, you can imagine a green point that inflates to a green line and is then connected to a red line (where the intersection point is green), which inflates to the two sectors of the image for 5 and is then connected to the blue sector. You can repeat this process.
Then given a continuous function from a closed simplex to that ball, that sends each ith face to non-i-colored points, we can then subdivide the simplex, take sequences of rainbow simplices and thus points, yadayada, it’s the same argument as before.
Then, the version of 3 and 6:
Suppose we have a continuous function from the closed nD simplex to itself. If every point ‘moves’, then it must be moving closer to some ‘face’, which we can determine by casting a ray from x to f(x), seeing which face it hits, and then coloring appropriately. The points on each face cannot move outside the simplex, and so must move towards any face but itself—and so that face’s colors in the domain must not include the color we gave to it in the codomain (aka the face’s colors in the domain must not be the group element of “missing color” we gave to it). Vertices must move towards the opposing face, which also works.
We then have a simplex satisfying the conditions of Sperner’s lemma, so we can get the sequences we want.
Then, the version of 7:
Just take a bounding simplex of the compact subset; then projection is a retraction map (a continuous map onto a subspace that fixes that subspace) because the subset is convex. So if we compose with our continuous function f, we get a function from the simplex to itself, which must have a fixed point, which must then be in the subset.
11:
First, just thinking about continuous Hausdorff limits. Let’s say we had f(x) = [-1,1] for x in [0,1], that is, a graph that’s a rectangle. To get a limit, we could try sine waves of increasing frequency. Since they are in that rectangle, sup_a d(a,B) is 0; while max_b d(b,A) depends on how well the sine wave fills the box. If we fill the box well enough, that max will be small (cuz it’s 2D distance—in general, product distance) - so we just need to fill the box.
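A rough numeric version of this, with coarse samples of the box and of the sine graphs, so the distances are only approximate (the frequencies and grid sizes are arbitrary choices):

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (brute force)."""
    d = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    directed = lambda X, Y: max(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

# Samples of the box [0,1] x [-1,1] and of the graph of sin(2*pi*k*x).
box = [(i / 20, j / 10 - 1) for i in range(21) for j in range(21)]

def sine_graph(k, n=400):
    return [(i / n, math.sin(2 * math.pi * k * i / n)) for i in range(n + 1)]

# Higher frequency fills the box better, so the Hausdorff distance shrinks.
assert hausdorff(sine_graph(16), box) < hausdorff(sine_graph(2), box)
```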
I believe the statement also implicitly assumes that the graph of f is a (nonempty) compact set—as otherwise you couldn’t use the Hausdorff distance when stating that it’s a continuous Hausdorff limit.
A continuous Hausdorff limit f: T → 2^T where T is a compact convex subset will be given by a sequence of continuous functions f_n: T → T. Each function in this sequence has a fixed point by the Brouwer fixed point theorem—and the question is whether f does too.
Let B be the graph of f, and A_n the graph of f_n. Note that B is compact (by assumption), and A_n is too (by the extreme value theorem and compactness of T).
We know sup_{x in A_n} d((x, f_n(x)), B) converges to 0 as n goes to infinity. If we take the sequence (x_n, f_n(x_n)) where x_n is a fixed point of f_n, then d((x_n, x_n), B) = d((x_n, f_n(x_n)), B) ≤ the sup → 0. Since B is compact, we know that the sequence is bounded. We can then take a convergent subsequence, giving us some limit I’ll call (x’, x’). This limit must be of distance 0 from B, but then it is in B since B is closed.
But if (x’,x’) is in B, then x’ is in f(x’), and we are done.
12: TODO. Current progress.
First let’s note that S \times T is compact, and so if the graph of f is closed then the graph is also compact.
Note also that for any x, f(x) is closed, and thus also compact.
Let’s do the given example. In my head, I’m thinking of approximating an infinitely steep ramp by making increasingly fast ramps. If instead we tried to make x ≤ 1⁄2 go to {0} while x > 1⁄2 goes to {1}, we’d fail to be closed. If instead x = 1⁄2 goes to {0,1}, then we’d fail convexity. If we try f(x) = {1} if x is rational else {0}, then we fail to be closed.
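The steep-ramp picture can be checked numerically too; a sketch with both the ramps and the limiting graph sampled on coarse grids (so the distances are approximate):

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets (brute force)."""
    d = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    directed = lambda X, Y: max(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

def ramp(N, x):
    """0 until 1/2 - 1/N, then linear up to 1 at 1/2 + 1/N."""
    lo, hi = 0.5 - 1 / N, 0.5 + 1 / N
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

# The closed graph of the limit: 0 left of 1/2, the segment {1/2} x [0,1], 1 right.
limit_graph = ([(i / 200, 0.0) for i in range(101)] +
               [(0.5, j / 100) for j in range(101)] +
               [(0.5 + i / 200, 1.0) for i in range(101)])
ramp_graph = lambda N: [(i / 400, ramp(N, i / 400)) for i in range(401)]

# Steeper ramps get Hausdorff-closer to the limiting graph.
assert hausdorff(ramp_graph(64), limit_graph) < hausdorff(ramp_graph(4), limit_graph)
```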
Suppose f only assigned one element to each input. We’d like to show that the function assigning each x to the only element of f(x) is continuous.
Consider the projection to the y-axis. This is a continuous function, so the preimage of a closed set of y-values is a closed subset of the graph. That closed subset is compact (a closed subset of the compact graph), so its projection to the x-axis is also compact, hence closed. Since each f(x) has only one element, that x-axis projection is exactly the preimage of the closed set of y-values under our function. Thus the function assigning each x to the only element of f(x) is continuous, as the preimage of a closed set under it is closed.
More generally: Any closed subset of the graph that passes the vertical line test corresponds to a continuous function. Since each vertical slice of the graph is convex, we know that we must have an entire line segment anywhere the vertical line test fails.
Suppose now that there’s exactly one point c where f(c) has more than one element. The left and right limits of f(x) as x goes to this point must be contained in f(c). If we take the maximum and the minimum height points of f(c), then f(c) is just a line segment. We can make our limit with a “heartbeat” pulse that follows the curve of f to the left of c, then near it dips down to the minimum, goes back up to the maximum, and then rejoins the curve. That is, for each N, we’ll follow f until f(c − 1/2N), then dip down to min_height f(c) at c − 1/4N, then go up to max_height f(c) at c + 1/4N, then rejoin f at (c + 1/2N, f(c + 1/2N)). We’ll do our movements via linear interpolation. We can always choose big enough N such that f(c − 1/2N) is ‘almost’ (at most epsilon from not being) above min f(c), because we know the limiting value must be. Likewise, the right side will always be ‘almost’ below the max for large enough N. We can do this because the limiting values are contained in f(c), and we can pick N big enough so that the function is within epsilon of the limiting values. Thus the ‘directional’ words used are accurate.
We now need to check that this converges in Hausdorff distance. We’ll need to check that both of the sups converge. First let’s do sup_b d(b,A_n). We don’t have to worry about the b’s that are outside of our little pulse interval, because those are distance 0 away from A_n. For values (x,y) that are to the left of c, we know that (c − 1/2N, f(c − 1/2N)) is in A_n. If we choose N large enough, we can ensure that f(c − 1/2N) is within epsilon of our value f(x). Since we also know that the x values are close enough, we have a good bound. Likewise this works for the values that are to the right of c. For the values at c, we know that each is at most 1/4N away from some point on our ramp at the same height. Thus we have sufficient bounds on each d(b,A_n).
For sup_{a in A_n} d(a,B): if a is outside our pulse interval, then we are fine. For big enough N, we know that f(c) has y values at most epsilon away from every y value between the smallest and largest value of f in our pulse region. We thus know that every point a on our pulse graph is at most 1/2N away from an equal height point at c. This gives us our other bound.
Therefore, when there’s exactly one ‘thick’ point, we can still find a limit.
Clearly, this generalizes to finitely many thick points, by just stitching functions together. Now we need to be able to deal with countably infinitely many thick points (think of putting an interval at every point 1/2^n, including 0) and ‘thick widths’ like a square. Let’s try the thick widths.
We can try ‘bouncing’ a line up and down through a region, progressively filling it. To do this, we might try to take the Upper curve U(x) = max_height f(x) and the Lower curve L(x) = min_height f(x). These always exist, by compactness of f(x). Note that these curves might not be continuous. For example, if the graph has a vertical straight-line rise, then U has a jump discontinuity, and at the discontinuity it returns the higher point.
Now, for every N, let’s subdivide the interval into N parts, and then linearly bounce up and down between the values of U and L. Note that we can actually cross U and L sometimes, while en route to a U or L value.
There’s however a problem when we try our limits. e.g. for the sup over b, what if our b value has a taller interval than the U and L values near it?
So instead, let’s bounce our line like our heartbeat from earlier: in an interval [a,b], we can try bouncing a line from L(a) to the tallest point in the whole interval, and then down to the lowest point, and then back to U(b). If instead the tallest comes after the lowest, we can bounce …
13:
Just look at 11 and 12.