This is the third of three sets of fixed point exercises. The first post in this sequence is here, giving context.
Note: Questions 1-5 form a coherent sequence and questions 6-10 form a separate coherent sequence. You can jump between the sequences.
Let (X,d) be a complete metric space. A function f:X→X is called a contraction if there exists a q<1 such that for all x,y∈X, d(f(x),f(y))≤q⋅d(x,y). Show that if f is a contraction, then for any x, the sequence {xn=fn(x0)} converges. Show further that it converges exponentially quickly (i.e. the distance between the nth term and the limit point is bounded above by c⋅an for some a<1)
(Banach contraction mapping theorem) Show that if (X,d) is a complete metric space and f is a contraction, then f has a unique fixed point.
If we only require that d(f(x),f(y))<d(x,y) for all x≠y, then we say f is a weak contraction. Find a complete metric space (X,d) and a weak contraction f:X→X with no fixed points.
A function f:Rn→R is convex if f(tx+(1−t)y)≤tf(x)+(1−t)f(y), for all t∈[0,1] and x,y∈Rn. A function f is strongly convex if you can subtract a positive parabaloid from it and it is still convex. (i.e.f is strongly convex if x↦f(x)−ε||x||2 is convex for some ε>0.) Let f be a strongly convex smooth function from Rn to R, and suppose that the magnitude of the second derivative ∥∇2f∥ is bounded. Show that there exists an ε>0 such that the function g:Rn→Rn given by x↦x−ε(∇f)(x) is a contraction. Conclude that gradient descent with a sufficiently small constant step size converges exponentially quickly on a strongly convex smooth function.
A finite stationary Markov chain is a finite set S of states, along with probabilistic rule A:S→ΔS for transitioning between the states, where ΔS represents the space of probability distributions on S. Note that the transition rule has no memory, and depends only on the previous state. If for any pair of states s,t∈ΔS, the probability of passing from s to t in one step is positive, then the Markov chain (S,A) is ergodic. Given an ergodic finite stationary Markov chain, use the Banach contraction mapping theorem to show that there is a unique distribution over states which is fixed under application of transition rule. Show that, starting from any state s, the limit distribution limn→∞An(s) exists and is equal to the stationary distribution.
A function f from a partially ordered set to another partially ordered set is called monotonic if x≤y implies that f(x)≤f(y). Given a partially ordered set (P,≤) with finitely many elements, and a monotonic function from P to itself, show that if f(x)≥x or f(x)≤x, then fn(x) is a fixed point of f for all n>|P|.
A complete lattice (L,≤) is a partially ordered set in which each subset of elements has a least upper bound and greatest lower bound. Under the same hypotheses as the previous exercise, extend the notion of fn(x) for natural numbers n to fα(x) for ordinals α, and show that fα(x) is a fixed point of f for all x∈X with f(x)≤x or f(x)≥x and all |α|>|L| (|A|≤|B| means there is an injection from A to B, and |A|>|B| means there is no such injection).
(Knaster-Tarski fixed point theorem) Show that the set of fixed points of a monotonic function on a complete lattice themselves form a complete lattice. (Note that since the empty set is always a subset, a complete lattice must be nonempty.)
Show that for any set A, (P(A),⊆) forms a complete lattice, and that any injective function from A to B defines a monotonic function from (P(A),⊆) to (P(B),⊆). Given injections f:A→B and g:B→A, construct a subset A′ of A and a subset of B′ of B such that B′=f(A′) and A−A′=g(B−B′).
(Cantor–Schröder–Bernstein theorem) Given sets A and B, show that if |A|≤|B| and |A|≥|B|, then |A|=|B|. (|A|≤|B| means there is an injection from A to B, and |A|=|B| means there is a bijection)
Please use the spoilers feature—the symbol ‘>’ followed by ‘!’ followed by space -in your comments to hide all solutions, partial solutions, and other discussions of the math. The comments will be moderated strictly to hide spoilers!
I recommend putting all the object level points in spoilers and including metadata outside of the spoilers, like so: “I think I’ve solved problem #5, here’s my solution <spoilers>” or “I’d like help with problem #3, here’s what I understand <spoilers>” so that people can choose what to read.
Tomorrow’s AI Alignment Forum Sequences post will be “Approval-directed agents: overview” by Paul Christiano in the sequence Iterated Amplification.
The next post in this sequence will be released on Saturday 24th November, and will be ‘Fixed Point Discussion’.
Iteration Fixed Point Exercises
This is the third of three sets of fixed point exercises. The first post in this sequence is here, giving context.
Note: Questions 1-5 form a coherent sequence and questions 6-10 form a separate coherent sequence. You can jump between the sequences.
Let (X,d) be a complete metric space. A function f:X→X is called a contraction if there exists a q<1 such that for all x,y∈X, d(f(x),f(y))≤q⋅d(x,y). Show that if f is a contraction, then for any x, the sequence {xn=fn(x0)} converges. Show further that it converges exponentially quickly (i.e. the distance between the nth term and the limit point is bounded above by c⋅an for some a<1)
(Banach contraction mapping theorem) Show that if (X,d) is a complete metric space and f is a contraction, then f has a unique fixed point.
If we only require that d(f(x),f(y))<d(x,y) for all x≠y, then we say f is a weak contraction. Find a complete metric space (X,d) and a weak contraction f:X→X with no fixed points.
A function f:Rn→R is convex if f(tx+(1−t)y)≤tf(x)+(1−t)f(y), for all t∈[0,1] and x,y∈Rn. A function f is strongly convex if you can subtract a positive parabaloid from it and it is still convex. (i.e.f is strongly convex if x↦f(x)−ε||x||2 is convex for some ε>0.) Let f be a strongly convex smooth function from Rn to R, and suppose that the magnitude of the second derivative ∥∇2f∥ is bounded. Show that there exists an ε>0 such that the function g:Rn→Rn given by x↦x−ε(∇f)(x) is a contraction. Conclude that gradient descent with a sufficiently small constant step size converges exponentially quickly on a strongly convex smooth function.
A finite stationary Markov chain is a finite set S of states, along with probabilistic rule A:S→ΔS for transitioning between the states, where ΔS represents the space of probability distributions on S. Note that the transition rule has no memory, and depends only on the previous state. If for any pair of states s,t∈ΔS, the probability of passing from s to t in one step is positive, then the Markov chain (S,A) is ergodic. Given an ergodic finite stationary Markov chain, use the Banach contraction mapping theorem to show that there is a unique distribution over states which is fixed under application of transition rule. Show that, starting from any state s, the limit distribution limn→∞An(s) exists and is equal to the stationary distribution.
A function f from a partially ordered set to another partially ordered set is called monotonic if x≤y implies that f(x)≤f(y). Given a partially ordered set (P,≤) with finitely many elements, and a monotonic function from P to itself, show that if f(x)≥x or f(x)≤x, then fn(x) is a fixed point of f for all n>|P|.
A complete lattice (L,≤) is a partially ordered set in which each subset of elements has a least upper bound and greatest lower bound. Under the same hypotheses as the previous exercise, extend the notion of fn(x) for natural numbers n to fα(x) for ordinals α, and show that fα(x) is a fixed point of f for all x∈X with f(x)≤x or f(x)≥x and all |α|>|L| (|A|≤|B| means there is an injection from A to B, and |A|>|B| means there is no such injection).
(Knaster-Tarski fixed point theorem) Show that the set of fixed points of a monotonic function on a complete lattice themselves form a complete lattice. (Note that since the empty set is always a subset, a complete lattice must be nonempty.)
Show that for any set A, (P(A),⊆) forms a complete lattice, and that any injective function from A to B defines a monotonic function from (P(A),⊆) to (P(B),⊆). Given injections f:A→B and g:B→A, construct a subset A′ of A and a subset of B′ of B such that B′=f(A′) and A−A′=g(B−B′).
(Cantor–Schröder–Bernstein theorem) Given sets A and B, show that if |A|≤|B| and |A|≥|B|, then |A|=|B|. (|A|≤|B| means there is an injection from A to B, and |A|=|B| means there is a bijection)
Please use the spoilers feature—the symbol ‘>’ followed by ‘!’ followed by space -in your comments to hide all solutions, partial solutions, and other discussions of the math. The comments will be moderated strictly to hide spoilers!
I recommend putting all the object level points in spoilers and including metadata outside of the spoilers, like so: “I think I’ve solved problem #5, here’s my solution <spoilers>” or “I’d like help with problem #3, here’s what I understand <spoilers>” so that people can choose what to read.
Tomorrow’s AI Alignment Forum Sequences post will be “Approval-directed agents: overview” by Paul Christiano in the sequence Iterated Amplification.
The next post in this sequence will be released on Saturday 24th November, and will be ‘Fixed Point Discussion’.