(This post was originally published on March 31st 2017, and has been brought forward as part of the AI Alignment Forum launch sequence on fixed points.)

In this post, I present a new formal open problem. A positive answer would be valuable for decision theory research. A negative answer would be helpful, mostly for figuring out what is the closest we can get to a positive answer. I also give some motivation for the problem, and some partial progress.

Open Problem: Does there exist a topological space X (in some convenient category of topological spaces) such that there exists a continuous surjection from X to the space [0,1]^X (of continuous functions from X to [0,1])?

Motivation:

Topological Naturalized Agents: Consider an agent who makes some observations and then takes an action. For simplicity, we assume there are only two possible actions, A and B. We also assume that the agent can randomize, so we can think of this agent as outputting a real number in [0,1], representing its probability of taking action A.

Thus, we can think of an agent as having a policy which is a function from the space Y of possible observations to [0,1]. We will require that our agent behaves continuously as a function of its observations, so we will think of the space of all possible policies as the space of continuous functions from Y to [0,1], denoted [0,1]^Y.

We will let X denote the space of all possible agents, and we will have a function f: X → [0,1]^Y which takes in an agent, and outputs that agent’s policy.

Now, consider what happens when there are other agents in the environment. For simplicity, we will assume that our agent observes one other agent, and makes no other observations. Thus, we want to consider the case where Y=X, so f: X → [0,1]^X.

We want f to be continuous, because we want a small change in an agent to correspond to a small change in the agent’s policy. This is particularly important since other agents will be implementing continuous functions on agents, and we would like any continuous function on policies to be considered a valid continuous function on agents.

We also want f to be surjective. This means that our space of agents is sufficiently rich that for any possible continuous policy, there is an agent in our space that implements that policy.

In order to meet all these criteria simultaneously, we need a space X of agents, and a continuous surjection f: X → [0,1]^X.

Unifying Fixed Point Theorems: While we are primarily interested in the above motivation, there is another secondary motivation, which may be more compelling for those less interested in agent foundations.

There are (at least) two main clusters of fixed point theorems that have come up many times in decision theory, and mathematics in general.

First, there is the Lawvere cluster of theorems. This includes the Lawvere fixed point theorem, the diagonal lemma, and the existence of quines and fixed point combinators. These are used to prove Gödel’s incompleteness theorem, Cantor’s theorem, and Löb’s theorem, and to achieve robust cooperation in the Prisoner’s Dilemma in the modal framework and bounded variants. All of these can be seen as corollaries of Lawvere’s fixed point theorem, which states that in a cartesian closed category, if there is a point-surjective map f: X → Y^X, then every morphism g: Y → Y has a fixed point.

Second, there is the Brouwer cluster of theorems. This includes Brouwer’s fixed point theorem, the Kakutani fixed point theorem, Poincaré–Miranda, and the intermediate value theorem. These are used to prove the existence of Nash Equilibria, Logical Inductors, and Reflective Oracles.

If we had a topological space X and a continuous surjection X → [0,1]^X, this would allow us to prove the one-dimensional Brouwer fixed point theorem directly using the Lawvere fixed point theorem, and thus unify these two important clusters.

Most Diagonalization Intuitions Do Not Apply: A common initial reaction to this question is to conjecture that such an X does not exist, due to cardinality or diagonalization intuitions. However, note that all of the diagonalization theorems pass through (some modification of) the same lemma: Lawvere’s fixed point theorem. And this lemma does not apply here!

For example, in the category of sets, the reason that there is no surjection from any set X to the power set, {T,F}^X, is that if there were such a surjection, Lawvere’s fixed point theorem would imply that every function from {T,F} to itself has a fixed point (which is clearly not the case, since there is a function that swaps T and F).
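The diagonal argument in the set case can be checked by brute force on a small finite set. The following Python sketch (the function names and the tuple encoding are illustrative assumptions, not part of the original post) enumerates every map f: X → {T,F}^X for a three-element X and confirms that the diagonal predicate x ↦ not f(x)(x) is never in the image of f.

```python
from itertools import product

def predicates(n):
    """All functions {0..n-1} -> {False, True}, each encoded as a tuple of values."""
    return list(product([False, True], repeat=n))

def diagonal(f):
    """Diagonal predicate d(x) = not f(x)(x), where f is a tuple of predicates.

    This composes the diagonal of f with the fixed-point-free swap of T and F.
    """
    return tuple(not f[x][x] for x in range(len(f)))

def diagonal_always_missed(n):
    """Check, for every f: X -> {F,T}^X with |X| = n, that diagonal(f) is not hit by f."""
    preds = predicates(n)
    return all(diagonal(f) not in f for f in product(preds, repeat=n))

print(diagonal_always_missed(3))  # True
```

The swap of T and F is what makes the diagonal differ from every f(x); for [0,1] no continuous analogue of that swap exists, which is the point made in the next paragraph.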

However, we already know by Brouwer’s fixed point theorem that every continuous function from the interval [0,1] to itself has a fixed point, so the standard diagonalization intuitions do not work here.

Impossible if You Replace [0,1] with e.g. S^1: This also provides a quick sanity check on attempts to construct an X. Any construction that would not be meaningfully different if the interval [0,1] were replaced with the circle S^1 is doomed from the start. This is because a continuous surjection X → (S^1)^X would violate Lawvere’s fixed point theorem, since there is a continuous map from S^1 to itself without fixed points (for example, rotation by a half turn).

Impossible if You Require a Homeomorphism: When I first asked this question, I asked for a homeomorphism between X and [0,1]^X. Sam Eisenstat has given a very clever argument for why this is impossible. You can read it here. In short, using a homeomorphism, you would be able to use Lawvere to construct a continuous map that sends a function from [0,1] to itself to a fixed point of that function. However, no such continuous map exists.

Notes:

If you prefer not to think about the topology of [0,1]^X, you can instead find a space X, and a continuous map h: X×X → [0,1], such that for every continuous function f: X → [0,1], there exists an x_f ∈ X such that for all x ∈ X, h(x_f, x) = f(x).
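In this form, the Lawvere argument becomes a three-line computation. As a sketch (the symbols g, f_g and x_g are introduced here for illustration), take any continuous g: [0,1] → [0,1] and suppose such an h exists:

```latex
\begin{align*}
f_g(x) &:= g\bigl(h(x,x)\bigr)
  && \text{continuous, as a composite of continuous maps} \\
h(x_g, x) &= f_g(x) \quad \text{for all } x \in X
  && \text{for some } x_g \in X \text{, by the defining property of } h \\
h(x_g, x_g) &= f_g(x_g) = g\bigl(h(x_g, x_g)\bigr)
  && \text{setting } x = x_g
\end{align*}
```

So h(x_g, x_g) is a fixed point of g. For [0,1] this is consistent with Brouwer’s theorem; for a target space with a fixed-point-free continuous self-map, the same computation shows that no such h can exist.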

Many of the details in the motivation could be different. I would like to see progress on similar questions. For example, you could add some computability condition to the space of functions. However, I am also very curious which way this specific question will go.

This post came out of many conversations, with many people, including: Sam, Qiaochu, Tsvi, Jessica, Patrick, Nate, Ryan, Marcello, Alex Mennen, Jack Gallagher, and James Cook.



I have just now submitted an attempted solution to this problem to “Geometry and Topology”. I claim that the space X you are looking for is 2^{ω_1} (ω_1 being the least uncountable cardinal) with the “generalised Cantor space topology”; that is, for each countable well-ordered bit-string b you have a basic open set consisting of all bit-strings of length ω_1 with b as an initial fragment. Since this topological space has quite a large cardinality I’m somewhat unclear whether this is helpful for your proposed application and would need to think about it more. (Matthew Barnett just now directed me to this post of yours.) I sent you an early draft of my paper, which argues the point in detail, on FB Messenger, and can send the latest version to you if you wish.

Let A′ be 2^{ω_1} with the generalised Cantor space topology, and A′′ be 2^{ω_1} with the product topology, and let X be a closed disc in a finite-dimensional Euclidean space. Then there is a continuous surjection A′ → X^{A′′}. I don’t know how to show that there is a topological space A with carrier set 2^{ω_1} and a continuous surjection A → X^A. Thanks to Alex Mennen for pointing out the problem.

However, because topology on A′ is finer than topology on A′′ here, this still shows how the proof of the Lawvere fixed point theorem can be applied here to give Brouwer fixed point theorem as corollary, which could conceivably be a publishable result (see what “Geometry and Topology” think about that), and this could still be sorta kinda maybe relevant to Scott’s original motivation for looking at the problem (if you’re okay with working with two different topologies on the space of agents, one finer than the other). But this is a very big space of agents you’re talking about here.

Correction: we need not only that the topology on A′ is finer than the topology on A′′. In addition, given an arbitrary open subset of X, take its pre-image under the evaluation map in X^{A′′}×A′′, project onto the first factor, and then take the pre-image of that under the continuous surjection A′ → X^{A′′}; it needs to be shown that this set is open in both topologies. I believe that this can indeed be done for an appropriate class of spaces X for the pair of topologies in question.

I fixed it. In our editor, use cmd-4/ctrl-4 to do LaTex, not dollar signs. (The thing you did would work in the markdown editor – you can go into settings to change to that editor if you’d like.)

To solve the problem, it would suffice to find a reflexive domain X with a retract onto [0,1].

This is because if you have a reflexive domain X, that is, an X with a continuous surjective map f :: X → X^X, and A is a retract of X, then there’s also a continuous surjective map g :: X → A^X.

Proof: If A is a retract of X then we have a retraction r :: X → A and a section s :: A → X with r∘s = 1_A. Construct g(x) := r∘f(x); g is continuous because it is f followed by post-composition with the continuous map r. To show that g is a surjection, consider an arbitrary q ∈ A^X. Then s∘q :: X → X. Since f is a surjection there must be some x with f(x) = s∘q. It follows that g(x) = r∘f(x) = r∘s∘q = q. Since q was arbitrary, g is also a surjection.

Let C(I) be the space of continuous functions I→I. Then any element of C(I) defines a unique function Q∩I→I (the converse is not true: most functions Q∩I→I do not correspond to continuous functions I→I). Pulling C(I) back to I via I^ω, we define the set Y⊂I.

Thus Y maps surjectively onto C(I). However, though C(I) maps into C(Y) by restriction (any function from I is a function from Y), this map is not onto (for example, there are more continuous functions from I−{1/2} than there are from I, because of the potential discontinuity at 1/2).

Now, there are elements of I−Y that map to functions in C(Y) but not in C(I). So there’s a hope that there may exist an X with Y⊂X⊂I, C(I)⊂C(X)⊂C(Y), and X mapping onto C(X). Basically, as X ‘gets bigger’, its image in C(Y) grows, while C(X) itself shrinks, and hopefully they’ll meet.

I will agree that I^ω is connected and locally connected. I’m not sure if it’s second countable. It is not compact.

Just to be clear, I^ω = {(x_1, x_2, ⋯) | ∀i: x_i ∈ [0,1]}. Now let H_i = {x ∈ I^ω | x_i < 0.6}. Clearly each H_i is open. Let G = {x ∈ I^ω | ∀i: x_i > 0.4} and F = {G, H_1, H_2, ⋯}. Now clearly this family covers all of I^ω. However, remove any H_i from F and x = (0.9, 0.9, ⋯, 0.9, 0.1, 0.9, 0.9, ⋯), with 0.1 in position i, is no longer covered. So F is a family of open sets which covers I^ω and doesn’t have any finite subcover.

For your proof, I think that G is not open in the product topology. The product topology is the coarsest topology where all the projection maps are continuous.

To make all the projection maps continuous we need all sets in S to be open, where we define σ ∈ S iff there exists an i such that p_i(σ) is open in [0,1] and σ = {x = (x_1, x_2, …) | x_i ∈ p_i(σ), 0 ≤ x_j ≤ 1 for j ≠ i}.

Let S′ be the set of finite intersections of these sets. For any σ′ ∈ S′, there exists a finite set N_{σ′} ⊂ N such that if x ∈ σ′ and y_i = x_i for i ∈ N_{σ′}, then y ∈ σ′ as well.

If we take S′′ to be the set of arbitrary unions of elements of S′, this condition will be preserved (with the finite set now depending on the point x). G does not satisfy it: setting the coordinates of a point of G outside any finite set to 0 takes it out of G. Thus G cannot be obtained from S by finite intersections and arbitrary unions, so it seems it is not an open set.
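The finite-support property can be illustrated with a small Python sketch. The encoding below is an assumption made for illustration: a point of I^ω is a function from a 1-based index to a float, and a basic open set is a dictionary restricting finitely many coordinates to open intervals (an upper bound above 1 stands for an interval that is open relative to [0,1]). Given any point and any basic open set containing it, the helper builds a point in the same basic open set that is not in G.

```python
def in_basic_open(point, constraints):
    """True if point(i) lies in the open interval (lo, hi) for each constrained index i.

    Coordinates that do not appear in `constraints` are unrestricted.
    """
    return all(lo < point(i) < hi for i, (lo, hi) in constraints.items())

def leave_G(point, constraints):
    """Return (y, k) where y agrees with `point` on the constrained coordinates
    and is 0.0 elsewhere, and k is an unconstrained index.

    Then y(k) = 0.0 <= 0.4, so y is not in G = {x : x_i > 0.4 for all i}.
    """
    k = 1
    while k in constraints:
        k += 1

    def y(i):
        return point(i) if i in constraints else 0.0

    return y, k

x = lambda i: 0.9                      # a point of G
box = {1: (0.5, 1.1), 4: (0.8, 1.1)}   # a basic open set containing x
y, k = leave_G(x, box)
print(in_basic_open(x, box), in_basic_open(y, box), y(k))  # True True 0.0
```

Since every basic open neighbourhood of a point of G contains such a y, no point of G has a basic open neighbourhood inside G.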

If I were studying this, I would be looking at domain theory, in which (among other things) there has been found a topological space X homeomorphic with X^X. The page I linked links to some notes at the bottom. (h/t Qiaochu for pointing out domain theory)

When thinking about agents, the first motivation might not quite work out. Small changes in observation might introduce discontinuous changes in policy, e.g. in the Matching Pennies game. Suppose there are agents (functions) in X that output a fixed P(Heads), no matter their input. If you can continuously vary P(Heads) by moving in X, then Matching Pennies play will be discontinuous at P(Heads)=0.5. So right away you’ve committed to some unusual behavior for the agents in X by asking for continuity: they can’t play perfect Matching Pennies, at the very least.
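To see the discontinuity concretely, here is a minimal Python sketch of the matching player’s best response. The function name and the tie-breaking value at exactly 0.5 are illustrative choices, not something specified in the comment above.

```python
def best_response_p_heads(opponent_p_heads):
    """Probability of Heads that maximizes the chance of matching an opponent
    who plays Heads with probability `opponent_p_heads`."""
    if opponent_p_heads > 0.5:
        return 1.0   # opponent leans Heads, so always play Heads
    if opponent_p_heads < 0.5:
        return 0.0   # opponent leans Tails, so always play Tails
    return 0.5       # indifferent; any value is optimal, 0.5 is an arbitrary pick

eps = 1e-9
print(best_response_p_heads(0.5 - eps), best_response_p_heads(0.5 + eps))  # 0.0 1.0
```

An arbitrarily small change in the observed P(Heads) around 0.5 moves the optimal policy from 0 to 1, so no continuous policy can agree with the best response on both sides of 0.5.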

Ok, here’s an idea for constructing such a map, with a few key details left unproven; let me know if people see any immediate flaws in the approach, before I spend time filling in the holes.

Let X be a countable collection of open intervals (e.g. X = {x ∈ R : x ∉ N}), given the usual topology. Let I=[0,1] be the closed unit interval, and C(X,I) the set of continuous functions from X to I. Give C(X,I) the compact-open topology.

By the properties of the compact-open topology, since I is T3.5 (Tychonoff), so is C(X,I). I’m hoping that the proof can be extended, at least in this case, to show that C(X,I) is T4 (normal Hausdorff).

It seems clear that C(X,I) is second-countable: let V(K,U) consist of all functions that map K into U, where K⊂X is the intersection of X with a closed interval with rational endpoints, and U⊂I is an open interval with rational endpoints. The set of all such V(K,U) is countable, and forms a subbasis of C(X,I). A countable subbasis means a countable basis, as the set of finite subsets of a countable set is itself countable.

If C(X,I) is T4 and second countable, then it is homeomorphic to a subset of the Hilbert Cube. To simplify notation, we will identify C(X,I) with its image in the Hilbert Cube.

Take the closure \overline{C(X,I)} of C(X,I) within the Hilbert Cube. This closure is compact and second countable (since the Hilbert Cube itself is both). It seems clear that C(X,I) is connected and locally connected; connectedness will extend to the closure, but we’ll need to prove that local connectedness does as well.

By the Hahn–Mazurkiewicz theorem, a non-empty Hausdorff topological space is a continuous image of the unit interval if and only if it is a compact, connected, locally connected second-countable space.

So there is a continuous surjection ϕ: I → \overline{C(X,I)}. Pull back C(X,I), defining ϕ^{-1}(C(X,I)) ⊂ I. If C(X,I) is open in \overline{C(X,I)} (with the subspace topology), then ϕ^{-1}(C(X,I)) is open in I. Even if ϕ^{-1}(C(X,I)) is not open, we can hope that, at worst, it consists of a countable collection of points, open intervals, half-closed intervals, and closed intervals (this is not a general property of subsets of the interval, cf. the Cantor set, but it feels very likely that it will apply here).

In that case, there is a continuous surjection s from X to ϕ^{-1}(C(X,I)), mapping each open interval to one of the points or intervals (“folding” over the ends when mapping to those with closed end-points).

Then ϕ∘s:X→C(X,I) is the continuous surjection we are looking for.

Note: I’m thinking now that C(X,I) might not be connected, but this would not be a problem as long as it has a countable number of connected components.

I am going to take some license with your question because I think you are asking the wrong question. Arbitrary topological spaces and abstract continuity are rarely the right notions in real-world situations. Rather, uniform continuity on bounded sets usually better corresponds to the intuitive notion of “a small change in input produces a small change in output”.

Thus, suppose that X is a complete separable metric space and that f:X×X→[0,1] is uniformly continuous on bounded sets. Then we can show that there exists a function g:X→[0,1] which is uniformly continuous on bounded sets but not a fiber of f (i.e. there is no x such that f(x,y)=g(y) for all y). Indeed, consider two cases:

X is uniformly discrete. Then every map from X to [0,1] is uniformly continuous, so we get a contradiction from cardinality considerations.

X is not uniformly discrete. Then for each n, since f is uniformly continuous on B(0,n) it has a modulus of continuity on this set, i.e. a continuous increasing function h_n: (0,∞) → (0,∞) such that d(f(x),f(y)) < h_n(d(x,y)) for all x,y ∈ B(0,n) ⊆ X×X. Since X is not uniformly discrete, there is a function g: X → [0,1] such that for infinitely many n, there exist x,y such that d(g(x),g(y)) > h_n(d(x,y)). (To construct it, take pairs (x_n,y_n) with h_n(d(x_n,y_n)) < 2^{-n}, extract a subsequence that behaves geometrically nicely, and then find a function g such that d(g(x_n),g(y_n)) > 2^{-n} for all n in the subsequence.) Clearly, g cannot be a fiber of f.

g can be a fiber of f, since for each n, x_n and y_n could be distance greater than n from the basepoint.

Example: let X := {x_n, y_n | n ∈ N}, with d(x_n,x_m) = d(y_n,y_m) = d(x_n,y_m) = 2|n−m| for n≠m and d(x_n,y_n) = 6^{-n}. Let x_0 be the basepoint (so that (x_0,x_0) is the point you were calling “0”). Let g(x_n) := 0, g(y_n) := 1, f(z,w) := g(z), and h_n(r) := 3^n r.
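The arithmetic of this example can be checked mechanically on a finite truncation. The Python sketch below rests on assumptions: it reads the distances as 2|n−m| and 6^{-n} and the modulus as h_n(r) = 3^n r, it only looks at indices below a chosen cutoff, and the tuple encoding of points is invented for illustration. It verifies the triangle inequality, that h_n(d(x_n,y_n)) = 2^{-n} while g jumps by 1 across the pair, and that x_n lies at distance 2n > n from the basepoint.

```python
from fractions import Fraction
from itertools import product

def d(p, q):
    """Distance between points encoded as ('x', n) or ('y', n)."""
    if p == q:
        return Fraction(0)
    n, m = p[1], q[1]
    if n == m:                       # the close pair x_n, y_n
        return Fraction(1, 6 ** n)
    return Fraction(2 * abs(n - m))  # everything else is far apart

def triangle_inequality_holds(cutoff):
    """Check d(p, r) <= d(p, q) + d(q, r) for all points with index < cutoff."""
    pts = [(s, n) for s in "xy" for n in range(cutoff)]
    return all(d(p, r) <= d(p, q) + d(q, r) for p, q, r in product(pts, repeat=3))

def g(p):
    """g(x_n) = 0 and g(y_n) = 1."""
    return 0 if p[0] == "x" else 1

def h(n, r):
    """Modulus h_n(r) = 3^n * r."""
    return 3 ** n * r

print(triangle_inequality_holds(6))
for n in range(1, 6):
    xn, yn = ("x", n), ("y", n)
    # modulus at the pair, jump of g across the pair, distance to the basepoint x_0
    print(n, h(n, d(xn, yn)), abs(g(xn) - g(yn)), d(xn, ("x", 0)))
```

For each n ≥ 1 the jump of g across (x_n, y_n) exceeds h_n(d(x_n,y_n)), but the pair sits outside the ball of radius n around the basepoint, which is why the modulus bound on B(0,n) does not rule it out.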

I also don’t see how to even construct the function g, or, relatedly, what you mean by “geometrically nicely”, but I guess it doesn’t matter.

Also, I’m not convinced that metric spaces with uniform continuity on bounded subsets is a better framework than topological spaces with continuity.

This is intended as a reply to David Simmons’s comment, which for some reason I can’t reply to directly.

In the new version of your proof, how do we know Y_k isn’t too close to X_l for some l>k? And how do we know that g is uniformly continuous on bounded subsets?

About continuity versus uniform continuity on bounded sets:

It seems to me that your point 1 is just a pithier version of your point 4, and that these points support paying attention to uniform continuity, rather than uniform continuity restricted to bounded sets. This version of the problem seems like it would be a less messy version of the “fixed modulus of continuity” version of the problem you mentioned (which I did not understand your solution to, but will look again later).

I’m not sure what you’re getting at about singularities in point 3. I wouldn’t have asked why you were considering uniform continuity on bounded sets instead of uniform continuity away from singularities (in fact, I don’t know what that means). I would ask, though, why uniform continuity on bounded sets instead of uniform continuity on compact sets? As you point out, the latter is the same as continuity.

Your point 2 is completely wrong, and in fact this is the primary reason I was convinced that continuity is a better thing to pay attention to than uniform continuity on bounded sets. The type of object you are describing is an effective Polish space that remembers its metric. Typically, descriptive set theorists forget the metric, and the isomorphisms between Polish spaces are homeomorphisms (and the isomorphisms between effective Polish spaces are computable homeomorphisms). Changing the metric in a computably homeomorphic way does not change what can be done when points are represented as descending chains of basic open sets with singleton intersection. So the thing you described was really topological rather than metric in nature, even though it involves introducing a metric in the setup. I am not aware of any notions of computability in metric spaces in which the metric matters in the way you are suggesting.

It is not true that uniform continuity gives you an algorithm for computing the function. As a counterexample, let a be an uncomputable number, and let f: R → R be given by f(x)=a for every x. f is clearly uniformly continuous, but not computable. It is also not true that uniform continuity is necessary in order for the function to be computable. For instance, sin(1/x) is computable on (0,1]. Of course, (0,1] is not complete, but for another example, consider an effective infinite-dimensional Hilbert space, and let {e_n | n ∈ N} be an effective orthonormal basis. Let f_n(x) := max(0, 1/2 − ∥x − e_n∥). This sequence of functions is computable, they have disjoint support, and for any point, a sufficiently small neighborhood around it will be disjoint from the supports of all but at most one of these functions. Thus f(x) := ∑_n n f_n(x) is computable, but of course it is not uniformly continuous on the unit ball, which is bounded.

However, it is true that every computable function is continuous, and conversely, every continuous function is computable with respect to some oracle. Of course, we really want computability, not computability with respect to some oracle, but this still seems to show that continuity is at least a good metaphor for computability, whereas uniform continuity on bounded sets doesn’t seem so to me.
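The failure of uniform continuity for sin(1/x) on (0,1] can be seen numerically. This is a minimal Python sketch (the helper name is made up for illustration): it picks pairs of points in (0,1] whose distance shrinks to zero while the function values stay 2 apart.

```python
import math

def witness_pair(k):
    """Two points of (0, 1] that get closer as k grows but where sin(1/x) is 1 and -1."""
    a = 1.0 / (2 * math.pi * k + math.pi / 2)       # 1/a = 2*pi*k + pi/2, so sin(1/a) is about 1
    b = 1.0 / (2 * math.pi * k + 3 * math.pi / 2)   # 1/b = 2*pi*k + 3*pi/2, so sin(1/b) is about -1
    return a, b

for k in (1, 10, 100, 1000):
    a, b = witness_pair(k)
    gap_in_input = a - b
    gap_in_output = abs(math.sin(1 / a) - math.sin(1 / b))
    print(k, gap_in_input, gap_in_output)
```

The input gap shrinks roughly like 1/(4πk²) while the output gap stays near 2, so no single modulus of continuity works on all of (0,1], even though each value of sin(1/x) can be approximated to any precision.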

Of course, all this about continuity as a metaphor for computability makes the most sense in the context of Polish spaces, and we can only talk about actual computability in the context of effective Polish spaces. Scott’s problem involves a space X and the exponential [0,1]^X. If X is a locally compact Polish space, then [0,1]^X is also Polish. I think that this might be necessary (that is, if X and [0,1]^X are Polish, then X is locally compact), although I’m not sure. If so, and if your proof is correct, it seems plausible that your proof could be adapted to show that there is no locally compact Polish space with the property that Scott was looking for, and that would show that there is no solution to the problem in which X and [0,1]^X are both Polish spaces, and hence no computable solution, if computability is formalized as in effective descriptive set theory.

I don’t know why my comment doesn’t have a reply button. Maybe it is related to the fact that my comment shows up as “deleted” when I am not logged in.

Sorry, I seem to be getting a little lazy with these proofs. Hopefully I haven’t missed anything this time.

New proof: … We can extract a subsequence (n_k) such that if X_k = x_{n_k} and Y_k = y_{n_k}, then d(X_{k+1},Y_{k+1}) ≤ (1/6)d(X_k,Y_k) for all k, and for all k and l>k, either (A) d(X_k,Y_l) ≥ (1/3)d(X_k,Y_k) and d(X_k,X_l) ≥ (1/3)d(X_k,Y_k) or (B) d(Y_k,X_l) ≥ (1/3)d(X_k,Y_k) and d(Y_k,Y_l) ≥ (1/3)d(X_k,Y_k). By extracting a further subsequence we can assume that which of (A) or (B) holds depends only on k and not on l. By swapping X_k and Y_k if necessary we can assume that case (A) always holds.

Lemma: For each z there is at most one k such that d(z,X_k) ≤ (1/6)d(X_k,Y_k).

Proof: Suppose d(z,X_k) ≤ (1/6)d(X_k,Y_k) and d(z,X_l) ≤ (1/6)d(X_l,Y_l), with k<l. Then, by the triangle inequality and d(X_l,Y_l) ≤ (1/6)d(X_k,Y_k), we get d(X_k,X_l) ≤ (1/6)d(X_k,Y_k) + (1/6)d(X_l,Y_l) < (1/3)d(X_k,Y_k), contradicting (A).

It follows that by extracting a further subsequence we can assume that d(Y_k,X_l) ≥ (1/6)d(X_l,Y_l) for all l>k.

Now let j: [0,∞) → [0,∞) be an increasing uniformly continuous function such that j(0)=0 and j((1/6)d(X_k,Y_k)) > 2^{-n_k} for all k. Finally, let g(x) = inf_k j(d(x,Y_k)). Then for all k we have g(Y_k)=0. On the other hand, for all k<l we have d(X_k,Y_l) ≥ (1/3)d(X_k,Y_k), for k=l we have d(X_k,Y_l) = d(X_k,Y_k), and for k>l we have d(X_k,Y_l) ≥ (1/6)d(X_k,Y_k). Thus g(X_k) = inf_l j(d(X_k,Y_l)) ≥ j((1/6)d(X_k,Y_k)) > 2^{-n_k}. Clearly, g cannot be a fiber of f. Moreover, since the functions j and x ↦ inf_k d(x,Y_k) are both uniformly continuous, so is g.

Regarding your responses to my points:

I guess I don’t disagree with what you write regarding my points 1 and 4.

It seems to be harder than expected to explain my intuitions regarding singularities in point 3. Basically, I think the reasons that abstract continuity came to be considered standard are mostly the fact that in concrete applications you have to worry about singularities, and this makes uniform continuity a little more technically annoying. But in the kind of problem we are considering here, it seems that continuity is really more annoying to deal with than uniform continuity, with little added benefit. I guess it also depends on what kinds of functions you expect to actually come up, which is a heuristic judgement. Anyway it might not be productive to continue this line of reasoning further as maybe our disagreements just come down to intuitions.

Regarding my point 2, I wasn’t very clear when I said that uniform continuity gives you an algorithm; what I meant was that if you have an algorithm for computing the images of points in the dense sequence and for computing the modulus of continuity function, then uniform continuity gives you an algorithm. The function x ↦ sin(1/x) would be the kind of thing I would handle with uniform continuity away from singularities (to fix a definition for this, let us say that you are uniformly continuous away from singularities if you are uniformly continuous on sets of the form B(0,n)∖B(S,1/n), where S is some set of singularities).

In your definition of f_n, I think you mean to write max instead of min. But I see your point, though the example seems a little pathological to me.

Anyway, it seems that you agree that it makes sense to restrict to Polish spaces based on computability considerations, which is part of what I was trying to say in 2.

If you have a locally compact Polish space, then you can find a metric with respect to which the space is proper (i.e. bounded subsets are compact): let d′(x,y)=d(x,y)+|f(x)−f(y)|, where f(x)=1/sup{r:B(x,r) is compact}. With respect to this metric, continuity is the same as uniform continuity on bounded sets, so my proof should work then.

Proposition: Let X be a Polish space that is not locally compact. Then [0,1]^X (with the compact-open topology) is not first countable.

Proof: Suppose otherwise. Then the function f_0 ≡ 0 has a countable neighborhood basis of sets of the form F(K_n,U_n) = {f : f(K_n) ⊆ U_n} where K_n ⊆ X is compact and U_n ⊆ [0,1] is open. Since X is not locally compact, there exists a point x such that B(x,r) is not compact for any r. For each n, we can choose x_n ∈ B(x,1/n) ∖ ⋃_{i≤n} K_i. Let K = {x_n : n ∈ N} ∪ {x}, and note that K is compact. Then F(K,[0,1/2)) is a neighborhood of f_0. But then F(K,[0,1/2)) ⊇ ⋂_{i≤n} F(K_i,U_i) for some n. This contradicts the fact that x_n ∈ K ∖ ⋃_{i≤n} K_i, since we can find a bump function which is 0 on ⋃_{i≤n} K_i but 1 at x_n.

It does still seem to me that most of the useful intuition comes from point 4 of my previous comment, though.

It appears that comments from new users are collapsed by default, and cannot be replied to without a “Like”. These seem like bad features.

Your proof that there’s no uniformly continuous on bounded sets function f:X×X→[0,1] admitting all uniformly continuous on bounded sets functions X→[0,1] as fibers looks correct now. It also looks like it can be easily adapted to show that there is no uniformly continuous f:X×X→[0,1] admitting all uniformly continuous functions X→[0,1] as fibers. Come to think of it, your proof works for arbitrary metric spaces X, not just complete separable metric spaces, though those are nicer.

I see what you mean now about uniform continuity giving you an algorithm, but I still don’t think that’s specific to uniform continuity in an important way. After all, if you have an algorithm for computing images of points in the countable dense set, and a computable “local modulus of continuity” in the sense of a computable function h:X×[0,∞)→[0,∞) with h(x,0)=0 and d(x,y)<r⟹|f(y)−f(x)|<h(x,r), then f is computable, and this does not require f to be uniformly continuous. Although I suppose you could object that this is a bit circular, in that I’m assuming the “local modulus of continuity” is computable only in the standard sense, which does not require uniform continuity.

I’m not sure why you would allow singularities at some points (presumably a uniformly discrete set, or something like that) while still insisting on uniform continuity elsewhere. It still seems to me that the arguments for uniform continuity rather than continuity all point to wanting uniform continuity entirely, rather than some sense of local uniform continuity in most places.

Thanks for pointing out the error in my definition of f_n; I’ve fixed it.

In your argument that locally compact Polish spaces can be given metrics with respect to which they are proper, it isn’t true that d′ is necessarily a proper metric. For instance, consider a countably infinite set with d(x,y)=1 for x≠y. This is a locally compact Polish space, but f(x)=1 for every x, so d′=d, and the space is not proper.

Your last proposition looks correct (though with a typo: last ⊆ in the proof should be ⊇). However, if X is not locally compact, then the compact-open topology isn’t necessarily the right topology to consider on [0,1]^X. We want a topology making [0,1]^X into an exponential object, and it isn’t clear that such a topology even exists, or that it is the compact-open topology if it does exist (though it must be a refinement of the compact-open topology if it does exist). Maybe asking about non-locally compact Polish spaces X with a Polish exponential space [0,1]^X is a kind of weird question, though, and if we’re even considering non-locally compact Polish spaces, we should turn to the version of the question where we just want a continuous function X×X → [0,1] admitting all continuous functions X → [0,1] as fibers.

I will have to think more about the issue of continuity vs uniform continuity. I suppose my last remaining argument would be the fact that Bishop and Bridges’ classic book on constructive analysis uses uniform continuity on bounded sets rather than continuity, which suggests that it is probably better for constructive analysis at least. But maybe they did not analyze the issue carefully enough, or maybe the relevant issues here are for some reason different.

To fix the argument that every locally compact Polish space admits a proper metric, let f be as before and let F(x,y) = ∞ if d(x,y) ≥ f(x) and F(x,y) = f(x)/[f(x) − d(x,y)] if d(x,y) < f(x). Next, let g(y) = min_n [n + F(x_n,y)], where (x_n) is a countable dense sequence. Then g is continuous and everywhere finite. Moreover, if S = g^{-1}([0,N]), then S ⊆ ⋃_{n≤N} B(x_n,(1−1/N)f(x_n)) and thus S is compact. It follows that the metric d′(x,y) = d(x,y) + |g(y) − g(x)| is proper.

Hm, perhaps I should figure out what the significance of uniform continuity on bounded sets is in constructive analysis before dismissing it, even though I don’t see the appeal myself, since constructive analysis is not a field I know much about, but could potentially be relevant here.

f is the reciprocal of what it was before, but yes, this looks good. I am happy with this proof.

Ah, you’re right. The proof can be fixed by changing the division between the two cases. So here is the new proof, with more details added regarding the construction of g:

B(0,m) is uniformly discrete for all m. Then every map from X to [0,1] is uniformly continuous on bounded sets, so we get a contradiction from cardinality considerations.

B(0,m) is not uniformly discrete for some m. Then for each n≥m, since f is uniformly continuous on B(0,n) it has a modulus of continuity on this set, i.e. a continuous increasing function h_n: (0,∞) → (0,∞) such that d(f(x),f(y)) < h_n(d(x,y)) for all x,y ∈ B(0,n) ⊆ X×X. Since B(0,m) is not uniformly discrete, there exist x_n,y_n ∈ B(0,m) such that h_n(d(x_n,y_n)) < 2^{-n} and d(x_n,y_n) < 2^{-n}. We can extract a subsequence (n_k) such that if X_k = x_{n_k} and Y_k = y_{n_k}, then d(X_{k+1},Y_{k+1}) ≤ (1/3)d(X_k,Y_k) for all k, and for all k and ℓ>k, either (A) d(X_k,Y_ℓ) ≥ (1/3)d(X_k,Y_k) or (B) d(Y_k,X_ℓ) ≥ (1/3)d(X_k,Y_k). By extracting a further subsequence we can assume that which of (A) or (B) holds depends only on k and not on ℓ. By swapping X_k and Y_k if necessary we can assume that case (A) always holds. Now let j: [0,∞) → [0,∞) be an increasing continuous function such that j((1/3)d(X_k,Y_k)) > 2^{-n_k} for all k. Finally, let g(y) = inf_k j(d(X_k,y)). Then for all k we have g(X_k)=0 but g(Y_k) > 2^{-n_k}. Clearly, g cannot be a fiber of f.

Regarding the appropriateness of metric spaces / uniform continuity rather than topological spaces / abstract continuity, here are some of the reasons behind my intuition here (developed working in mathematical analysis, specifically Diophantine approximation, and also constructive mathematics):

The obvious: metric spaces are explicitly meant to represent the intuitive notion of alikeness as a quantitative concept (i.e. distance), whereas topological spaces have no explicit notion of alikeness.

In computability theory, one is interested in the question of how to computationally represent a point or an approximation to a point in a space. The standard way to do this is via restricting to the class of complete separable metric spaces, fixing a countable dense sequence $(x_n)$ (assumed to be representative of the structure of the metric space), and defining a computational approximation to a point to be an expression of the form $B(x_n, 1/m)$. Since $n$ and $m$ are integers this expression can be coded as finite data. One then defines a computational encoding of a point to be an infinite bitstream consisting of computational approximations that converge to the point.

In practical applications, in the end you will want everything to be computable. So it makes sense to work in a framework where there are natural notions of computability. I am not aware of any such notions for general topological spaces.
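As a rough illustration of the encoding just described, here is a minimal sketch for the real line. It assumes the dense sequence is the rationals (represented directly as `Fraction` values rather than by an integer index) and encodes the point √2; the names `approximations` and `sqrt2_approx` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
from itertools import count, islice

def approximations(point_fn):
    """Yield pairs (center, m), each standing for the ball B(center, 1/m).

    point_fn(m) must return a rational within 1/(2m) of the target point,
    so every ball contains the point and the balls shrink to it.
    """
    for m in count(1):
        yield point_fn(m), m

def sqrt2_approx(m):
    """Rational within 1/(2m) of sqrt(2), found by bisection on [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo > Fraction(1, 2 * m):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo  # lo <= sqrt(2) <= hi and hi - lo <= 1/(2m)

# The first few approximations of the (infinite) encoding of sqrt(2).
first_balls = list(islice(approximations(sqrt2_approx), 5))
```

Each pair is finite data, and the full encoding is the infinite stream of such pairs. Nothing here depends on the metric beyond the fact that the balls shrink to the point.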

Regarding continuity vs uniform continuity in metric spaces, both are saying that if two points are close in the domain, their images are also close. But the latter gives you a straightforward estimate as to how close, whereas the former says that the degree of closeness may depend on one of the points. Now, there are good reasons to consider such dependence, since even natural functions on the real numbers (such as $x^2$ or $1/x$) have “singularities” where they are not uniformly continuous.
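To make the two examples concrete, here is a short worked computation (a sketch; the choice of test points is an illustration and not taken from the discussion) showing that no single $\delta$ works on the whole domain:

```latex
\begin{align*}
\text{For } x^2 \text{ on } [0,\infty):\quad
  & |(x+\delta)^2 - x^2| = 2x\delta + \delta^2 \ge 2x\delta \ge 1
    \quad\text{whenever } x \ge \tfrac{1}{2\delta}, \\
\text{For } 1/x \text{ on } (0,\infty):\quad
  & \left|\tfrac{1}{\delta} - \tfrac{1}{2\delta}\right| = \tfrac{1}{2\delta} \ge 1
    \quad\text{whenever } \delta \le \tfrac{1}{2},
    \text{ although } |2\delta - \delta| = \delta .
\end{align*}
```

In both cases points at distance $\delta$ have images at distance at least 1, however small $\delta$ is. The offending points escape to infinity for $x^2$ and towards $0$ for $1/x$.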

So the question is whether to modify the notion of uniform continuity to directly account for singularities, or to use the standard definition of continuity instead. But if one works with the standard definition, then most of the time one is really looking for ways to sneak back to uniform continuity, e.g. by using the fact that a continuous function on a compact set is uniformly continuous.

An intuitive way of thinking about the fact that a continuous function on a compact set is uniformly continuous is that the notion of compactness means that there are no singularities present “within the space”. For example, if we go back to the functions $x^2$ or $1/x$, then the singularity of the first occurs at infinity, while the singularity of the latter occurs at 0. If we take a compact subset of the domain of either function, then what it really means is that we are avoiding the singularity.

By contrast, non-compactness should mean that there are singularities. In some cases like $(0,1)$ it is easy to identify what the singularities are. But if we are dealing with spaces that are not locally compact, like $\mathbb{N}^{\mathbb{N}}$ or an infinite-dimensional Hilbert space, then it is not as clear what the singularities are; there is just a general sense that they are dispersed “throughout the space” (because the space is not locally compact).

But you have to ask yourself, are these singularities real or just imagined? In many cases, imagined. For example, in the theory of Banach spaces continuous linear maps are always uniformly continuous.

What about a map that is not uniformly continuous, like the inversion map $f(x) = x/\|x\|^2$ in infinite-dimensional Hilbert space? In this case, there is still a singularity, at $0$, and the definition of continuity needs to reflect that. But it doesn’t help to imagine all sorts of other singularities dispersed throughout the space, because that prevents you from making useful statements like: if $x,y$ are at least $\alpha$ away from $0$ and $d(x,y) \le \varepsilon$, then $d(f(x),f(y)) \le C\varepsilon/\alpha^2$, where $C$ is an absolute constant.
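For the inversion map this estimate can be checked directly. The following is a sketch of the standard computation, assuming a real Hilbert space with inner product $\langle\cdot,\cdot\rangle$; it suggests that $C = 1$ suffices.

```latex
\begin{align*}
\left\| \frac{x}{\|x\|^2} - \frac{y}{\|y\|^2} \right\|^2
  &= \frac{1}{\|x\|^2} - \frac{2\langle x, y\rangle}{\|x\|^2\|y\|^2} + \frac{1}{\|y\|^2} \\
  &= \frac{\|y\|^2 - 2\langle x, y\rangle + \|x\|^2}{\|x\|^2\|y\|^2}
   = \frac{\|x-y\|^2}{\|x\|^2\|y\|^2},
\end{align*}
so that, for $\|x\|,\|y\| \ge \alpha$ and $\|x-y\| \le \varepsilon$,
\[
  d(f(x), f(y)) = \frac{\|x-y\|}{\|x\|\,\|y\|} \le \frac{\varepsilon}{\alpha^2}.
\]
```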

Now the example in the previous paragraph is an example of quantitative continuity, which is stronger than uniform continuity away from singularities. But the point is that it can be seen as an extension of uniform continuity away from singularities.

Maybe my last reason will be the most relevant from a naturalized agent perspective. The notion of uniform continuity is important because it introduces the modulus of continuity, which can be viewed as a measure of how continuous a function is. The restriction that an agent must be uniformly continuous can then be thought of in a quantitative sense, with “better” agents being less bound by this restriction. So a more powerful agent may have a looser (larger) modulus of continuity, because it can react more precisely to different possible inputs.

In this terminology, my proof can be thought of as giving an intuitive reason for why the agent cannot implement every possible policy: the agent has limited resources to distinguish different inputs, so it can only implement those policies that can be implemented with these limited resources.

The obvious follow-up question is this: if you restrict your attention to the policies that the agent isn’t prevented from implementing due to its limited resources, can it implement every such policy? Or in other words, if you fix a modulus of continuity from the outset, can you include all functions with that modulus of continuity as fibers?

If you allow the every-policy function to have an arbitrary modulus of continuity unrelated to the modulus of continuity you are trying to imitate, then it is not hard to see that this is possible at least for some spaces. (By Arzelà-Ascoli the space of functions with a fixed modulus of continuity is compact, so there exists a continuous surjection from $2^{\mathbb{N}}$ to this space.) But this may require greatly increasing the resources that the agent must spend to differentiate inputs. On the other hand, requiring the exact same modulus of continuity seems like too rigid an assumption. So the right question is probably to ask how close the modulus of continuity of the every-policy function can be to the modulus it is trying to imitate.

For this kind of question it is probably better to work with a concrete example rather than trying to prove something in generality, so I will work with the Cantor space $X = 2^{\mathbb{N}}$ with the metric $d((x_n),(y_n)) = 2^{-\min\{n : x_n \ne y_n\}}$. Suppose we want to imitate all functions $g: X \to \{0,1\}$ such that $d(x,y) < \varepsilon$ implies $g(x) = g(y)$. (I know this is not quite the same as the original question, but I think it is close enough.) If $\varepsilon = 2^{-n}$ then there are $N = 2^{2^n}$ such functions. So if we have a single function $f: X\times X \to \{0,1\}$ that has all of them as fibers, then by the pigeonhole principle there is some ball of the form $B(x, 2^{-N+1})$ that contains two such fibers. But then if $x_1$ and $x_2$ are the two fibers, then there exists $y$ such that $f(x_1,y) \ne f(x_2,y)$. It follows that if we want to choose $\varepsilon'$ such that $d(x,y) < \varepsilon'$ implies $f(x) = f(y)$ (i.e. the analogue of the assumption on $g$ but with $\varepsilon$ replaced by $\varepsilon'$) then we need $\varepsilon' \le 2^{-N+2}$.

In conclusion, the required accuracy ε′ of f is doubly exponential with respect to the required accuracy ε of g. Thus, it is not feasible to implement such a function.
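The count of functions used in this argument can be checked by brute force for small $n$. The sketch below assumes sequences are indexed from 1, so that $d(x,y) < 2^{-n}$ means $x$ and $y$ agree on their first $n$ bits and a function $g$ of the required kind is determined by its values on the $2^n$ possible prefixes. The helper names are hypothetical. It also reports the fewest bits that could label $N$ distinct fibers, which is only a floor on how much of its first argument any such $f$ must read, not the bound stated above.

```python
from itertools import product

def locally_constant_functions(n):
    """Yield every g: {0,1}^n -> {0,1} as a dict keyed by the first n bits."""
    prefixes = list(product((0, 1), repeat=n))
    for values in product((0, 1), repeat=len(prefixes)):
        yield dict(zip(prefixes, values))

def count_functions(n):
    """Number of functions on Cantor space that depend only on the first n bits."""
    return sum(1 for _ in locally_constant_functions(n))

def index_bits(num_fibers):
    """Fewest bits that can give num_fibers distinct labels."""
    return (num_fibers - 1).bit_length()

for n in range(1, 4):
    N = count_functions(n)
    print(n, N, index_bits(N))
```

The count grows as $2^{2^n}$, so enumeration is only feasible for very small $n$, which is itself a small-scale version of the infeasibility being described.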

“Self-Reference and Fixed Points: A Discussion and an Extension of Lawvere’s Theorem” by Jorge Soto-Andrade and Francisco J. Varela seems like a potentially relevant result. In particular, they prove a converse Lawvere result in the category of posets (though they mention doing this for [0,1] in an unsolved problem.) I’m currently reading through this and related papers with an eye to adapting their construction to [0,1] (I think you can’t just use it straight-forwardly because even though you can build a reflexive domain with a retract to an arbitrary poset, the paper uses a different notion of continuity for posets.)

Can you argue that $X$ must have a semi-metric compatible with the topology by using $d(x,y) = \sup_{z\in X} |h(x,z) - h(y,z)|$?
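As a sanity check on why this is only a semi-metric, here is a small sketch on a finite grid. The grid, the example map `h`, and the function names are all assumptions made for illustration; the supremum becomes a maximum because the set of points is finite.

```python
from fractions import Fraction
from itertools import product

def sup_pseudometric(h, points):
    """Return d(x, y) = max over z in points of |h(x, z) - h(y, z)|."""
    def d(x, y):
        return max(abs(h(x, z) - h(y, z)) for z in points)
    return d

# Finite stand-in for X: the grid 0, 1/10, ..., 1.
points = [Fraction(i, 10) for i in range(11)]

def h(x, z):
    # Hypothetical map into [0, 1]; it cannot tell apart any two x above 1/2.
    return min(x, Fraction(1, 2)) * z

d = sup_pseudometric(h, points)
print(d(Fraction(1, 5), Fraction(2, 5)))  # distinct fibers, positive distance
print(d(Fraction(3, 5), Fraction(4, 5)))  # same fiber, distance zero
```

Two points with the same fiber $h(x,\cdot)$ are at distance zero, so $d$ only becomes a metric after identifying such points.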

I’m wondering if you can generalise this to some sort of argument that goes like this. Using $X$, project down via $\pi$ from $X_0 = X$ to $X_1 = X_0/d$. Let $\phi$ be our initial surjection; it’s now a bijection between $X_1$ and maps from $X_0$ to $[0,1]$.

If the projection is continuous, then every map from $X_1$ to $[0,1]$ lifts to a map from $X_0$ to $[0,1]$. Restricting to the subset of maps that are lifts like this, and applying $\phi^{-1}$, gives a subset $X_2 \subset X_1$. We now have a new equivalence relationship, maps from $X_1$ that are equal to each other on $X_2$. Project down from $X_2$ by this relationship, to generate $X_3$. Continue this transfinitely often (?) to generate a space $X'$ where $\phi$ is a homeomorphism, and find a contradiction?

## Formal Open Problem in Decision Theory

(This post was originally published on March 31st 2017, and has been brought forward as part of the AI Alignment Forum launch sequence on fixed points.)

In this post, I present a new formal open problem. A positive answer would be valuable for decision theory research. A negative answer would be helpful, mostly for figuring out what is the closest we can get to a positive answer. I also give some motivation for the problem, and some partial progress.

Open Problem: Does there exist a topological space $X$ (in some convenient category of topological spaces) such that there exists a continuous surjection from $X$ to the space $[0,1]^X$ (of continuous functions from $X$ to $[0,1]$)?

Motivation:

Topological Naturalized Agents: Consider an agent who makes some observations and then takes an action. For simplicity, we assume there are only two possible actions, A and B. We also assume that the agent can randomize, so we can think of this agent as outputting a real number in $[0,1]$, representing its probability of taking action A.

Thus, we can think of an agent as having a policy which is a function from the space $Y$ of possible observations to $[0,1]$. We will require that our agent behaves continuously as a function of its observations, so we will think of the space of all possible policies as the space of continuous functions from $Y$ to $[0,1]$, denoted $[0,1]^Y$.

We will let $X$ denote the space of all possible agents, and we will have a function $f: X \to [0,1]^Y$ which takes in an agent, and outputs that agent’s policy.

Now, consider what happens when there are other agents in the environment. For simplicity, we will assume that our agent observes one other agent, and makes no other observations. Thus, we want to consider the case where $Y = X$, so $f: X \to [0,1]^X$.

We want $f$ to be continuous, because we want a small change in an agent to correspond to a small change in the agent’s policy. This is particularly important since other agents will be implementing continuous functions on agents, and we would like any continuous function on policies to be able to be considered a valid continuous function on agents.

We also want f to be surjective. This means that our space of agents is sufficiently rich that for any possible continuous policy, there is an agent in our space that implements that policy.

In order to meet all these criteria simultaneously, we need a space $X$ of agents, and a continuous surjection $f: X \to [0,1]^X$.

Unifying Fixed Point Theorems: While we are primarily interested in the above motivation, there is another secondary motivation, which may be more compelling for those less interested in agent foundations.

There are (at least) two main clusters of fixed point theorems that have come up many times in decision theory, and mathematics in general.

First, there is the Lawvere cluster of theorems. This includes the Lawvere fixed point theorem, the diagonal lemma, and the existence of Quines and fixed point combinators. These are used to prove Gödel’s Incompleteness Theorem, Cantor’s Theorem, and Löb’s Theorem, and to achieve robust cooperation in the Prisoner’s Dilemma in the modal framework and its bounded variants. All of these can be seen as corollaries of Lawvere’s fixed point theorem, which states that in a cartesian closed category, if there is a point-surjective map $f: X \to Y^X$, then every morphism $g: Y \to Y$ has a fixed point.

Second, there is the Brouwer cluster of theorems. This includes Brouwer’s fixed point theorem, the Kakutani fixed point theorem, Poincaré–Miranda, and the intermediate value theorem. These are used to prove the existence of Nash Equilibria, Logical Inductors, and Reflective Oracles.

If we had a topological space $X$ and a continuous surjection $X \to [0,1]^X$, this would allow us to prove the one-dimensional Brouwer fixed point theorem directly using the Lawvere fixed point theorem, and thus unify these two important clusters.
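The argument this refers to is the usual diagonal construction behind Lawvere’s theorem. A sketch, assuming the evaluation map $(\varphi, x) \mapsto \varphi(x)$ is continuous (which is what working in a convenient, cartesian closed category is meant to provide):

```latex
Let $f : X \to [0,1]^X$ be a continuous surjection and $g : [0,1] \to [0,1]$ continuous.
\begin{align*}
  q(x) &:= g\bigl(f(x)(x)\bigr)
       && \text{(continuous, so } q \in [0,1]^X\text{)} \\
  f(x_0) &= q
       && \text{(for some } x_0 \in X\text{, by surjectivity)} \\
  p &:= f(x_0)(x_0) = q(x_0) = g\bigl(f(x_0)(x_0)\bigr) = g(p)
       && \text{(so } p \text{ is a fixed point of } g\text{)}
\end{align*}
```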

Thanks to Qiaochu Yuan for pointing out the connection to Lawvere’s fixed point theorem (and actually asking this question three years ago).

Partial Progress:

Most Diagonalization Intuitions Do Not Apply: A common initial reaction to this question is to conjecture that such an $X$ does not exist, due to cardinality or diagonalization intuitions. However, note that all of the diagonalization theorems pass through (some modification of) the same lemma: Lawvere’s fixed point theorem. However, this lemma does not apply here!

For example, in the category of sets, the reason that there is no surjection from any set $X$ to the power set, $\{T,F\}^X$, is that if there were such a surjection, Lawvere’s fixed point theorem would imply that every function from $\{T,F\}$ to itself has a fixed point (which is clearly not the case, since there is a function that swaps $T$ and $F$).

However, we already know by Brouwer’s fixed point theorem that every continuous function from the interval [0,1] to itself has a fixed point, so the standard diagonalization intuitions do not work here.
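The set-level diagonal argument can be checked exhaustively on a small finite set. In the sketch below (function names are hypothetical), a map $f: X \to \{T,F\}^X$ is encoded as a dict of dicts, and the diagonal function composes $x \mapsto f(x)(x)$ with the swap of $T$ and $F$, which is exactly the fixed-point-free map that has no continuous counterpart on $[0,1]$.

```python
from itertools import product

def all_predicates(X):
    """Every function X -> {False, True}, encoded as a dict."""
    return [dict(zip(X, bits)) for bits in product((False, True), repeat=len(X))]

def diagonal_escape(f, X):
    """Given f: X -> {T,F}^X, return g with g(x) != f(x)(x) for every x."""
    return {x: not f[x][x] for x in X}

def every_map_misses_something(X):
    """Check, over all maps f: X -> {T,F}^X, that the diagonal g is outside the image."""
    preds = all_predicates(X)
    for images in product(preds, repeat=len(X)):
        f = dict(zip(X, images))
        if diagonal_escape(f, X) in f.values():
            return False
    return True

print(every_map_misses_something([0, 1, 2]))
```

With three elements there are $8^3 = 512$ candidate maps, and the diagonal function escapes the image of every one of them.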

Impossible if You Replace [0,1] with e.g. $S^1$: This also provides a quick sanity check on attempts to construct an $X$. Any construction that would not be meaningfully different if the interval $[0,1]$ is replaced with the circle $S^1$ is doomed from the start. This is because a continuous surjection $X \to (S^1)^X$ would violate Lawvere’s fixed point theorem, since there is a continuous map from $S^1$ to itself without fixed points.

Impossible if You Require a Homeomorphism: When I first asked this question I asked for a homeomorphism between $X$ and $[0,1]^X$. Sam Eisenstat has given a very clever argument why this is impossible. You can read it here. In short, using a homeomorphism, you would be able to use Lawvere to construct a continuous map that sends a function from $[0,1]$ to itself to a fixed point of that function. However, no such continuous map exists.

Notes:

If you prefer not to think about the topology of $[0,1]^X$, you can instead find a space $X$, and a continuous map $h: X\times X \to [0,1]$, such that for every continuous function $f: X \to [0,1]$, there exists an $x_f \in X$ such that for all $x \in X$, $h(x_f,x) = f(x)$.

Many of the details in the motivation could be different. I would like to see progress on similar questions. For example, you could add some computability condition to the space of functions. However, I am also very curious which way this specific question will go.

This post came out of many conversations, with many people, including: Sam, Qiaochu, Tsvi, Jessica, Patrick, Nate, Ryan, Marcello, Alex Mennen, Jack Gallagher, and James Cook.

This post was originally published on March 31st 2017, and has been brought forward as part of the AI Alignment Forum launch sequences.

Tomorrow’s AIAF sequences post will be ‘Iterated Amplification and Distillation’ by Ajeya Cotra, in the sequence on iterated amplification.

I have just now submitted an attempted solution to this problem to “Geometry and Topology”. I claim that the space $X$ you are looking for is $2^{\omega_1}$ ($\omega_1$ being the least uncountable cardinal) with the “generalised Cantor space topology”; that is, for each countable well-ordered bit-string $b$ you have a basic open set consisting of all bit-strings of length $\omega_1$ with $b$ as an initial fragment. Since this topological space has quite a large cardinality I’m somewhat unclear whether this is helpful for your proposed application and would need to think about it more. (Matthew Barnett just now directed me to this post of yours.) I sent you an early draft of my paper, which argues the point in detail, on FB Messenger, and can send the latest version to you if you wish.

Let $A'$ be $2^{\omega_1}$ with the generalised Cantor space topology, $A''$ be $2^{\omega_1}$ with the product topology, and $X$ a closed disc in a finite-dimensional Euclidean space. Then there is a continuous surjection $A' \to X^{A''}$. I don’t know how to show that there is a topological space $A$ with carrier set $2^{\omega_1}$ and a continuous surjection $A \to X^A$. Thanks to Alex Mennen for pointing out the problem.

However, because the topology on $A'$ is finer than the topology on $A''$ here, this still shows how the proof of the Lawvere fixed point theorem can be applied to give the Brouwer fixed point theorem as a corollary, which could conceivably be a publishable result (we will see what “Geometry and Topology” think about that). This could still be sorta kinda maybe relevant to Scott’s original motivation for looking at the problem (if you’re okay with working with two different topologies on the space of agents, one finer than the other). But this is a very big space of agents you’re talking about here.

Correction: it is not enough that the topology on $A'$ is finer than the topology on $A''$. Given an arbitrary open subset of $X$, take its pre-image under the evaluation map in $X^{A''}\times A''$, project onto the first factor, and then take the pre-image of that under the continuous surjection $A' \to X^{A''}$; it needs to be shown that this set is open in both topologies. I believe that this can indeed be done for an appropriate class of spaces $X$ for the pair of topologies in question.

When I look at my post, the LaTeX code isn’t formatting properly; could anyone let me know how to fix that?

I fixed it. In our editor, use cmd-4/ctrl-4 to do LaTeX, not dollar signs. (The thing you did would work in the markdown editor; you can go into settings to change to that editor if you’d like.)

From discussions I had with Sam, Scott, and Jack:

To solve the problem, it would suffice to find a reflexive domain X with a retract onto [0,1].

This is because if you have a reflexive domain $X$, that is, an $X$ with a continuous surjective map $f: X \to X^X$, and $A$ is a retract of $X$, then there’s also a continuous surjective map $g: X \to A^X$.

Proof: If $A$ is a retract of $X$ then we have a retraction $r: X \to A$ and a section $s: A \to X$ with $r \circ s = 1_A$. Construct $g(x) := r \circ f(x)$. To show that $g$ is a surjection consider an arbitrary $q \in A^X$. Then $s \circ q: X \to X$. Since $f$ is a surjection there must be some $x$ with $f(x) = s \circ q$. It follows that $g(x) = r \circ f(x) = r \circ s \circ q = q$. Since $q$ was arbitrary, $g$ is also a surjection.

A small note: it’s not hard to construct spaces that are a bit too big, or a bit too small (raising the possibility that a true X lies between them).

For instance, if $I$ is the unit interval, then we can map $I$ onto the countable hypercube $I^\omega$ ( https://en.wikipedia.org/wiki/Space-filling_curve#The_Hahn.E2.80.93Mazurkiewicz_theorem ). Then if we pick an ordering of the dimensions of the hypercube and an ordering of $\mathbb{Q}\cap I$, we can see any element of $I^\omega$ (and hence any element of $I$) as a function from $\mathbb{Q}\cap I$ to $I$.

Let $C(I)$ be the space of continuous functions $I \to I$. Then any element of $C(I)$ defines a unique function $\mathbb{Q}\cap I \to I$ (the converse is not true: most functions $\mathbb{Q}\cap I \to I$ do not correspond to continuous functions $I \to I$). Pulling $C(I)$ back to $I$ via $I^\omega$ we define the set $Y \subset I$.

Thus $Y$ maps surjectively onto $C(I)$. However, though $C(I)$ maps into $C(Y)$ by restriction (any function from $I$ is a function from $Y$), this map is not onto (for example, there are more continuous functions from $I \setminus \{1/2\}$ than there are from $I$, because of the potential discontinuity at $1/2$).

Now, there are elements of $I \setminus Y$ that map to functions in $C(Y)$ but not in $C(I)$. So there’s a hope that there may exist an $X$ with $Y \subset X \subset I$, $C(I) \subset C(X) \subset C(Y)$, and $X$ mapping onto $C(X)$. Basically, as $X$ “gets bigger”, its image in $C(Y)$ grows, while $C(X)$ itself shrinks, and hopefully they’ll meet.

I think you made a mistake here

The Hahn–Mazurkiewicz theorem states that

I will agree that $I^\omega$ is connected and locally connected. I’m not sure if it’s second countable. It is not compact.

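Just to be clear, $I^\omega = \{(x_1, x_2, \dots) \mid \forall i: x_i \in [0,1]\}$. Now let $H_i = \{x \in I^\omega \mid x_i < 0.6\}$. Clearly each $H_i$ is open. Let $G = \{x \in I^\omega \mid \forall i: x_i > 0.4\}$ and $F = \{G, H_1, H_2, \dots\}$. Now clearly this family covers all of $I^\omega$. However, remove any $H_i$ from $F$ and the point $x = (0.9, 0.9, \dots, 0.9, 0.1, 0.9, 0.9, \dots)$, with $0.1$ in the $i$-th coordinate, is no longer covered. So $F$ is a family of open sets which covers $I^\omega$ and doesn’t have any finite subcover.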

Hum, $I^\omega$ should be compact by Tychonoff’s theorem (see also the Hilbert Cube, which is homeomorphic to $I^\omega$).

For your proof, I think that G is not open in the product topology. The product topology is the coarsest topology where all the projection maps are continuous.

To make all the projection maps continuous we need all sets in $S$ to be open, where we define $\sigma \in S$ iff there exists an $i$ such that $p_i(\sigma)$ is open in $[0,1]$ and $\sigma = \{x = (x_1, x_2, \dots) \mid x_i \in p_i(\sigma),\ 0 \le x_j \le 1 \text{ for } j \ne i\}$.

Let $S'$ be the set of finite intersections of these sets. For any $\sigma' \in S'$, there exists a finite set $N_{\sigma'} \subset \mathbb{N}$ such that if $x \in \sigma'$ and $y_i = x_i$ for $i \in N_{\sigma'}$, then $y \in \sigma'$ as well.

If we take $S''$ to be the set of arbitrary unions of elements of $S'$, this condition will be preserved. Thus $G$ is not contained in the arbitrary unions and finite intersections of $S$, so it seems it is not an open set.

Also, Iω is second-countable. From the wikipedia article on second-countable:

I’ve figured out the difference: I was using the box topology ( https://en.wikipedia.org/wiki/Box_topology ), while you were using the product topology ( https://en.wikipedia.org/wiki/Product_topology ).

You are correct. I knew about finite topological products and made a natural generalization, but it turns out not to be the standard meaning of $I^\omega$.

Thanks for introducing me to the box topology—seeing it defined so explicitly, and seeing what properties it fails, cleared up a few of my intuitions.

If I were studying this, I would be looking at domain theory, in which (among other things) there has been found a topological space $X$ homeomorphic with $X^X$. The page I linked links to some notes at the bottom. (h/t Qiaochu for pointing out domain theory)

When thinking about agents, the first motivation might not quite work out. Small changes in observation might introduce discontinuous changes in policy—e.g. in the Matching Pennies game. Suppose there are agents (functions) in X that output a fixed P(Heads), no matter their input. If you can continuously vary P(heads) by moving in X, then Matching Pennies play will be discontinuous at P(Heads)=0.5. So right away you’ve committed to some unusual behavior for the agents in X by asking for continuity—they can’t play perfect Matching Pennies at the very least.
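The discontinuity can be seen in a few lines. This is a minimal sketch, assuming the standard payoffs (+1 to the matcher on a match, -1 otherwise) and an arbitrary tie-break at exactly 0.5; the function names are hypothetical.

```python
def matcher_payoff(q, p):
    """Expected payoff to the matcher playing Heads w.p. q against Heads w.p. p."""
    # q*p + (1-q)*(1-p) - q*(1-p) - (1-q)*p simplifies to the product below.
    return (2 * q - 1) * (2 * p - 1)

def matcher_best_response(p):
    """Payoff-maximizing probability of Heads against an opponent with P(Heads) = p."""
    if p > 0.5:
        return 1.0
    if p < 0.5:
        return 0.0
    return 0.5  # every q is optimal here; this tie-break is an arbitrary choice

for p in (0.49, 0.5, 0.51):
    print(p, matcher_best_response(p))
```

An arbitrarily small change in the opponent’s P(Heads) across 0.5 flips the optimal action from always Tails to always Heads, so no continuous policy can play the best response everywhere.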

EDIT: This idea is wrong: https://www.lesswrong.com/posts/eqi83c2nNSX7TFSfW/no-surjection-onto-function-space-for-manifold-x

Ok, here’s an idea for constructing such a map, with a few key details left unproven; let me know if people see any immediate flaws in the approach, before I spend time filling in the holes.

Let $X$ be a countable collection of open intervals (e.g. $X = \{x \in \mathbb{R} : x \notin \mathbb{N}\}$), given the usual topology. Let $I = [0,1]$ be the closed unit interval, and $C(X,I)$ the set of continuous functions from $X$ to $I$. Give $C(X,I)$ the compact-open topology.

By the properties of the compact-open topology, since $I$ is $T_{3\frac{1}{2}}$ (Tychonoff), so is $C(X,I)$. I’m hoping that the proof can be extended, at least in this case, to show that $C(X,I)$ is $T_4$ (normal Hausdorff).

It seems clear that $C(X,I)$ is second-countable: let $V(K,U)$ consist of all functions that map $K$ into $U$, where $K \subset X$ is the intersection of $X$ with a closed interval with rational endpoints, and $U \subset I$ is an open interval with rational endpoints. The set of all such $V(K,U)$ is countable, and forms a subbasis of $C(X,I)$. A countable subbasis means a countable basis, as the set of finite subsets of a countable set is itself countable.

If $C(X,I)$ is $T_4$ and second countable, then it is homeomorphic to a subset of the Hilbert Cube. To simplify notation, we will identify $C(X,I)$ with its image in the Hilbert Cube.

Take the closure $\overline{C(X,I)}$ of $C(X,I)$ within the Hilbert Cube. This closure is compact and second countable (since the Hilbert Cube itself is both). It seems clear that $C(X,I)$ is connected and locally connected; connectedness will extend to the closure, but we’ll need to prove that local connectedness does as well.

Then we can apply the Hahn-Mazurkiewicz theorem:

A non-empty Hausdorff topological space is a continuous image of the unit interval if and only if it is a compact, connected, locally connected second-countable space.

So there is a continuous surjection $\phi: I \to \overline{C(X,I)}$. Pull back $C(X,I)$, defining $\phi^{-1}(C(X,I)) \subset I$. If $C(X,I)$ is open in $\overline{C(X,I)}$ (with the subspace topology), then $\phi^{-1}(C(X,I))$ is open in $I$. Even if $\phi^{-1}(C(X,I))$ is not open, we can hope that, at worst, it consists of a countable collection of points, open intervals, half-closed intervals, and closed intervals (this is not a general property of subsets of the interval, cf. the Cantor set, but it feels very likely that it will apply here).

In that case, there is a continuous surjection $s$ from $X$ to $\phi^{-1}(C(X,I))$, mapping each open interval to one of the points or intervals (“folding” over the ends when mapping to those with closed end-points).

Then ϕ∘s:X→C(X,I) is the continuous surjection we are looking for.

Note: I’m thinking now that C(X,I) might not be connected, but this would not be a problem as long as it has a countable number of connected components.

I am going to take some license with your question because I think you are asking the wrong question. Arbitrary topological spaces and abstract continuity are rarely the right notions in real-world situations. Rather, uniform continuity on bounded sets usually better corresponds to the intuitive notion of “a small change in input produces a small change in output”.

Thus, suppose that X is a complete separable metric space and that f:X×X→[0,1] is uniformly continuous on bounded sets. Then we can show that there exists a function g:X→[0,1] which is uniformly continuous on bounded sets but not a fiber of f (i.e. there is no x such that f(x,y)=g(y) for all y). Indeed, consider two cases:

X is uniformly discrete. Then every map from X to [0,1] is uniformly continuous, so we get a contradiction from cardinality considerations.

$X$ is not uniformly discrete. Then for each $n$, since $f$ is uniformly continuous on $B(0,n)$ it has a modulus of continuity on this set, i.e. a continuous increasing function $h_n:(0,\infty)\to(0,\infty)$ such that $d(f(x),f(y)) < h_n(d(x,y))$ for all $x,y \in B(0,n) \subseteq X\times X$. Since $X$ is not uniformly discrete, there is a function $g: X \to [0,1]$ such that for infinitely many $n$, there exist $x,y$ such that $d(g(x),g(y)) > h_n(d(x,y))$. (To construct it, take pairs $(x_n,y_n)$ with $h_n(d(x_n,y_n)) < 2^{-n}$, extract a subsequence that behaves geometrically nicely, and then find a function $g$ such that $d(g(x_n),g(y_n)) > 2^{-n}$ for all $n$ in the subsequence.) Clearly, $g$ cannot be a fiber of $f$.

$g$ can be a fiber of $f$, since for each $n$, $x_n$ and $y_n$ could be distance greater than $n$ from the basepoint.

Example: let $X := \{x_n, y_n \mid n \in \mathbb{N}\}$, with $d(x_n,x_m) = d(y_n,y_m) = d(x_n,y_m) = 2|n-m|$ for $n \ne m$ and $d(x_n,y_n) = 6^{-n}$. Let $x_0$ be the basepoint (so that $(x_0,x_0)$ is the point you were calling “0”). Let $g(x_n) := 0$, $g(y_n) := 1$, $f(z,w) := g(z)$, and $h_n(r) := 3^n r$.
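The counterexample can be checked numerically on a finite truncation. The sketch below assumes the exponents in the example read as $d(x_n,y_n) = 6^{-n}$ and $h_n(r) = 3^n r$, with $d = 2|n-m|$ between points of different index; these are readings of the flattened formulas, not something the text confirms. Points are encoded as pairs `(kind, n)`, a convention chosen only for this example.

```python
from fractions import Fraction
from itertools import product

def d(p, q):
    """Distance on X = {x_n, y_n}; a point is a pair (kind, n) with kind 'x' or 'y'."""
    if p == q:
        return Fraction(0)
    if p[1] == q[1]:                        # x_n versus y_n
        return Fraction(1, 6 ** p[1])
    return Fraction(2 * abs(p[1] - q[1]))   # points with different indices

def h(n, r):
    """Candidate modulus of continuity h_n(r) = 3^n * r."""
    return Fraction(3) ** n * r

def is_metric(points):
    """Check symmetry and the triangle inequality on a finite set of points."""
    return all(
        d(a, b) == d(b, a) and d(a, c) <= d(a, b) + d(b, c)
        for a, b, c in product(points, repeat=3)
    )

points = [(kind, n) for n in range(6) for kind in "xy"]
print(is_metric(points))
for n in range(1, 6):
    # Distance of x_n from the basepoint x_0, and h_n at the pair (x_n, y_n).
    print(n, d(("x", 0), ("x", n)), h(n, d(("x", n), ("y", n))))
```

Under these readings, $x_n$ and $y_n$ sit at distance $2n > n$ from the basepoint, so the pair on which $g$ jumps lies outside $B(0,n)$, while inside that ball (indices $k < n/2$) the value $h_n(6^{-k})$ stays above 1 and $h_n$ remains a valid modulus for $f$.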

I also don’t see how to even construct the function g, or, relatedly, what you mean by “geometrically nicely”, but I guess it doesn’t matter.

Also, I’m not convinced that metric spaces with uniform continuity on bounded subsets is a better framework than topological spaces with continuity.

This is intended as a reply to David Simmons’s comment, which for some reason I can’t reply to directly.

In the new version of your proof, how do we know $Y_k$ isn’t too close to $X_l$ for some $l > k$? And how do we know that $g$ is uniformly continuous on bounded subsets?

About continuity versus uniform continuity on bounded sets:

It seems to me that your point 1 is just a pithier version of your point 4, and that these points support paying attention to uniform continuity, rather than uniform continuity restricted to bounded sets. This version of the problem seems like it would be a less messy version of the “fixed modulus of continuity” version of the problem you mentioned (which I did not understand your solution to, but will look again later).

I’m not sure what you’re getting at about singularities in point 3. I wouldn’t have asked why you were considering uniform continuity on bounded sets instead of uniform continuity away from singularities (in fact, I don’t know what that means). I would ask, though, why uniform continuity on bounded sets instead of uniform continuity on compact sets? As you point out, the latter is the same as continuity.

Your point 2 is completely wrong, and in fact this is the primary reason I was convinced that continuity is a better thing to pay attention to than uniform continuity on bounded sets. The type of object you are describing is an effective Polish space that remembers its metric. Typically, descriptive set theorists forget the metric, and the isomorphisms between Polish spaces are homeomorphisms (and the isomorphisms between effective Polish spaces are computable homeomorphisms). Changing the metric in a computably homeomorphic way does not change what can be done when points are represented as descending chains of basic open sets with singleton intersection. So the thing you described was really topological rather than metric in nature, even though it involves introducing a metric in the setup. I am not aware of any notions of computability in metric spaces in which the metric matters in the way you are suggesting.

It is not true that uniform continuity gives you an algorithm for computing the function. As a counterexample, let $a$ be an uncomputable number, and let $f: \mathbb{R} \to \mathbb{R}$ be given by $f(x) = a$ for every $x$. $f$ is clearly uniformly continuous, but not computable.

It is also not true that uniform continuity is necessary in order for the function to be computable. For instance, $\sin(1/x)$ is computable on $(0,1]$. Of course, $(0,1]$ is not complete, but for another example, consider an effective infinite-dimensional Hilbert space, and let $\{e_n \mid n \in \mathbb{N}\}$ be an effective orthonormal basis. Let $f_n(x) := \max(0, \tfrac{1}{2} - \|x - e_n\|)$. This sequence of functions is computable, they have disjoint support, and for any point, a sufficiently small neighborhood around it will be disjoint from the supports of all but at most one of these functions. Thus $f(x) := \sum_n n f_n(x)$ is computable, but of course it is not uniformly continuous on the unit ball, which is bounded.

However, it is true that every computable function is continuous, and conversely, every continuous function is computable with respect to some oracle. Of course, we really want computability, not computability with respect to some oracle, but this still seems to show that continuity is at least a good metaphor for computability, whereas uniform continuity on bounded sets doesn’t seem so to me.
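The failure of uniform continuity in the Hilbert space example can be made explicit. A sketch, assuming the bump functions read as $f_n(x) = \max(0, \tfrac{1}{2} - \|x - e_n\|)$ and $f = \sum_n n f_n$; the test points $(1-\delta)e_n$ are a choice made for this illustration:

```latex
For $0 < \delta \le \tfrac{1}{2}$ and $m \ne n$ we have
$\|(1-\delta)e_n - e_m\| = \sqrt{(1-\delta)^2 + 1} \ge 1 > \tfrac{1}{2}$,
so only $f_n$ is nonzero at $(1-\delta)e_n$, and
\begin{align*}
  f(e_n) &= n \cdot \tfrac{1}{2}, \\
  f\bigl((1-\delta)e_n\bigr) &= n\left(\tfrac{1}{2} - \delta\right), \\
  f(e_n) - f\bigl((1-\delta)e_n\bigr) &= n\delta .
\end{align*}
Taking $\delta = 1/n$ for $n \ge 2$ gives two points of the unit ball at distance $1/n$
whose images differ by $1$.
```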

Of course, all this about continuity as a metaphor for computability makes the most sense in the context of Polish spaces, and we can only talk about actual computability in the context of effective Polish spaces. Scott’s problem involves a space $X$ and the exponential $[0,1]^X$. If $X$ is a locally compact Polish space, then $[0,1]^X$ is also Polish. I think that this might be necessary (that is, if $X$ and $[0,1]^X$ are Polish, then $X$ is locally compact), although I’m not sure. If so, and if your proof is correct, it seems plausible that your proof could be adapted to show that there is no locally compact Polish space with the property that Scott was looking for, and that would show that there is no solution to the problem in which $X$ and $[0,1]^X$ are both Polish spaces, and hence no computable solution, if computability is formalized as in effective descriptive set theory.

I don’t know why my comment doesn’t have a reply button. Maybe it is related to the fact that my comment shows up as “deleted” when I am not logged in.

Sorry, I seem to be getting a little lazy with these proofs. Hopefully I haven’t missed anything this time.

New proof: … We can extract a subsequence $(n_k)$ such that if $X_k = x_{n_k}$ and $Y_k = y_{n_k}$, then $d(X_{k+1},Y_{k+1}) \le (1/6)\,d(X_k,Y_k)$ for all $k$, and for all $k$ and $l > k$, either (A) $d(X_k,Y_l) \ge (1/3)\,d(X_k,Y_k)$ and $d(X_k,X_l) \ge (1/3)\,d(X_k,Y_k)$ or (B) $d(Y_k,X_l) \ge (1/3)\,d(X_k,Y_k)$ and $d(Y_k,Y_l) \ge (1/3)\,d(X_k,Y_k)$. By extracting a further subsequence we can assume that which of (A) or (B) holds depends only on $k$ and not on $l$. By swapping $X_k$ and $Y_k$ if necessary we can assume that case (A) always holds.

Lemma: For each $z$ there is at most one $k$ such that $d(z,X_k) \le (1/6)\,d(X_k,Y_k)$.

Proof: Suppose $d(z,X_k) \le (1/6)\,d(X_k,Y_k)$ and $d(z,X_l) \le (1/6)\,d(X_l,Y_l)$, with $k < l$. Then $d(X_k,X_l) < (1/3)\,d(X_k,Y_k)$, a contradiction.

It follows that by extracting a further subsequence we can assume that $d(Y_k,X_l) \ge (1/6)\,d(X_l,Y_l)$ for all $l > k$.

Now let $j:[0,\infty)\to[0,\infty)$ be an increasing uniformly continuous function such that $j(0) = 0$ and $j((1/6)\,d(X_k,Y_k)) > 2^{-n_k}$ for all $k$. Finally, let $g(x) = \inf_k j(d(x,Y_k))$. Then for all $k$ we have $g(Y_k) = 0$. On the other hand, for all $k < l$ we have $d(X_k,Y_l) \ge (1/3)\,d(X_k,Y_k)$, for $k = l$ we have $d(X_k,Y_l) = d(X_k,Y_k)$, and for $k > l$ we have $d(X_k,Y_l) \ge (1/6)\,d(X_k,Y_k)$. Thus $g(X_k) = \inf_l j(d(X_k,Y_l)) \ge j((1/6)\,d(X_k,Y_k)) > 2^{-n_k}$. Clearly, $g$ cannot be a fiber of $f$. Moreover, since the functions $j$ and $x \mapsto \inf_k d(x,Y_k)$ are both uniformly continuous, so is $g$.

Regarding your responses to my points:

I guess I don’t disagree with what you write regarding my points 1 and 4.

It seems to be harder than expected to explain my intuitions regarding singularities in point 3. Basically, I think the reasons that abstract continuity came to be considered standard are mostly the fact that in concrete applications you have to worry about singularities, and this makes uniform continuity a little more technically annoying. But in the kind of problem we are considering here, it seems that continuity is really more annoying to deal with than uniform continuity, with little added benefit. I guess it also depends on what kinds of functions you expect to actually come up, which is a heuristic judgement. Anyway it might not be productive to continue this line of reasoning further as maybe our disagreements just come down to intuitions.

Regarding my point 2, I wasn’t very clear when I said that uniform continuity gives you an algorithm. What I meant was that if you have an algorithm for computing the images of points in the dense sequence and for computing the modulus of continuity function, then uniform continuity gives you an algorithm. The function x↦sin(1/x) would be the kind of thing I would handle with uniform continuity away from singularities (to fix a definition for this, let us say that you are uniformly continuous away from singularities if you are uniformly continuous on sets of the form B(0,n)∖B(S,1/n), where S is some set of singularities).

In your definition of fn, I think you mean to write max instead of min. But I see your point, though the example seems a little pathological to me.

Anyway, it seems that you agree that it makes sense to restrict to Polish spaces based on computability considerations, which is part of what I was trying to say in 2.

If you have a locally compact Polish space, then you can find a metric with respect to which the space is proper (i.e. bounded subsets are compact): let d′(x,y)=d(x,y)+|f(x)−f(y)|, where f(x)=1/sup{r:B(x,r) is compact}. With respect to this metric, continuity is the same as uniform continuity on bounded sets, so my proof should work then.

Proposition: Let X be a Polish space that is not locally compact. Then [0,1]^X (with the compact-open topology) is not first countable.

Proof: Suppose otherwise. Then the function f_0 ≡ 0 has a countable neighborhood basis of sets of the form F(K_n,U_n) = {f : f(K_n) ⊆ U_n}, where K_n ⊆ X is compact and U_n ⊆ [0,1] is open. Since X is not locally compact, there exists a point x such that B(x,r) is not compact for any r>0. For each n, we can choose x_n ∈ B(x,1/n) ∖ ⋃_{i≤n} K_i. Let K = {x_n : n∈N} ∪ {x}, and note that K is compact. Then F(K,[0,1/2)) is a neighborhood of f_0. But then F(K,[0,1/2)) ⊇ ⋂_{i≤n} F(K_i,U_i) for some n. This contradicts the fact that x_n ∈ K ∖ ⋃_{i≤n} K_i, since we can find a bump function which is 0 on ⋃_{i≤n} K_i but 1 at x_n.

It does still seem to me that most of the useful intuition comes from point 4 of my previous comment, though.

It appears that comments from new users are collapsed by default, and cannot be replied to without a “Like”. These seem like bad features.

Your proof that there’s no uniformly continuous on bounded sets function f:X×X→[0,1] admitting all uniformly continuous on bounded sets functions X→[0,1] as fibers looks correct now. It also looks like it can be easily adapted to show that there is no uniformly continuous f:X×X→[0,1] admitting all uniformly continuous functions X→[0,1] as fibers. Come to think of it, your proof works for arbitrary metric spaces X, not just complete separable metric spaces, though those are nicer.

I see what you mean now about uniform continuity giving you an algorithm, but I still don’t think that’s specific to uniform continuity in an important way. After all, if you have an algorithm for computing images of points in the countable dense set, and a computable “local modulus of continuity” in the sense of a computable function h:X×[0,∞)→[0,∞) with h(x,0)=0 and d(x,y)<r⟹|f(y)−f(x)|<h(x,r), then f is computable, and this does not require f to be uniformly continuous. Although I suppose you could object that this is a bit circular, in that I’m assuming the “local modulus of continuity” is computable only in the standard sense, which does not require uniform continuity.
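As a rough illustration of the "local modulus of continuity" argument, here is a minimal sketch for f(x) = x² on R, which is not uniformly continuous but has the local modulus h(q,r) = (2|q|+r)·r. Everything named here (`make_oracle`, `compute_with_local_modulus`, the choice of f and h) is an assumption made for the example, and the oracle is a stand-in for a genuine encoding of a real number by rational approximations.

```python
from fractions import Fraction
import math

def f_on_dense(q):
    """f evaluated exactly on the dense set (the rationals)."""
    return q * q

def local_modulus(q, r):
    """h(q, r): if |x - q| < r then |x^2 - q^2| < (2|q| + r) * r."""
    return (2 * abs(q) + r) * r

def make_oracle(x):
    """Hypothetical oracle for a point x: approx(r) is a rational within r of x."""
    def approx(r):
        step = r / 2
        return Fraction(math.floor(x / step)) * step  # x - step < result <= x
    return approx

def compute_with_local_modulus(approx, eps):
    """Return a rational within eps of f(x), using only approx, f_on_dense and h."""
    r = Fraction(1)
    while True:
        q = approx(r)
        if local_modulus(q, r) <= eps:
            return f_on_dense(q)  # |f(x) - f(q)| < h(q, r) <= eps
        r /= 2

if __name__ == "__main__":
    x = Fraction(1000001, 7)
    print(float(compute_with_local_modulus(make_oracle(x), Fraction(1, 1000))))
```

The required precision r depends on where x is, which is the point: the search terminates at every x even though no single r works for all of R.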

I’m not sure why you would allow singularities at some points (presumably a uniformly discrete set, or something like that) while still insisting on uniform continuity elsewhere. It still seems to me that the arguments for uniform continuity rather than continuity all point to wanting uniform continuity entirely, rather than some sense of local uniform continuity in most places.

Thanks for pointing out the error in my definition of fn; I’ve fixed it.

In your argument that locally compact Polish spaces can be given metrics with respect to which they are proper, it isn’t true that d′ is necessarily a proper metric. For instance, consider a countably infinite set with d(x,y)=1 for x≠y. This is a locally compact Polish space, but f(x)=1 for every x, so d′=d, and the space is not proper.

Your last proposition looks correct (though with a typo: last ⊆ in the proof should be ⊇). However, if X is not locally compact, then the compact-open topology isn’t necessarily the right topology to consider on [0,1]^X. We want a topology making [0,1]^X into an exponential object, and it isn’t clear that such a topology even exists, or that it is the compact-open topology if it does exist (though it must be a refinement of the compact-open topology if it does exist). Maybe asking about non-locally compact Polish spaces X with a Polish exponential space [0,1]^X is a kind of weird question, though, and if we’re even considering non-locally compact Polish spaces, we should turn to the version of the question where we just want a continuous function X×X→[0,1] admitting all continuous functions X→[0,1] as fibers.

I will have to think more about the issue of continuity vs uniform continuity. I suppose my last remaining argument would be the fact that Bishop and Bridges’ classic book on constructive analysis uses uniform continuity on bounded sets rather than continuity, which suggests that it is probably better for constructive analysis at least. But maybe they did not analyze the issue carefully enough, or maybe the relevant issues here are for some reason different.

To fix the argument that every locally compact Polish space admits a proper metric, let f be as before and let F(x,y) = ∞ if d(x,y) ≥ f(x) and F(x,y) = f(x)/[f(x) − d(x,y)] if d(x,y) < f(x). Next, let g(y) = min_n [n + F(x_n,y)], where (x_n) is a countable dense sequence. Then g is continuous and everywhere finite. Moreover, if S = g^{−1}([0,N]), then S ⊆ ⋃_{n≤N} B(x_n, (1−1/N)f(x_n)) and thus S is compact. It follows that the metric d′(x,y) = d(x,y) + |g(y) − g(x)| is proper.

Anyway I have fixed the typo in my previous post.

Hm, perhaps I should figure out what the significance of uniform continuity on bounded sets is in constructive analysis before dismissing it, even though I don’t see the appeal myself, since constructive analysis is not a field I know much about, but could potentially be relevant here.

f is the reciprocal of what it was before, but yes, this looks good. I am happy with this proof.

Ah, you’re right. The proof can be fixed by changing the division between the two cases. So here is the new proof, with more details added regarding the construction of g:

Case 1: B(0,m) is uniformly discrete for all m. Then every map from X to [0,1] is uniformly continuous on bounded sets, so we get a contradiction from cardinality considerations.

Case 2: B(0,m) is not uniformly discrete for some m. Then for each n≥m, since f is uniformly continuous on B(0,n) it has a modulus of continuity on this set, i.e. a continuous increasing function h_n:(0,∞)→(0,∞) such that d(f(x),f(y)) < h_n(d(x,y)) for all x,y ∈ B(0,n) ⊆ X×X. Since B(0,m) is not uniformly discrete, there exist x_n,y_n ∈ B(0,m) such that h_n(d(x_n,y_n)) < 2^{−n} and d(x_n,y_n) < 2^{−n}. We can extract a subsequence (n_k) such that if X_k = x_{n_k} and Y_k = y_{n_k}, then d(X_{k+1},Y_{k+1}) ≤ (1/3)d(X_k,Y_k) for all k, and for all k and ℓ>k, either (A) d(X_k,Y_ℓ) ≥ (1/3)d(X_k,Y_k) or (B) d(Y_k,X_ℓ) ≥ (1/3)d(X_k,Y_k). By extracting a further subsequence we can assume that which of (A) or (B) holds depends only on k and not on ℓ. By swapping X_k and Y_k if necessary we can assume that case (A) always holds. Now let j:[0,∞)→[0,∞) be an increasing continuous function such that j((1/3)d(X_k,Y_k)) > 2^{−n_k} for all k. Finally, let g(y) = inf_k j(d(X_k,y)). Then for all k we have g(X_k)=0 but g(Y_k) > 2^{−n_k}. Clearly, g cannot be a fiber of f.

Regarding the appropriateness of metric spaces / uniform continuity rather than topological spaces / abstract continuity, here are some of the reasons behind my intuition here (developed working in mathematical analysis, specifically Diophantine approximation, and also constructive mathematics):

The obvious: metric spaces are explicitly meant to represent the intuitive notion of alikeness as a quantitative concept (i.e. distance), whereas topological spaces have no explicit notion of alikeness.

In computability theory, one is interested in the question of how to computationally represent a point or an approximation to a point in a space. The standard way to do this is via restricting to the class of complete separable metric spaces, fixing a countable dense sequence (xn) (assumed to be representative of the structure of the metric space), and defining a computational approximation to a point to be an expression of the form B(xn,1/m). Since n and m are integers this expression can be coded as finite data. One then defines a computational encoding of a point to be an infinite bitstream consisting of computational approximations that converge to the point.
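A minimal sketch of such an encoding, for the single point √2 in R. The choices here are assumptions for illustration: the dense sequence is taken to be the dyadic rationals, each approximation is yielded directly as a (center, radius) pair rather than as an index pair (n, m), and the name `sqrt2_balls` is made up.

```python
from fractions import Fraction
from itertools import islice
from math import isqrt

def sqrt2_balls():
    """Yield rational balls (center, radius) that contain sqrt(2) and shrink to it."""
    m = 1
    while True:
        # floor(m * sqrt(2)) / m lies strictly within 1/m below sqrt(2)
        center = Fraction(isqrt(2 * m * m), m)
        yield center, Fraction(1, m)
        m *= 2

if __name__ == "__main__":
    for center, radius in islice(sqrt2_balls(), 6):
        print(f"B({center}, {radius})")
```

Each item is finite data, and the infinite stream determines the point; any consumer only ever reads a finite prefix of it.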

In practical applications, in the end you will want everything to be computable. So it makes sense to work in a framework where there are natural notions of computability. I am not aware of any such notions for general topological spaces.

Regarding continuity vs uniform continuity in metric spaces, both are saying that if two points are close in the domain, their images are also close. But the latter gives you a straightforward estimate as to how close, whereas the former says that the degree of closeness may depend on one of the points. Now, there are good reasons to consider such dependence, since even natural functions on the real numbers (such as x^2 or 1/x) have “singularities” where they are not uniformly continuous.

So the question is whether to modify the notion of uniform continuity to directly account for singularities, or to use the standard definition of continuity instead. But if one works with the standard definition, then most of the time one is really looking for ways to sneak back to uniform continuity, e.g. by using the fact that a continuous function on a compact set is uniformly continuous.

An intuitive way of thinking about the fact that a continuous function on a compact set is uniformly continuous is that the notion of compactness means that there are no singularities present “within the space”. For example, if we go back to the functions x^2 or 1/x, then the singularity of the first occurs at infinity, while the singularity of the latter occurs at 0. If we take a compact subset of the domain of either function, then what it really means is that we are avoiding the singularity.

By contrast, non-compactness should mean that there are singularities. In some cases like (0,1) it is easy to identify what the singularities are. But if we are dealing with spaces that are not locally compact, like N^N or an infinite-dimensional Hilbert space, then it is not as clear what the singularities are; there is just a general sense that they are dispersed “throughout the space” (because the space is not locally compact).

But you have to ask yourself, are these singularities real or just imagined? In many cases, imagined. For example, in the theory of Banach spaces continuous linear maps are always uniformly continuous.

What about a map that is not uniformly continuous, like the inversion map f(x) = x/∥x∥^2 in infinite-dimensional Hilbert space? In this case, there is still a singularity, at 0, and the definition of continuity needs to reflect that. But it doesn’t help to imagine all sorts of other singularities dispersed throughout the space, because that prevents you from making useful statements like: if x,y are at least α away from 0 and d(x,y)≤ε, then d(f(x),f(y)) ≤ Cε/α^2, where C is an absolute constant.
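For the inversion map the estimate can be checked numerically. The sketch below uses random vectors in R^5 as a finite-dimensional stand-in for Hilbert space (an assumption of the example) and relies on the standard identity ∥f(x) − f(y)∥ = ∥x − y∥/(∥x∥∥y∥), which gives the bound with C = 1; the function names are made up.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dist(u, v):
    return norm([a - b for a, b in zip(u, v)])

def inversion(v):
    """f(x) = x / ||x||^2, defined for x != 0."""
    s = sum(c * c for c in v)
    return [c / s for c in v]

def worst_ratio(trials=1000, dim=5, seed=0):
    """Largest observed value of d(f(x), f(y)) / (d(x, y) / alpha^2)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = [rng.uniform(-1, 1) for _ in range(dim)]
        alpha = min(norm(x), norm(y))  # both points are at least alpha from 0
        if alpha < 0.1:
            continue  # skip pairs too close to the singularity for float accuracy
        lhs = dist(inversion(x), inversion(y))
        exact = dist(x, y) / (norm(x) * norm(y))
        assert math.isclose(lhs, exact, rel_tol=1e-9, abs_tol=1e-12)
        worst = max(worst, lhs / (dist(x, y) / alpha ** 2))
    return worst

if __name__ == "__main__":
    print(worst_ratio())  # never exceeds 1, up to rounding
```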

Now the example in the previous paragraph is an example of quantitative continuity, which is stronger than uniform continuity away from singularities. But the point is that it can be seen as an extension of uniform continuity away from singularities.

Maybe my last reason will be the most relevant from a naturalized agent perspective. The notion of uniform continuity is important because it introduces the modulus of continuity, which can be viewed as a measure of how continuous a function is. The restriction that an agent must be uniformly continuous can then be thought of in a quantitative sense, with “better” agents being less bound by this restriction. So a more powerful agent may have a looser (larger) modulus of continuity, because it can react more precisely to different possible inputs.

In this terminology, my proof can be thought of as giving an intuitive reason for why the agent cannot implement every possible policy: the agent has limited resources to distinguish different inputs, so it can only implement those policies that can be implemented with these limited resources.

The obvious follow-up question is this: if you restrict your attention to the policies that the agent isn’t prevented from implementing due to its limited resources, can it then implement every such policy? In other words, if you fix a modulus of continuity from the outset, can you include all functions with that modulus of continuity as fibers?

If you allow the every-policy function to have an arbitrary modulus of continuity unrelated to the modulus of continuity you are trying to imitate, then it is not hard to see that this is possible at least for some spaces. (By Arzelà–Ascoli the space of functions with a fixed modulus of continuity is compact, so there exists a continuous surjection from 2^N to this space.) But this may require greatly increasing the resources that the agent must spend to differentiate inputs. On the other hand, requiring the exact same modulus of continuity seems like too rigid an assumption. So the right question is probably to ask how close the modulus of continuity of the every-policy function can be to the modulus it is trying to imitate.

For this kind of question it is probably better to work with a concrete example rather than trying to prove something in generality, so I will work with the Cantor space X = 2^N with the metric d((x_n),(y_n)) = 2^{−min{n : x_n ≠ y_n}}. Suppose we want to imitate all functions g:X→{0,1} such that d(x,y)<ε implies g(x)=g(y). (I know this is not quite the same as the original question, but I think it is close enough.) If ε = 2^{−n} then such a g depends only on the first n coordinates of its argument, so there are N = 2^{2^n} such functions. Now suppose we have a single function f:X×X→{0,1} that has all of them as fibers, and we want to choose ε′ = 2^{−k} such that d(x,y)<ε′ implies f(x,·)=f(y,·) (i.e. the analogue of the assumption on g but with ε replaced by ε′). Then the fiber f(x,·) depends only on the first k coordinates of x, so there are at most 2^k distinct fibers. By the pigeonhole principle, if 2^k < N then some ball of the form B(x,2^{−k}) contains two points x1 and x2 whose fibers are supposed to differ, i.e. there exists y such that f(x1,y)≠f(x2,y). It follows that we need 2^k ≥ N, that is, ε′ ≤ 2^{−2^n}.

In conclusion, the required accuracy ε′ ≤ 2^{−2^n} = 2^{−1/ε} of f is exponentially small in 1/ε, i.e. doubly exponential in the number n of coordinates that g inspects, where ε is the required accuracy of g. Thus, it is not feasible to implement such a function.
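The counting can be checked by brute force for small n. In this sketch, finite bit tuples stand in for points of the Cantor space, and the names (`locally_constant_functions`, `min_prefix_bits`, `universal_f`) are made up for illustration. It counts the functions g that depend on n bits, finds how many leading bits of x are needed to index them all, and builds one lookup-style f that realizes every g as a fiber.

```python
from itertools import product

def locally_constant_functions(n):
    """All g: {0,1}^n -> {0,1}, each as a tuple of 2^n output bits."""
    return list(product((0, 1), repeat=2 ** n))

def min_prefix_bits(n):
    """Smallest k with 2^k >= number of functions: bits of x that f must read."""
    count = len(locally_constant_functions(n))
    k = 0
    while 2 ** k < count:
        k += 1
    return k

def universal_f(n):
    """f(x, y): treat the first 2^n bits of x as a truth table, look up y's first n bits."""
    def f(x_bits, y_bits):
        y_index = int("".join(str(b) for b in y_bits[:n]), 2)
        return x_bits[y_index]
    return f

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(locally_constant_functions(n)), min_prefix_bits(n))
```

Under these assumptions f has to read 2^n bits of x while each g reads only n bits of y, so the blow-up in required precision is already visible at n = 3.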

I give a stronger version of this problem here.

“Self-Reference and Fixed Points: A Discussion and an Extension of Lawvere’s Theorem” by Jorge Soto-Andrade and Francisco J. Varela seems like a potentially relevant result. In particular, they prove a converse Lawvere result in the category of posets (though they mention doing this for [0,1] in an unsolved problem). I’m currently reading through this and related papers with an eye to adapting their construction to [0,1]. (I think you can’t just use it straightforwardly because even though you can build a reflexive domain with a retract to an arbitrary poset, the paper uses a different notion of continuity for posets.)

Can you argue that X must have a semi-metric compatible with the topology by using d(x,y) = sup_{z∈X} |h(x,z) − h(y,z)|?
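Here is a small finite sketch of that formula, assuming h is the map X×X→[0,1] under discussion. In the code, X is a six-point set, h is a hypothetical random table, and the sup becomes a max. It only illustrates that d is symmetric, satisfies the triangle inequality, and can vanish on distinct points with identical fibers; it says nothing about compatibility with the topology.

```python
import random

def sup_semimetric(h, points):
    """Return d with d(x, y) = max over z in points of |h(x, z) - h(y, z)|."""
    def d(x, y):
        return max(abs(h(x, z) - h(y, z)) for z in points)
    return d

def example(seed=0):
    """A six-point space with a random h, where points 0 and 1 share a fiber."""
    rng = random.Random(seed)
    points = list(range(6))
    table = [[rng.random() for _ in points] for _ in points]
    table[1] = list(table[0])  # identical fibers, so d(0, 1) will be 0

    def h(x, z):
        return table[x][z]

    return points, sup_semimetric(h, points)

if __name__ == "__main__":
    points, d = example()
    print(d(0, 1), d(0, 2))
```

Quotienting by d(x,y)=0 identifies exactly the points with equal fibers, which is the projection the next suggestion starts from.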

I’m wondering if you can generalise this to some sort of argument that goes like this. Using X, project down via π from X_0 = X to X_1 = X_0/d. Let ϕ be our initial surjection; it’s now a bijection between X_1 and maps from X_0 to [0,1].

If the projection is continuous, then every map from X_1 to [0,1] lifts to a map from X_0 to [0,1]. Restricting to the subset of maps that are lifts like this, and applying ϕ^{−1}, gives a subset X_2 ⊂ X_1. We now have a new equivalence relationship, maps from X_1 that are equal to each other on X_2. Project down from X_2 by this relationship, to generate X_3. Continue this transfinitely often (?) to generate a space X′ where ϕ is a homeomorphism, and find a contradiction?

This feels dubious, but maybe worth mentioning...