“Controlling which Everett branch you end up in” is the wrong way to think about decisions, even if many-worlds is true. Brains don’t appear to rely much on quantum randomness, so if you make a certain decision, that probably means that the overwhelming majority of identical copies of you make the same decision. You aren’t controlling which copy you are; you’re controlling what all of the copies do. And even if quantum randomness does end up mattering in decisions, so that a non-trivial proportion of copies of you make different decisions from each other, you would still presumably want a high proportion of them to make good decisions; you can do your part to bring that about by making good decisions yourself.
“Consider reading a real physicist’s take on the issue”
This seems phrased to suggest that her view is “the real physicist view” on the multiverse. You could also read what Max Tegmark or David Deutsch, for instance, have to say about multiverse hypotheses and get a “real physicist’s” view from them.
Also, she doesn’t actually say much in that blog post. She points out that when she says that multiverse hypotheses are unscientific, she doesn’t mean that they’re false, so this doesn’t seem especially useful to someone who wants to know whether there actually is a multiverse, or is interested in the consequences thereof. She says “there is no reason to think we live in such multiverses to begin with”, but proponents of multiverse hypotheses have given reasons to support their views, which she doesn’t address.
#1 (at the end) sounds like complexity theory.
Some of what von Neumann says makes it sound like he’s interested in a mathematical foundation for analog computing, which I think has been done by now.
On several occasions, the authors emphasize how the intuitive nature of “effective computability” renders futile any attempt to formalize the thesis. However, I’m rather interested in formalizing intuitive concepts and therefore wondered why this hasn’t been attempted.
Formalizing the intuitive notion of effective computability was exactly what Turing was trying to do when he introduced Turing machines, and Turing’s thesis claims that his attempt was successful. If you come up with a new formalization of effective computability and prove it equivalent to Turing computability, then in order to use this as a proof of Turing’s thesis, you would need to argue that your new formalization is correct. But such an argument would inevitably be informal, since it links a formal concept to an informal concept, and there already have been informal arguments for Turing’s thesis, so I don’t think there is anything really fundamental to be gained from this.
Consider the halting set; … is not enumerable / computable. …Here, we should be careful with how we interpret “information”. After all, coNP-complete problems are trivially Cook reducible to their NP-complete counterparts (e.g., query the oracle and then negate the output), but many believe that there isn’t a corresponding Karp reduction (where we do a polynomial amount of computation before querying the oracle and returning its answer). Since we aren’t considering complexity but instead whether it’s enumerable at all, complementation is fine.
You’re using the word “enumerable” in a nonstandard way here, which might indicate that you’ve missed something (and if not, then perhaps at least this will be useful for someone else reading this). “Enumerable” is not usually used as a synonym for computable. A set is computable if there is a program that determines whether or not its input is in the set. But a set is enumerable if there is a program that halts if its input is in the set, and does not halt otherwise. Every computable set is enumerable (since you can just use the output of the computation to decide whether or not to halt). But the halting set is an example of a set that is enumerable but not computable (it is enumerable because you can just run the program coded by your input, and halt if/when it halts). Enumerable sets are not closed under complementation; in fact, an enumerable set whose complement is enumerable is computable (because you can run the programs for the set and its complement in parallel on the same input; eventually one of them will halt, which will tell you whether or not the input is in the set).
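To make the parallel-run construction concrete, here’s a minimal Python sketch (the generator-as-program model and all the names in it are my own toy framing, not anything from the original discussion): a “program” is a generator that yields once per computation step and halts by returning.

```python
def decide(semi_in, semi_out, x):
    """Decide membership of x given a semi-decider for a set (semi_in) and
    one for its complement (semi_out), by interleaving one step of each
    until one halts. Exactly one of them must halt, so this always returns."""
    a, b = semi_in(x), semi_out(x)
    while True:
        try:
            next(a)
        except StopIteration:
            return True      # semi_in halted: x is in the set
        try:
            next(b)
        except StopIteration:
            return False     # semi_out halted: x is not in the set

# Toy example: the even numbers, via deliberately slow semi-deciders.
def even_semi(x):
    for _ in range(x):       # burn x steps before answering
        yield
    if x % 2 != 0:           # never halt on odd inputs
        while True:
            yield

def odd_semi(x):
    for _ in range(x):
        yield
    if x % 2 == 0:           # never halt on even inputs
        while True:
            yield

print([decide(even_semi, odd_semi, n) for n in range(6)])
# [True, False, True, False, True, False]
```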
The distinction between Cook and Karp reductions remains meaningful when “polynomial-time” is replaced by “Turing computable” in the definitions. Any set that is Turing-Karp reducible to an enumerable set is also enumerable, but an enumerable set is Turing-Cook reducible to its complement, which need not be enumerable.
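In the same toy generator model as above (again, the function names are mine), the asymmetry looks like this:

```python
def karp_reduce(f, semi_B):
    """A Karp-style (many-one) reduction f from A to B turns a semi-decider
    for B into a semi-decider for A: the answer for x is exactly the answer
    for f(x), so non-halting on non-members carries over harmlessly."""
    return lambda x: semi_B(f(x))

def cook_complement(decide_B, x):
    """A Cook-style reduction of the complement of B to B: query and negate.
    This needs a total oracle for B; a semi-decider that merely fails to
    halt on non-members cannot be "negated", which is why complements of
    enumerable sets need not be enumerable."""
    return not decide_B(x)
```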
The reason “enumerable” is used for this concept is that a set is enumerable iff there is a program computing a sequence that enumerates every element of the set. Given a program that halts on exactly the elements of a given set, you can construct an enumeration of the set by running your program on every input in parallel, and adding an element to the end of your sequence whenever the program halts on that input. Conversely, given an enumeration of a set, you can construct a program that halts on elements of the set by going through the sequence and halting whenever you find your input.
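Here’s a sketch of the dovetailing direction in the same toy model, reusing even_semi from above (all names mine):

```python
import itertools

def enumerate_set(semi):
    """Dovetailing: start semi(n) for each n in turn, advance every running
    machine one step per round, and output n whenever semi(n) halts.
    Every element of the set appears eventually; no single non-halting
    machine can block the others."""
    running = {}
    for n in itertools.count():
        running[n] = semi(n)            # start one new machine per round
        for m, machine in list(running.items()):
            try:
                next(machine)
            except StopIteration:       # semi(m) halted: m is in the set
                del running[m]
                yield m

gen = enumerate_set(even_semi)
print([next(gen) for _ in range(5)])    # [0, 2, 4, 6, 8]
```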
I don’t follow the analogy you’re getting at to 1/x being a partial function.
Maybe a better way to explain what I’m getting at is that it’s really the same issue that I pointed out for the two-envelopes problem, where you know the amount of money in each envelope is finite, but the uniform distribution up to an infinite surreal would suggest that the probability that the amount of money is finite is infinitesimal. Suppose you say that the size of the ray [0,∞) is an infinite surreal number n. The size of the portion of this ray that is at distance at least r from 0 is n−r when r is a positive real, so presumably you would also want this to hold for surreal r. But take, say, r := √n: every point in [0,∞) is within distance √n of 0, yet this rule would say that the measure of the portion of the ray farther than √n from 0 is n−√n; that is, almost all of the measure of [0,∞) is concentrated on the empty set.
The latter. It doesn’t even make sense to speak of maximizing the expectation of an unbounded utility function, because unbounded functions don’t even have expectations with respect to all probability distributions.
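A minimal numeric illustration, with a St. Petersburg-style lottery of my own choosing as the example distribution:

```python
# If P(outcome k) = 2**-k for k = 1, 2, ... and U(outcome k) = 2**k,
# the partial sums of E[U] = sum_k P(k) * U(k) grow without bound, so
# this unbounded utility has no (finite) expectation under this distribution.
for cutoff in (10, 100, 1000):
    print(cutoff, sum(2.0**-k * 2.0**k for k in range(1, cutoff + 1)))
# prints 10.0, 100.0, 1000.0: the sum diverges linearly in the cutoff
```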
There is a way out of this that you could take, which is to only insist that the utility function has to have an expectation with respect to probability distributions in some restricted class, if you know your options are all going to be from that restricted class. I don’t find this very satisfying, but it works. And it offers its own solution to Pascal’s mugging, by insisting that any outcome whose utility is on the scale of 3^^^3 has prior probability on the scale of 1/(3^^^3) or lower.
It’s a bad bullet to bite. Its symmetries are essential to what makes Euclidean space interesting.
And here’s another one: are you not bothered by the lack of countable additivity? Suppose you say that the volume of Euclidean space is some surreal number n. Euclidean space is the union of an increasing sequence of balls. The volumes of these balls are all finite, in particular, less than n/2, so how can you justify saying that their union has volume greater than n/2?
Why? Plain sequences are a perfectly natural object of study. I’ll echo gjm’s criticism that you seem to be trying to “resolve” paradoxes by changing the definitions of the words people use so that they refer to unnatural concepts that have been gerrymandered to fit your solution, while refusing to talk about the natural concepts that people actually care about.
I don’t think your proposal is a good one for indexed sequences either. It is pretty weird that shifting the indices of your sequence over by 1 could change the size of the sequence.
What about rotations, and the fact that we’re talking about destroying a bunch of symmetry of the plane?
There are measurable sets whose volumes will not be preserved if you try to measure them with surreal numbers. For example, consider [0,∞)⊆R. Say its measure is some infinite surreal number n. The volume-preserving left-shift operation x↦x−1 sends [0,∞) to [−1,∞), which has measure 1+n, since [−1,0) has measure 1. You can do essentially the same thing in higher dimensions, and the shift operation in two dimensions ((x,y)↦(x−1,y)) can be expressed as the composition of two rotations, so rotations can’t be volume-preserving either. And since different rotations will have to fail to preserve volumes in different ways, this will break symmetries of the plane.
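Here’s a quick numpy check of the decomposition into two rotations; the particular centers and angles are my choice, not something from the post:

```python
import numpy as np

# Checking that the left-shift (x, y) -> (x - 1, y) is a composition of two
# rotations: rotation by pi about (0, 0), then rotation by pi about (-1/2, 0).
def rotate_pi(center):
    c = np.asarray(center, dtype=float)
    return lambda p: 2 * c - np.asarray(p, dtype=float)  # rotation by pi = point reflection

shift = lambda p: rotate_pi((-0.5, 0.0))(rotate_pi((0.0, 0.0))(p))

for p in [(0.0, 0.0), (3.0, 4.0), (-2.5, 7.0)]:
    print(p, '->', shift(p))  # each prints (x - 1, y)
```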
I wouldn’t say that volume-preserving transformations fail to preserve volume on non-measurable sets, just that non-measurable sets don’t even have measures that could be preserved or not preserved. Failing to preserve measures of sets that you have assigned measures to is entirely different. Non-measurable sets also don’t arise in mathematical practice; half-spaces do. I’m also skeptical of the existence of non-measurable sets, but the non-existence of non-measurable sets is a far bolder claim than anything else I’ve said here.
Indeed Pascal’s Mugging type issues are already present with the more standard infinities.
Right, infinity of any kind (surreal or otherwise) doesn’t belong in decision theory.
“Surreal numbers are not the right tool for measuring the volume of Euclidean space or the duration of forever”—why?
How would you? If you do something like taking an increasing sequence of bounded subsets that fill up the space you’re trying to measure, find a formula f(n) for the volume of the nth subset, and plug in f(ω), the result will be highly dependent on which increasing sequence of bounded subsets you use. Did you have a different proposal? It’s sort of hard to explain why no method for measuring volumes using surreal numbers can possibly work well, though I am confident it is true. At the very least, volume-preserving transformations like shifting everything 1 meter to the left or rotating everything around some axis cease to be volume-preserving, though I don’t know if you’d find this convincing.
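To spell out the sequence-dependence with a toy instance (my example, not the post’s):

```latex
% Two increasing sequences of bounded sets that both exhaust the ray [0, \infty):
\bigcup_{k<\omega} [0,k] = [0,\infty), \qquad f(k) = k \;\Rightarrow\; f(\omega) = \omega,
\bigcup_{k<\omega} [0,k^2] = [0,\infty), \qquad f(k) = k^2 \;\Rightarrow\; f(\omega) = \omega^2.
% Same set, two different surreal "volumes", depending only on the chosen sequence.
```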
You want to conceive of this problem as “a sequence whose order-type is ω”, but from the surreal perspective this lacks resolution. Is the number of elements (surreal) ω, ω+1 or ω+1000? All of these are possible given that in the ordinals 1+ω=ω so we can add arbitrarily many numbers to the start of a sequence without changing its order type.
It seems to me that measuring the lengths of sequences with surreals rather than ordinals is introducing fake resolution that shouldn’t be there. If you start with an infinite constant sequence 1,1,1,1,1,1,..., and tell me the sequence has size ω, and then you add another 1 to the beginning to get 1,1,1,1,1,1,1,..., and you tell me the new sequence has size ω+1, I’ll be like “uh, but those are the same sequence, though. How can they have different sizes?”
Surreal numbers are useless for all of these paradoxes.
Infinitarian paralysis: Using surreal-valued utilities creates more infinitarian paralysis than it solves, I think. You’ll never take an opportunity to increase utility by x, because it will always have higher expected utility to focus all of your attention on trying to find ways to increase utility by >ωx: there’s some (however small) probability p>0 that such efforts would succeed, so focusing your efforts on looking for ways to increase utility by >ωx has expected utility >pωx, which is higher than x. I think a better solution would be to note that for any person, a nonzero fraction of people are close enough to identical to that person that they will make the same decisions, so any decision that anyone makes affects a nonzero fraction of people. Measure theory is probably a better framework than surreal numbers for formalizing what is meant by “fraction” here.
Paradox of the gods: The introduction of surreal numbers solves nothing. Why wouldn’t he be able to advance more than 2^−ω miles if no gods erect any barriers until he advances 2^−n miles for some finite n?
Two-envelopes paradox: it doesn’t make sense to model your uncertainty over how much money is in the first envelope with a uniform surreal-valued probability distribution on [1/n, n] for an infinite surreal n, because then the probability that there is a finite amount of money in the envelope is infinitesimal, but we’re trying to model the situation in which we know there’s a finite amount of money in the envelope and just have no idea which finite amount.
Sphere of suffering: Surreal numbers are not the right tool for measuring the volume of Euclidean space or the duration of forever.
Hilbert hotel: As you mentioned, using surreals in the way you propose changes the problem.
Trumped, Trouble in St. Petersburg, Soccer teams, Can God choose an integer at random?, The Headache: Using surreals in the way you propose in each of these changes the problems in exactly the same way it does for the Hilbert hotel.
St. Petersburg paradox: If you pay infinity dollars to play the game, then you lose infinity dollars with probability 1. Doesn’t sound like a great deal.
Banach-Tarski Paradox: The free group consists only of sequences of finite length.
The Magic Dartboard: First, a nitpick: that proof relies on the continuum hypothesis, which is independent of ZFC. Aside from that, the proof is correct, which means any resolution along the lines you’re imagining that imply that no magic dartboards exist is going to imply that the continuum hypothesis is false. Worse, the fact that for any countable ordinal, there are countably many smaller countable ordinals and uncountably many larger countable ordinals follows from very minimal mathematical assumptions, and is often used in descriptive set theory without bringing in the continuum hypothesis at all, so if you start trying to change math to make sense of “the second half of the countable ordinals”, you’re going to have a bad time.
Parity paradoxes: The lengths of the sequences involved here are the ordinal ω, not a surreal number. You might object that there is also a surreal number called ω, but this is different from the ordinal ω. Arithmetic operations act differently on ordinals than they do on the copies of those ordinals in the surreal numbers, so there’s no reasonable sense in which the surreals contain the ordinals. Example: if you add another element to the beginning of either sequence (i.e. flip the switch at t=−2, or add a −1 at the beginning of the sum, respectively), then you’ve added one thing, so the surreal number should increase by 1, but the order-type is unchanged, so the ordinal remains the same.
The agent could be programmed to have a certain hard-coded ontology rather than searching through all possible hypotheses weighted by description length.
I haven’t heard the term “platonic goals” before. There’s been plenty written on capability control before, but I don’t know of anything written before on the strategy I described in this post (although it’s entirely possible that there’s been previous writing on the topic that I’m not aware of).
Are you worried about leaks from the abstract computational process into the real world, leaks from the real world into the abstract computational process, or both? (Or maybe neither and I’m misunderstanding your concern?)
There will definitely be tons of leaks from the abstract computational process into the real world; just looking at the result is already such a leak. The point is that the AI should have no incentive to optimize such leaks, not that the leaks don’t exist, so the existence of additional leaks that we didn’t know about shouldn’t be concerning.
Leaks from the outside world into the computational abstraction would be more concerning, since the whole point is to prevent those from existing. It seems like it should be possible to make hardware arbitrarily reliable by devoting enough resources to error detection and correction, which would prevent such leaks, though I’m not an expert, so it would be good to know if this is wrong. There may be other ways to get the AI to act similarly to the way it would in the idealized toy world even when hardware errors create small differences. This is certainly the sort of thing we would want to take seriously if hardware can’t be made arbitrarily reliable.
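For what it’s worth, here’s a toy majority-vote sketch of the error-correction idea; it’s a cartoon of redundancy under independent faults (all numbers and names mine), not a real fault-tolerance design:

```python
import random

def voted_bit(true_bit, p, units):
    """Compute one bit on `units` redundant components, each of which
    independently flips its output with probability p, then majority-vote."""
    votes = sum(true_bit if random.random() > p else 1 - true_bit
                for _ in range(units))
    return 1 if votes > units // 2 else 0

def error_rate(p, units, trials=100_000):
    return sum(voted_bit(1, p, units) != 1 for _ in range(trials)) / trials

for units in (1, 3, 7, 15):
    print(units, error_rate(0.01, units))
# The voted error rate falls from ~1e-2 toward 0 as redundancy grows:
# more resources buy arbitrarily high reliability against independent faults.
```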
Incidentally, that story about the accidental creation of a radio with an evolutionary algorithm was part of what motivated my post in the first place. If the evolutionary algorithm had tested its oscillator designs in a computer model, rather than in the real world, then it would not have built a radio receiver, since radio signals from nearby computers would not have been included in the computer model of the environment, even though they were present in the actual environment.
What I meant was that the computation isn’t extremely long in the sense of description length, not in the sense of computation time. Also, we aren’t doing policy search over the set of all Turing machines; we’re doing policy search over some smaller set of policies that can be guaranteed to halt in a reasonable time (and more can be added as time goes on).
Wouldn’t the set of all action sequences have lower description length than some large finite set of policies? There’s also the potential problem that all of the policies in the large finite set you’re searching over could be quite far from optimal.
Ok, understood on the second assumption. U is not a function to [0,1], but a function to the set of [0,1]-valued random variables, and your assumption is that this random variable is uncorrelated with certain claims about the outputs of certain policies. The intuitive explanation of the third condition made sense; my complaint was that even with the intended interpretation at hand, the formal statement made no sense to me.
I’m pretty sure you’re assuming that ϕ is resolved on day n, not that it is resolved eventually.
Searching over the set of all Turing machines won’t halt in a reasonably short amount of time, and in fact won’t halt ever, since the set of all Turing machines is non-compact. So I don’t see what you mean when you say that the computation is not extremely long.