Problem is, there isn’t necessarily a modular procedure used to identify yourself. It may just be some sort of hard-coded index. A Solomonoff inductor will reason over all possible such indices by reasoning over all programs, and throw out any that turn out not to be consistent with the data. But that behavior is packaged with the inductor, which is not itself a program.
I’m about 80% on board with that argument.
The main loophole I see is that number-of-embedded-agents may not be decidable. That would make a lot of sense, since embedded-agent-detectors are exactly the sort of thing which would help circumvent diagonalization barriers. That does run into the second part of your argument, but notice that there’s no reason we need to detect all the agents using a single program in order for the main problem setup to work. They can be addressed one-by-one, by ad-hoc programs, each encoding one of the hypotheses (world model, agent location).
(Personally, though, I don’t expect number-of-embedded-agents to be undecidable, at least for environments with some kind of private random bit sources.)
Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.
The problem setup doesn’t necessarily require asserting the existence of n people. It just requires setting up a universe in which n people happen to exist. That could take considerably less than K(n) bits, if person-detection is itself fairly expensive. We could even index directly to the Solomonoff inductor’s input data without attempting to recognize any agents; that would circumvent the K(number of people) issue.
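A toy illustration of the gap between K(n) and the literal size of n (my own sketch, not from the post): the direct binary encoding of n grows like log2(n), but a short program can name an enormous n — or generate a universe containing enormously many people — without ever spelling the count out.

```python
# Toy illustration (my own, not the post's): the cost of *naming* n
# directly grows like log2(n), but a short program can specify a huge
# n without paying that cost explicitly.

def literal_bits(n: int) -> int:
    """Bits needed to write n out directly in binary."""
    return n.bit_length()

# Naming 10^10 directly costs ~34 bits, versus 1 bit for 1...
print(literal_bits(1))        # 1
print(literal_bits(10**10))   # 34

# ...but the program text "10**10" is only 6 characters, and one extra
# character ("10**100") names a 333-bit number. K(n) tracks the shortest
# program, not the literal encoding, so structured large n is cheap.
print(len("10**10"), len("10**100"))
```

The same logic applies to whole universes: a short world-program can contain huge numbers of people, which is why "large universe sizes have to get less likely" doesn't bite as hard as the raw K(n) count suggests.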
“Okay, so you’re saying the actual hypotheses that predict my observations, which I should assign probability to according to their complexity, are things like ‘T1 and I’m person #1’ or ‘T2 and I’m person #10^10’?” says the Solomonoff inductor.

“Exactly.”

“But I’m still confused. Because it still requires information to say that I’m person #1 or person #10^10. Even if we assume that it’s equally easy to specify where a person is in both theories, it just plain old takes more bits to say 10^10 than it does to say 1.”
I think this section is confused about how the question “T1 or T2” gets encoded for a Solomonoff inductor.
Given the first chunk in the quote above, we don’t have two world models; we have one world model for each person in T1, plus one world model for each person in T2. Our models are (T1 & person 1), (T1 & person 2), …, (T2 & person 1), …. To decide whether we’re in T1 or T2, our Solomonoff inductor will compare the total probability of all the T1 hypotheses to the total probability of all the T2 hypotheses.
Assuming T1 and T2 have exactly the same complexity, then presumably (T1 & person N) should have roughly the same complexity as (T2 & person N). That is not necessarily the case; T1/T2 may contain information which makes encoding some numbers cheaper/more expensive. But it does seem like a reasonable approximation for building intuition.
Anyway, point is, “it just plain old takes more bits to say 10^10 than it does to say 1” isn’t relevant here. There’s no particular reason to compare the two hypotheses (T1 & person 1) vs (T2 & person 10^10); that is not the correct formalization of the T1 vs T2 question.
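A toy numeric version of that comparison (my own sketch; the 2^-(complexity) prior and the approximation K(T & person N) ≈ K(T) + bits(N) are assumptions for illustration): sum the prior mass over all person-indices in each theory, rather than comparing any single pair of hypotheses.

```python
# Toy sketch of the T1-vs-T2 comparison. Assumed for illustration:
# prior(T & person N) ~ 2^-(K(T) + bits(N)), and the T1-vs-T2 question
# is settled by *summing* over person-indices, not by comparing one
# hypothesis from each theory.

def bits(n: int) -> int:
    return n.bit_length()

def total_mass(k_theory: int, num_people: int) -> float:
    """Sum of 2^-(K(T) + bits(N)) over persons N = 1..num_people."""
    return sum(2.0 ** -(k_theory + bits(n)) for n in range(1, num_people + 1))

K = 100                      # assume T1 and T2 have equal complexity
p_t1 = total_mass(K, 1)      # T1: one person
p_t2 = total_mass(K, 10**4)  # T2: many people (10^4 keeps the loop fast)

# Comparing (T1 & person 1) against (T2 & person 10^4) alone favors T1,
# but the summed masses are what matter:
print(p_t2 / p_t1)  # > 1: T2's many hypotheses outweigh its indexing cost
```

The many-person theory loses a little mass on each index but makes it up in sheer number of hypotheses, which is why "more bits to say 10^10" isn't the deciding factor.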
I was under the impression that movie producers DO hire experts for this sort of thing. At the very least, I know they hire science consultants for scientific accuracy problems; I assume they often do the same for historical accuracy.
Let me try another explanation.
The main point is: given a system, we don’t actually have that many degrees of freedom in what abstractions to use in order to reason about the system. That’s a core component of my research: the underlying structure of a system forces certain abstraction-choices; choosing other abstractions would force us to carry around lots of extra data.
However, if we have the opportunity to design a system, then we can choose what abstraction we want and then choose the system structure to match that abstraction. The number of degrees of freedom expands dramatically.
In programming, we get to design very large chunks of the system; in math and the sciences, less so. It’s not a hard dividing line—there are design problems in the sciences and there are problem constraints in programming—but it’s still a major difference.
In general, we should expect that looking for better abstractions is much more relevant to design problems, simply because the possibility space is so much larger. For problems where the system structure is given, the structure itself dictates the abstraction choice. People do still screw up and pick “wrong” abstractions for a given system, but since the space of choices is relatively small, it takes a lot less exploration to converge to pretty good choices over time.
There is a major difference between programming and math/science with respect to abstraction: in programming, we don’t just get to choose the abstraction, we get to design the system to match that abstraction. In math and the sciences, we don’t get to choose the structure of the underlying system; the only choice we have is in how to model it.
Given a fundamental difference that large, we should expect that many intuitions about abstraction-quality in programming will not generalize to math and the sciences, and I think that is the case for the core argument of this post.
The main issue is that reality has structure (especially causal structure), and we don’t get to choose that structure. In programming, abstraction is a social convenience to a much greater extent; we can design the systems to match the chosen abstractions. But if we choose a poor abstraction in e.g. physics or biology, we will find that we need to carry around tons of data in order to make accurate predictions. For instance, the abstraction of a “cell” in biology is useful mainly because the inside of the cell is largely isolated from the outside; interaction between the two takes place only through a relatively small number of defined chemical/physical channels. It’s like a physical embodiment of function scope; we can make predictions about outside-the-cell behavior without having to track all the details of what happens inside the cell.
To draw a proper analogy between abstraction-choice in biology and programming: imagine that you were performing reverse compilation. You take in assembly code, and attempt to provide equivalent, maximally-human-readable code in some other language. That’s basically the right analogy for abstraction-choice in biology.
Picture that, and hopefully it’s clear that there are far fewer degrees of freedom in the choice of abstraction, compared to normal programming problems. That’s why people in math/science don’t experiment with alternative abstractions very often compared to programming: there just aren’t that many options which make any sense at all. That’s not to say that progress isn’t made from time to time; Feynman’s formulation of quantum mechanics was a big step forward. But there’s not a whole continuum of similarly-decent formulations of quantum mechanics like there is a continuum of similarly-decent programming languages; the abstraction choice is much more constrained.
I understood “ritual” here as not just a blackbox process, but a blackbox process which has undergone cultural selection—i.e. metic knowledge. If we “treat baking as a ritual” in that sense, it would mean carefully following some procedure acquired from someone else, on the assumption that some parts are really important and we don’t have a good way to tell which.
Telomere shortening is an interesting case. (I’m going to give my current understanding here without trying to dig up references, so take it all with a grain of salt.)
It’s clearly a plausible root cause—it’s a change which could stick around on long enough timescales to account for aging. On the other hand, it is possible for telomeres to turn over: telomerase is active in stem cells, so telomere length should at least not be an issue for cell types which regularly turn over—the telomeres turn over with the cells, which are ultimately replaced from the stem cells. For long-lived cells, there’s a stronger case that telomere shortening could be an issue.
Telomeres do get shorter with age, BUT they get shorter even in cell types which turn over regularly. That’s a bit of a red flag—either the telomeres aren’t being fully replaced by telomerase in the stem cells (in which case the stem cells ought to die a lot sooner), or some other mechanism besides accumulated loss over a lifetime is making them shorter. The alternative mechanism which jumps out to me: DNA damage, and oxidative damage in particular, has been observed to rapidly shorten telomeres. DNA damage and oxidative damage rates are generally observed to be much higher in aged cells of most types, so that would explain why telomeres are shorter in older organisms.
In terms of actual experiments, telomerase-boosters have been experimented with a fair bit, and my understanding is that they don’t have much effect on age-related diseases (though of course there’s the usual pile of low-N studies which find barely-significant and blatantly p-hacked results).
Other things will eventually be covered later in this sequence.
Thanks, that’s the right question to ask and some great info on it.
Yeah, to be clear, semipolitical fluff is often valuable, and I agree that that’s likely the case here. But I don’t expect LWers to find anything new or interesting in that part of the book, nor is anything interesting there about how aging works. It’s for a different audience and a different purpose.
Following the citation on wikipedia, sounds like that’s the longest one has lived in captivity. Remember, it’s not that they’re immortal, it’s just that their chance-of-dying-per-unit-time stays flat; that still implies that the number of survivors drops off exponentially over time.
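To illustrate the flat-hazard point (the rate here is made up, purely for the shape of the curve): a constant chance of dying per unit time gives a plain exponential survivor curve, so even a "non-aging" population thins out on a fixed half-life.

```python
# Constant hazard => exponential survival. Hypothetical flat rate,
# just to show the shape: "non-aging" (death chance doesn't increase
# with age) still means survivors drop off exponentially.

p_death = 0.05  # assumed flat chance of dying each year

def surviving_fraction(years: int) -> float:
    return (1 - p_death) ** years

for t in (10, 50, 100, 200):
    print(t, surviving_fraction(t))

# The half-life is fixed at log(0.5)/log(1 - p_death) ~ 13.5 years,
# no matter how old the survivors already are.
```

So a record like "longest lived in captivity" is entirely consistent with a flat hazard rate; it just reflects the exponential tail.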
Yup, read it a few months ago. Mini-review:
The core mechanisms they talk about make sense, in particular transposons as root cause, and the picture of stressors competing for sirtuin activity.
The review of aging in yeast was a highlight. That’s a great example where the mechanisms were pretty clearly nailed down, and Sinclair was one of the people who figured it out.
I’m much more skeptical of the particular treatments discussed, and especially the proposal that NAD boosters ameliorate age-related diseases by boosting sirtuin activity specifically. NAD is a fairly general-purpose molecule; there are a ton of other things it could be doing, and I don’t recall a demonstration that sirtuin activity mediates their effects. (If there is a mediation experiment, then that part is much stronger.)
Sinclair in general seems to fit the “great experimentalist, mediocre theorist” mold; the blabber about information loss and aging being epigenetic was thoroughly confused. He has a bad case of obsession-with-his-favorite-gene-class (namely the sirtuins). In this case that gene class seems to be adjacent to an actual root cause (transposons), but Sinclair himself doesn’t seem to have the conceptual framework in place to organize that knowledge.
The entire second half of the book is semipolitical fluff.
Where did you get the data for this? I’m particularly impressed with the older price series, I imagine that must have come from specialized sources...
See Highlights of Comparative and Evolutionary Aging.
I’d distinguish between “there isn’t an objectively correct answer about what the flower is” and “there isn’t an objectively correct algorithm to answer the question”. There are cases where the OP method’s answer isn’t unique, and examples of this so far (in the other comments) mostly match cases where human intuition breaks down—i.e. the things humans consider edge cases are also things the algorithm considers edge cases. So it can still be the correct algorithm, even though in some cases the “correct” answer is ambiguous.
Let’s start by setting aside the whole mind-uploading problem, and look at something more prosaic: what makes “me” at noon today the same as “me” at 3:00 am one year ago? In fact, let’s set aside what makes this true from my perspective; what makes you think these two bags of chemicals are the “same person”? When you see your mother, and then see her again later on, why do you think of her as the same person?
This is basically the same problem as Pointing to a Flower, except we’ve dragged in a bunch of new intuitions by making it about humans instead of flowers. (It’s also the same question Yudkowsky uses in his post on cryonics in the sequences, although I can’t find a link at the moment.)
The answer from the flower post does a fine job of saying what makes me the same organism as before: draw a boundary around my body in spacetime, it’s a local minimum in terms of summary data required to make predictions far away. Nontrivial, but quantifiable.
But for things like mind uploading, we want to go further than that. Sure, we could simulate my entire body, but it seems like what makes “me” doesn’t require all that. After all, I’m still me if I lose all my limbs and torso and get attached to some sci-fi life support machine. “Me” apparently does not just mean my body. In practice, I think humans use “me”, “you”, etc with several different referents depending on context, and the body is one of them. The referent relevant for mind-uploading purposes is, presumably, the mind—whatever that means.
There’s a few more steps before I’m ready to tackle the referent of “my mind”, but I’m pretty confident that the first step is basically the same as for the flower, and I’m also pretty confident that it is crisply quantifiable.
As to the connection with pattern matching, I’m pretty sure that the flower-approach is roughly equivalent to what a Bayesian learner would learn by looking for patterns in data. But that’s a post for another time.
Socratic Ducking fits the criteria; it’s a good way to get a feel for what kinds of questions someone else asks themselves.