Yeah, I was implicitly assuming that initiating a successor agent would force Omega to update its predictions about the new agent. As you say, that’s actually not very relevant, because it’s a property of a specific decision problem rather than CDT or son-of-CDT.
(I apologize in advance if this is too far afield of the intended purpose of this post)
How does the claim that “group agents require membranes” interact with the widespread support for dramatically reducing or eliminating restrictions on immigration (“open borders” for short) within the EA/LW community? I can think of several possibilities, but I’m not sure which is true:
There actually isn’t much support for open borders
Open borders supporters believe that “group agents require membranes” is a reasonable generalization, but borders are not a relevant kind of “membrane”, or nations are not “group agents” in the relevant sense
The people who support open borders generally aren’t the same people who are thinking about group agency at all
Open borders supporters have thought about group agency and concluded that “group agents require membranes” is not a reasonable generalization
Open borders supporters believe that there is no need for nations to have group agency
Something else I haven’t thought of
Context: I have an intuition that reduced/eliminated immigration restrictions reduce global coordination, and this post helped me crystallize it (if nations have less group agency, it’s harder to coordinate)
Would trying to become less confused about commitment races before building a superintelligent AI count as a metaphilosophical approach or a decision theoretic one (or neither)? I’m not sure I understand the dividing line between the two.
if you’re interested in anything in particular, I’ll be happy to answer.
I very much appreciate the offer! I can’t think of anything specific, though; the comments of yours that I find most valuable tend to be “unknown unknowns” that suggest a hypothesis I wouldn’t previously have been able to articulate.
Have you written anything like “cousin_it’s life advice”? I often find your comments extremely insightful in a way that combines the best of LW ideas with wisdom from other areas, and would love to read more.
The prior probability ratio is 1:99, and the likelihood ratio is 20:1, so the posterior probability is 120:991 = 20:99, so you have probability of 20/(20+99) of having breast cancer.
What does “120:991” mean here?
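For what it’s worth, the quoted arithmetic can be checked directly in odds form; plausibly “120:991” is the component-wise products 1·20 and 99·1 run together. A minimal sketch (the variable names are mine):

```python
# Odds-form Bayes: posterior odds = prior odds x likelihood ratio, component-wise.
prior = (1, 99)        # prior odds of cancer : no cancer
likelihood = (20, 1)   # likelihood ratio from the positive test

posterior = (prior[0] * likelihood[0], prior[1] * likelihood[1])  # (20, 99)
probability = posterior[0] / (posterior[0] + posterior[1])        # 20 / 119
print(posterior, round(probability, 3))  # (20, 99) 0.168
```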
After thinking about it some more, I don’t think this is true.
A concrete example: Let’s say there’s a CDT paperclip maximizer in an environment with Newcomb-like problems that’s deciding between 3 options.
1. Don’t hand control to any successor
2. Hand off control to a “LDT about correlations formed after 7am, CDT about correlations formed before 7am” successor
3. Hand off control to an LDT successor.
My understanding is that the CDT agent would take the choice that causes the highest number of paperclips to be created (in expectation). If both successors are aligned with the CDT agent, I would expect the CDT agent to choose option #3. The LDT successor agent would be able to gain more resources (and thus create more paperclips) than the other two possible agents, when faced with a Newcomb-like problem with correlations formed before the succession time. The CDT agent can cause this outcome to happen if and only if it chooses option #3.
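The expected-value comparison above can be sketched with toy numbers. Everything here is an illustrative assumption of mine, not part of the original argument: I assume a single Newcomb-like problem whose predictor formed its prediction (filled the boxes) before the 8am succession time, and that the prediction accurately tracks which successor ends up in control.

```python
# Toy expected-paperclip comparison of the three hand-off options.
# Assumed payoffs: 1,000,000 paperclips in the opaque box if one-boxing
# was predicted, 1,000 in the transparent box. All numbers are made up.
NEWCOMB_BIG = 1_000_000
NEWCOMB_SMALL = 1_000

def expected_paperclips(successor: str) -> int:
    """EV on a Newcomb problem whose correlation formed before succession."""
    if successor in ("none", "ldt-after-7am-only"):
        # Both of these agents treat pre-existing correlations causally,
        # so they two-box; an accurate predictor left the big box empty.
        return NEWCOMB_SMALL
    if successor == "ldt":
        # A full LDT successor one-boxes, and the predictor foresaw that.
        return NEWCOMB_BIG
    raise ValueError(f"unknown successor: {successor}")

options = {opt: expected_paperclips(opt)
           for opt in ("none", "ldt-after-7am-only", "ldt")}
best = max(options, key=options.get)
print(options, best)  # option 3 (full LDT) wins under these assumptions
```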
I’m not at all sure that son-of-CDT resembles any known logical decision theory, but I don’t see why it would resemble “LDT about correlations formed after 7am, CDT about correlations formed before 7am”.
Edit: I agree that a CDT agent will never agree to precommit to acting like an LDT agent for correlations that have already been created, but I don’t think that determines what kind of successor agent they would choose to create.
That makes sense to me, but unfortunately I’m no closer to understanding the quoted passage. Some specific confusions:
What’s the link between death rate and time preference? My best guess is that declining life expectancy implies scarcity, but I also don’t get the link between scarcity and time preference. My best guess there is that high time preference means people don’t put in the work to ensure sufficient future productive capacity, but that doesn’t help me understand the quote, so I think I’m missing something.
I get why emergency mobilization increases time preference, but not why high time preference is strong evidence of emergency mobilization (as opposed to other possible explanations)
Can someone explain/point me to useful resources to understand the idea of time preference as expressed in this post? In particular, I’m struggling to understand these sentences:
This suggests that near the center time preference has increased to the point where we’re creating scarcity faster than we’re alleviating it, while at the periphery scarcity is still actually being alleviated because there’s enough scarcity to go around, or perhaps marginal areas do not suffer so much from total mobilization.
I also don’t understand why having an internal rate of return of 10% is evidence that we’re in an emergency state of mobilization (relative to the hypothesis that managers are poorly incentivized to do long-term planning for other reasons).
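To make the 10% figure concrete (my own illustration, not the post’s): a 10% internal rate of return means payoffs shrink geometrically with distance in time, which is why such a hurdle rate effectively truncates planning horizons.

```python
# How a 10% discount rate weights future payoffs: 1 / (1 + r)^t.
rate = 0.10
for years in (1, 5, 10, 20, 30):
    weight = 1 / (1 + rate) ** years
    print(f"{years:2d} years out: payoff weighted at {weight:.2f}")
# A payoff 10 years out counts for ~0.39 of its face value;
# 30 years out, only ~0.06.
```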
I think quantitative easing is an example (if I understood the post correctly, which I’m not sure about). By buying up bonds, the government puts more dollars into the economy, which reduces the “amount of stuff produced per dollar”, thus creating scarcity (in other words, QE increases aggregate demand). To alleviate this pressure, people make more stuff to meet the excess demand (i.e., unemployment rates go down). Forcing the unemployment rate down is the same as “requiring almost everyone to do things”.
Maybe the claim that climate scientists are liars? I don’t know if it’s true, but if I knew it were false I’d definitely downvote the post...
I understand that, but I don’t see why #2 is likely to be achievable. Corrigibility seems very similar to Wei Dai’s translation example, so it seems like there could be many deceptive actions that humans would intuitively recognize as not corrigible, but which would fool an early-stage LBO tree into assigning a high reward. This seems like it would be a clear example of “giving a behaviour a high reward because it is bad”. Unfortunately I can’t think of any good examples, so my intuition may simply be mistaken.
Incidentally, it seems like Ought could feasibly test whether meta-execution is sufficient to ensure corrigibility; for example, a malicious expert could recommend deceptive/influence-seizing actions to an agent in a simulated environment, and the meta-execution tree would have to detect every deceptive action without any contextual knowledge. Are there any plans to do this?
That makes sense; so it’s a general method that’s applicable whenever the bandwidth is too low for an individual agent to construct the relevant ontology?
plus maybe other properties
That makes sense; I hadn’t thought of the possibility that a security failure in the HBO tree might be acceptable in this context. OTOH, if there’s an input that corrupts the HBO tree, isn’t it possible that the corrupted tree could output a supposed “LBO overseer” that embeds the malicious input and corrupts us when we try to verify it? If the HBO tree is insecure, it seems like a manual process that verifies its output must be insecure as well.
I don’t understand the argument that a speed prior wouldn’t work: wouldn’t the abstract reasoner still have to simulate the aliens in order to know what output to read from the zoo earths? I don’t understand how “simulate a zoo earth with a bitstream that is controlled by aliens in a certain way” would ever get a higher prior weight than “simulate an earth that never gets controlled by aliens”. Is the idea that each possible zoo earth with simple-to-describe aliens has a relatively similar prior weight to the real earth, so they collectively have a much higher prior weight?
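The “collectively much higher prior weight” idea at the end of my question can be checked with toy numbers (all made up by me): under a 2^-length prior, many slightly longer hypotheses can together outweigh one shorter hypothesis.

```python
# Toy check: one short "plain earth" program vs. many slightly longer
# "zoo earth with simple aliens" programs under a 2^-length prior.
# All lengths and counts are illustrative assumptions.
L = 100                     # description length of plain earth, in bits
extra = 10                  # extra bits to specify simple-to-describe aliens
n_alien_variants = 2 ** 15  # number of distinct simple alien hypotheses

plain_weight = 2.0 ** -L
zoo_weight_total = n_alien_variants * 2.0 ** -(L + extra)
# 2^15 * 2^-110 = 2^-95 > 2^-100, so the zoo hypotheses win collectively.
print(zoo_weight_total > plain_weight)  # True
```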
I think it’s likely that these markets would quickly converge to better predictions than existing political prediction markets
Why would you expect this to be true? I (and presumably many others) spend a lot of time researching questions on existing political prediction markets because I can win large sums ($1k+ per question) doing so. I don’t see why anyone would have an incentive to put in a similar amount of time to win Internet Points, and as a result I don’t see why these markets would outperform existing political prediction markets. Is the idea that many people contributing a minimally-informed opinion will lead to more efficient results than a few people contributing a well-informed opinion + a bunch of noise traders?
Is there any information on how von Neumann came to believe Catholicism was the correct religion for Pascal’s Wager purposes? “My wife is Catholic” doesn’t seem like very strong evidence...
How do you ensure that property #3 is satisfied in the early stages of the amplification process? Since no agent in the tree will have context, and the entire system isn’t very powerful yet, it seems like there could easily be inputs that would naively generate a high reward “by being bad”, which the overseer couldn’t detect.