Speedup on evolution? Maybe? It might work okay-ish, but I doubt the best solution is that speculative.
It seems to me that in a Trivial Decision Theory Problem, the list of stories is generated but then one of the stories hogs all the pros with everything else getting all the cons.
That wasn’t how I defined it. I defined it as a decision theory problem with literally one option.
The “screw this” option is available when we don’t insist that an agent is actually in a situation, just that a situation be simulated.
I use the term Trivial Decision Theory Problem to refer to circumstances when an agent can only make one decision.
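To put this slightly more formally (just my own sketch of the definition, nothing more official):

```latex
% A decision problem as a set of options O with a utility function u over them.
% The problem is "trivial" precisely when there is only one option available.
\[
  D = (O, u), \qquad u : O \to \mathbb{R}, \qquad
  D \text{ is trivial} \iff |O| = 1
\]
```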
I guess some of the decision theories could have different rationales for how and why it would be prudent to produce particular stories.
Yeah, my approach is to note that the stories are the result of intuitions and/or instincts that are evolved, and so they aren’t purely arbitrary.
As in, you could score some actions, but then there isn’t a sense in which you “can” choose one according to any criterion.
I’ve noticed that issue as well. Counterfactuals are more a convenient model/story than something to be taken literally. You’ve grounded decisions by taking counterfactuals to exist a priori. I ground them by noting that our desire to construct counterfactuals is ultimately based on evolved instincts and/or behaviours, so these stories aren’t just arbitrary stories, but a way in which we can leverage the lessons that have been instilled in us by evolution. I’m curious: given this explanation, why do we still need choices to be actual?
That works if we have the counterfactuals/stories, but how do we determine what these should be? Assuming we reject modal realism, they don’t directly correspond to anything real, so what should they be?
I commented directly on your post.
Let A be some action. Consider the statement: “I will take action A”. An agent believing this statement may falsify it by taking any action B not equal to A. Therefore, this statement does not hold as a law. It may be falsified at will.
If you believe in determinism, then an agent can sometimes falsify it and sometimes not.
See footnote 3: “Quantum mechanics only shifts us from the state of the world being deterministic, to the probability distribution being deterministic. It doesn’t provide scope for free will, so it doesn’t avoid the ontological shift.”
There may not be an open round, as we may find connections through our networks. However, if there is, we will post it on the forum.
I’m actually about to announce an AI Safety microgrant initiative for people who are looking to commit at least $1,000 USD for every year they choose to be involved. The post will be out in the next few days; let me know if you want me to link you when it’s ready.
Yes, I think that there is a time and place for these two stances toward agents.
Agreed. The core lesson for me is that you can’t mix and match—you need to clearly separate out when you are using one stance or another.
I don’t especially care.
I can understand this perspective, but if there’s a relatively accessible way of explaining why this (or something similar to this) isn’t self-defeating, then maybe we should go with that?
Is naive thinking about the troll bridge problem a counterexample to this? There, the counterfactual stems from a contradiction.
I don’t quite get your point. Any chance you could clarify? Like, sure, we can construct counterfactuals within an inconsistent system, and sometimes this may even be a nifty trick for getting the right answer if we can avoid the inconsistency messing us up, but outside of this, why is it something we should care about?
I think that no general type of decision theory worth two cents always does recommend itself.
Good point, now that you’ve said it I have to agree that I was too quick to assume that the outside-of-the-universe decision theory should be the same as the inside-of-the-universe decision theory.
Thinking this through, if we use CDT as our outside decision theory to pick an inside decision theory, then we need to be able to justify why we were using CDT, and the same would apply if we were to use any other decision theory.
One thing I’ve just realised is that we don’t actually have to use CDT, EDT or FDT to make our decision. Since there’s no past for the meta-decider, we can just use our naive decision theory which ignores the past altogether. And we can justify this choice based on the fact that we are reasoning from where we are. This seems like it would avoid the recursion.
Except I don’t actually buy this, as we need to be able to provide a justification of why we would care about the result of a meta-decider outside of the universe when we know that isn’t the real scenario. I guess what we’re doing is making an analogy with inside the universe situations where we can set the source code of a robot before it goes and does some stuff. And we’re noting that a robot probably has a good algorithm if its code matches what a decider would choose if they had to be prepared for a wide variety of circumstances and then trying to apply this more broadly.
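As a rough sketch of what I mean by the robot analogy (a toy framing of my own, with hypothetical names), the robot’s code looks “good” if it is the policy a decider would pick to perform well in expectation across the circumstances it might face:

```python
# Toy sketch: the "good" source code is whichever policy does best on average
# across the variety of circumstances the robot might be dropped into.
# `payoff(policy, circumstance)` is a hypothetical function giving the utility
# of running `policy` in `circumstance`.

def pick_source_code(policies, circumstances, payoff):
    """Return the policy with the highest average payoff over circumstances."""
    def average(policy):
        return sum(payoff(policy, c) for c in circumstances) / len(circumstances)
    return max(policies, key=average)
```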
I don’t think I’ve got this precise yet, but I guess the key point is that this model doesn’t appear out of thin air: it has a justification, and that justification involves a decision, and hence some kind of decision theory where the actual decision is inside of the universe. So there is, after all, a reason to want the inside and outside theories to match up.
I guess we seem to differ on whether CDT was dealt a bad hand vs. played it badly. CDT, as usually argued for, doesn’t seem to engage with the artificial nature of counterfactuals, and I suspect that when you engage with this consideration it won’t lead you to CDT.
Questions in decision theory are not questions about what choices you should make with some sort of unpredictable free will. They are questions about what type of source code you should be running.
This seems like a reasonable hypothesis, but I have to point out that there’s something rather strange in imagining a situation where we make a decision outside of the universe; to use CFAR’s term, I think we should boggle at this. Indeed, I agree that if we accept the notion of a meta-decision theory, then FDT does not invoke backwards causation (elegant explanation btw!).
Comparing this to my explanation, we both seem to agree that there are two separate views: in your terms, an “object” view and an “agent” view. I guess my explanation is based upon the “agent” view being artificial, and it is more general in that I avoid making too many assumptions about what exactly a decision is, while your view takes on an additional assumption (that we should model decisions in a meta-causal way) in exchange for being more concrete and easier to grasp/explain.
With your explanation, however, I do think you glossed over this point too quickly, as it isn’t completely clear what’s going on there or why it makes sense:
FDT is actually just what happens when you use causal decision theory to select what type of source code you want to enter a Newcombian game with
There’s a sense in which this is self-defeating b/c if CDT implies that you should pre-commit to FDT, then why do you care what CDT recommends as it appears to have undermined itself?
My answer is that even though it appears this way, I don’t actually think it is self-defeating and this becomes clear when we consider this as a process of engaging in reflective equilibrium until our views are consistent. CDT doesn’t recommend itself, but FDT does, so this process leads us to replace our initial starting assumption of CDT with FDT.
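For concreteness, here’s a minimal sketch of the quoted idea with toy numbers of my own (not anything from the original post): at the source-code-selection level, even a straightforward expected-value calculation favours committing to the one-boxing, FDT-like policy.

```python
# Toy Newcomb setup: a predictor with accuracy p fills the opaque box based on
# its prediction of which policy (source code) the agent will be running.

def expected_payoff(policy: str, accuracy: float = 0.99) -> float:
    """Expected payoff of entering the game already committed to `policy`."""
    big, small = 1_000_000, 1_000
    if policy == "one-box":
        # Predictor foresees one-boxing with probability `accuracy`,
        # so the opaque box is usually full.
        return accuracy * big
    # Predictor foresees two-boxing with probability `accuracy`, so the opaque
    # box is usually empty; occasionally it's wrong and we get both prizes.
    return accuracy * small + (1 - accuracy) * (big + small)

for policy in ("one-box", "two-box"):
    print(policy, expected_payoff(policy))
# one-box ~990,000 vs two-box ~11,000: selecting source code in advance picks
# the one-boxing policy, even though CDT applied "inside" the game would still
# take both boxes.
```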
In other words, we’re engaging in a form of circular epistemology as described here. We aren’t trying to get from the View from Nowhere to a model of counterfactuals (to prove everything a priori like Descartes); instead, all we can do is start off with some notions, beliefs or intuitions about counterfactuals and then make them consistent. I guess I find it useful to make these mechanics explicit.
In particular, by making this move it seems as though, at least on the face of it, we are embracing the notion that counterfactuals only make sense from within themselves.

I’m not claiming at this stage that it is in fact correct to shift from CDT to FDT as part of the process of reflective equilibrium, as it is possible to resolve inconsistencies in a different order, with different assumptions held fixed, but this is plausibly the correct way to proceed. I guess the next step would be to map out the various intuitions that we have about how to handle these kinds of situations and then figure out whether there are any other possible ways of resolving the inconsistency.
I think it’s quite clear how shifting ontologies could break a specification of values. And sometimes you just need a formalisation, any formalisation, to play around with. But I suppose it depends more on the specific details of your investigation.
Yeah, I had a similar thought with capping both the utility and the percent chance, but maybe capping expected utility is better. Then again, maybe we’ve just reproduced quantilization.
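In case it helps to see the contrast concretely, here’s a rough toy sketch (my own construction, and it assumes “quantilization” in the sense of sampling from the top fraction of actions under some base distribution rather than maximizing outright):

```python
import random

def capped_choice(actions, utility, cap):
    """Maximize utility after clipping it at `cap` (the hard-cap approach)."""
    return max(actions, key=lambda a: min(utility(a), cap))

def quantilizer_choice(actions, utility, q=0.1):
    """Sample uniformly from the top q fraction of actions by utility,
    rather than taking the single best one."""
    ranked = sorted(actions, key=utility, reverse=True)
    top = ranked[: max(1, int(len(ranked) * q))]
    return random.choice(top)
```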
One thing that’s worth sharing is that if it’s connected to the internet it’ll be able to spread a bunch of copies and these copies can pursue independent plans. Some copies may be pursuing plans that are intentionally designed as distractions.
I strongly disagree with your notion of how privileging the hypothesis works. It’s not absurd to think that techniques for making AIXI-tl value diamonds despite ontological shifts could be adapted for other architectures. I agree that there are other examples of people working on solving problems within a formalisation that seem rather formalisation specific, but you seem to have cast the net too wide.
I tend to agree that burning up the timeline is highly costly, but more because Effective Altruism is an Idea Machine that has only recently started to really crank up. There’s a lot of effort being directed towards recruiting top students from uni groups, but these projects require time to pay off.
I’m giving this example not to say “everyone should go do agent-foundations-y work exclusively now!”. I think it’s a neglected set of research directions that deserves far more effort, but I’m far too pessimistic about it to want humanity to put all its eggs in that basket.
If it is the case that more people should go into Agent Foundations research then perhaps MIRI should do more to enable it?
I’ll be in the Bay area from Monday 25th to Sunday 31st as I’m attending EA Global.