“create an AI that minimizes the expected amount of astronomical waste”
Of course, this is still just a proxy measure… say that we’re “in a simulation”, or that there are already superintelligences in our environment who won’t let us eat the stars, or something like that—we still want to get as good a bargaining position as we possibly can, or to coordinate with the watchers as well as we possibly can, or in a more fundamental sense we want to not waste any of our potential, which I think is the real driving intuition here. (Further clarifying and expanding on that intuition might be very valuable, both for polemical reasons and for organizing some thoughts on AI strategy.) I cynically suspect that the stars aren’t out there for us to eat, but that we can still gain a lot of leverage over the acausal fanfic-writing commun… er, superintelligence-centered economy/ecology, and so, optimizing the hell out of the AGI that might become an important bargaining piece and/or plot point is still the most important thing for humans to do.
Metaphilosophical AI
The thing I’ve seen that looks closest to white-box metaphilosophical AI in the existing literature is Eliezer’s causal validity semantics, or more precisely the set of intuitions Eliezer was drawing on to come up with the idea of causal validity semantics. I would recommend reading the section “Story of a Blob” and the sections on causal validity semantics in Creating Friendly AI. Note that philosophical intuitions are a fuzzily bordered subset of justification-bearing (i.e., both moral/values-like and epistemic) causes that are in principle formally identifiable and are traditionally thought of as having a coherent, lawful structure.
we still want to get as good a bargaining position as we possibly can, or to coordinate with the watchers as well as we possibly can, or in a more fundamental sense we want to not waste any of our potential, which I think is the real driving intuition here
It seems that we have more morally important potential in some possible worlds than others. We don’t want our language to commit us to the view that we only have morally important potential in possible worlds where we can prevent astronomical waste, but neither do we want to suggest (as I think “not waste any of our potential” does) that we have the same morally important potential everywhere and should just minimize the expected fraction of our potential that is wasted. A more neutral way of framing things could be “minimize wasted potential, especially if the potential is astronomical”, leaving the strength of the “especially” to be specified by theories of how much one can affect the world from base reality vs. simulations and zoos, theories of how to deal with moral uncertainty, and so on.
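To make that contrast a bit more explicit (illustrative notation only, not anything proposed in the thread): write $P(w)$ for our morally important potential in possible world $w$ and $A(w)$ for the potential we actually realize there. Then “minimize the expected fraction of our potential that is wasted” is roughly

$$\min \; \mathbb{E}_w\!\left[\frac{P(w)-A(w)}{P(w)}\right],$$

whereas the more neutral “minimize wasted potential, especially if the potential is astronomical” weights worlds by how much potential they contain,

$$\min \; \mathbb{E}_w\!\left[\alpha\big(P(w)\big)\,\big(P(w)-A(w)\big)\right],$$

where $\alpha$ is an increasing weighting function whose steepness (the strength of the “especially”) would be set by one’s theories about base reality vs. simulations and zoos, moral uncertainty, and so on.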
I completely understand your intuition but don’t entirely agree; this comment might seem like quibbling. Having access to astronomical resources is one way to have a huge good impact, but I’m not sure we know enough about moral philosophy, or even about what an acausal economy/ecology might look like, to be sure that the difference between a non-astronomical possible world and an astronomical possible world is really huge. (For what it’s worth, my primary intuition here is “the multiverse is more good-decision-theory-limited/insight-limited than resource-limited”. I’d like to expand on this in a blog post or something later.) Obviously we should provisionally assume that the difference is huge, but I can see non-fuzzy lines of reasoning that suggest it might not be.
Because we might be wrong about the relative utility of non-astronomical possible worlds, it seems that when describing our fundamental driving motivations we should choose language that is as agnostic as possible, in order to have a strong conceptual foundation that isn’t too contingent on our provisional best-guess models. Take, for example, the principle of decision theory that says we should focus more on worlds that plausibly seem much larger even if it might be less probable that we’re in those worlds: the underlying, non-conclusion-contingent reasons that drive us to take such considerations and perspectives into account are the things we should be putting effort into explaining to others and making clear to ourselves.
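As a toy illustration of that principle (made-up numbers, just to show the shape of the reasoning): suppose we assign probability $0.9$ to being in a world where the stakes are $S$ and probability $0.1$ to being in a world where the stakes are $10^{6} S$. Then the larger world contributes

$$0.1 \times 10^{6} S = 10^{5} S \;\gg\; 0.9 \times S$$

to the expected value, dominating the calculation by a factor of roughly $10^{5}$ even though it is nine times less probable. The underlying question is why (and how far) we should trust this style of expected-value reasoning, and that is the sort of thing worth explaining carefully.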
Of course, this is still just a proxy measure… say that we’re “in a simulation”, or that there are already superintelligences in our environment who won’t let us eat the stars, or something like that—we still want to get as good a bargaining position as we possibly can, or to coordinate with the watchers as well as we possibly can, or in a more fundamental sense we want to not waste any of our potential, which I think is the real driving intuition here.
Agreed. I was being lazy and using “astronomical waste” as a pointer to this more general concept, probably because I was primed by people talking about “astronomical waste” a bunch recently.
Further clarifying and expanding on that intuition might be very valuable, both for polemical reasons and for organizing some thoughts on AI strategy.
Also agreed, but I currently don’t have much to add to what’s already been said on this topic.
The thing I’ve seen that looks closest to white-box metaphilosophical AI in the existing literature is Eliezer’s causal validity semantics, or more precisely the set of intuitions Eliezer was drawing on to come up with the idea of causal validity semantics.
Ugh, I found CFAI largely impenetrable when I first read it, and have the same reaction reading it now. Can you try translating the section into “modern” LW language?
CFAI is deprecated for a reason; I can’t read it either.