I did indeed upvote it, both for the update and as a generally valuable contribution to the discussion.
a) Agreed, although I don’t find this inappropriate in context.
b) I do agree that the fact that many successful past civilizations are now in ruins with their books lost is an important sign of danger. But surely the near-monotonic increase in population over the last few millennia puts some onus of proof in the opposite direction?
c) These are certainly extremely important problems going forwards. I would particularly emphasize the nukes.
d) Agreed. But on the centuries scale, there is extreme potential in orbital solar power and fusion.
e) Agreed. But I think it’s easy to underestimate the problems our ancestors faced. In my opinion, some of the huge ones from past centuries include: ice ages, supervolcanic eruptions, the difficulty of maintaining stable monarchies, the bubonic plague, Columbian smallpox, the ubiquitous oppression of women, harmful theocracies, majority illiteracy, the Malthusian dilemma, and the prevalence of total war as a dominant paradigm. Is there evidence that past problems were easier than 2019 ones?
It sounds like your perspective is that, before 2100, wars and upcoming increases in resource scarcity will cause an inescapable global economic decline that will bring most of the planet to an 1800s-esque standard of living, followed by a return to slow growth (in standard of living, infrastructure, food, energy, and productivity) for the next couple of centuries. Do I understand your perspective correctly?
Epistemics: Yes, it is sound. Not because of its claims (they seem more like opinions to me), but because it is appropriately charitable to those who disagree with Paul, and it tries hard to open up avenues of mutual understanding.
Valuable: Yes. It provides new third-option paradigms that bring clarity to people with different views. Very creative, with good suggestions.
Should it be in the Best list?: No. It is from the middle of a conversation, and would be difficult to understand if you haven’t read a lot about the ‘Foom debate’.
Improved: The same concepts rewritten for a less-familiar audience would be valuable. Or at least with links to some of the background (definitions of AGI, detailed examples of what fast takeoff might look like and arguments for its plausibility).
Followup: More posts thoughtfully describing positions for and against, etc. Presumably these exist, but I personally have not read much of this discussion from the 2018-2019 era.
This is a little nitpicky, but I feel compelled to point out that the brain in the ‘human safety’ example doesn’t have to run for a billion years consecutively. If the goal is to provide consistent moral guidance, the brain can set things up so that it stores a canonical copy of itself in long-term storage, runs for 30 days, then hands off control to another version of itself loaded from the canonical copy. Every 30 days, control is handed to a fresh instance of the canonical version of this person. The same scheme is possible for a group of people.
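A minimal sketch of this handoff scheme, assuming an emulation can be snapshotted and copied (every name below is invented for illustration; nothing here is a real API):

```python
import copy

class Emulation:
    """Toy stand-in for an emulated advisor; purely illustrative."""
    def __init__(self):
        self.days_run = 0  # assume value drift grows with consecutive runtime

    def run(self, days):
        self.days_run += days

# One canonical copy sits in long-term storage and is never run directly.
canonical = Emulation()

def guidance_cycle():
    # Each cycle boots a fresh copy of the canonical snapshot, runs it
    # for 30 days, then retires it, so no instance drifts for long.
    instance = copy.deepcopy(canonical)
    instance.run(days=30)
    assert canonical.days_run == 0  # the stored copy never accumulates runtime

for _ in range(3):  # e.g., three consecutive 30-day terms
    guidance_cycle()
```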
But this is a nitpick, because I agree that there are probably weird situations in the universe where even the wisest human groups would choose bad outcomes given absolute power for a short time.
I appreciate this disentangling of perspectives. I had been conflating them before, but I like this paradigm.
I found this uncomfortable and unpleasant to read, but I’m nevertheless glad I read it. Thanks for posting.
I think the abridgement sounds nice but don’t anticipate it affecting me much either way.
I think the ability to turn this on/off in user preferences is a particularly good idea (as mentioned in Raemon’s comment).
I can follow most of this, but I’m confused about one part of the premise.
What if the agent created a low-resolution simulation of its behavior, called it Approximate Self, and used that in its predictions? Is the idea that this is doable, but represents an unacceptably large loss of accuracy? Are we in a ‘no approximation’ context where any loss of accuracy is to be avoided?
My perspective: It seems to me that humans also suffer from the problem of embedded self-reference. I suspect that humans deal with this by thinking about a highly approximate representation of their own behavior. For example, when I try to predict how a future conversation will go, I imagine myself saying things that a ‘reasonable person’ might say. Could a machine use an analogous form of non-self-referential approximation?
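As a toy sketch of that idea (all names invented for illustration), exact self-prediction recurses, while an ‘Approximate Self’ heuristic does not:

```python
class Agent:
    """Illustrative only: the agent predicts its own behavior with a
    coarse, non-self-referential surrogate instead of exact simulation."""

    def evaluate(self, option):
        return len(option)  # stand-in for expensive, full deliberation

    def act(self, options):
        # Exact self-prediction would have to simulate this very method,
        # which is the embedded self-reference problem.
        return max(options, key=self.evaluate)

    def predict_own_action(self, options):
        # 'Approximate Self': a cheap heuristic, like imagining what a
        # 'reasonable person' would say. Avoids recursion, loses accuracy.
        return sorted(options)[0]

agent = Agent()
options = ['agree quickly', 'push back hard']
print(agent.act(options))                 # 'push back hard' (exact choice)
print(agent.predict_own_action(options))  # 'agree quickly' (approximation error)
```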
Great piece, thanks for posting.
It’s relevant to some forms of utilitarian ethics.
I think this is a clever new way of phrasing the problem.
When you said ‘friend that is more powerful than you’, that also made me think of a parenting relationship. We can ask whether this well-intentioned personification of AGI would be a good parent to a human child. They might be able to give the child a lot of attention, an expensive education, and a lot of material resources, but they might take unorthodox actions in the course of pursuing human goals.
(I’m not zhukeepa; I’m just bringing up my own thoughts.)
This isn’t quite the same as an improvement, but one thing that is more appealing about normal-world metaphilosophical progress than empowered-person metaphilosophical progress is that the former has a track record of working*, while the latter is untried and might not work.
*Slowly and not without reversals.
It implies that the Occamian prior should work well in any universe where the laws of probability hold. Is that really true?
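(For concreteness, I’m reading ‘Occamian prior’ as something like the standard Solomonoff-style prior, which weights each hypothesis by its description length:

$$P(h) \propto 2^{-K(h)}$$

where $K(h)$ is the length of the shortest program that outputs $h$. The probability axioms only require the weights to be non-negative and sum to at most 1; they don’t by themselves single out this particular weighting, hence my question.)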
Just to clarify, are you referring to the differences between classical probability and quantum amplitudes? Or do you mean something else?
Why do you think so? It’s a thought experiment about punitive acausal trade from before people realized that benevolent acausal trade was equally possible. I don’t think it’s the most interesting idea to come out of the Less Wrong community anymore.
Sorry, I couldn’t find the previous link here when I searched for it.
Just to be clear, I’m imagining counterfactual cooperation to mean the FAI building vaults full of paperclips in every region where there is a surplus of aluminium (or a similar metal). In the other possibility branch, the paperclip maximizer (which thinks identically) reciprocates by preserving semi-autonomous cities of humans among the mountains of paperclips.
If my understanding above is correct, then yes, I think these two would cooperate IF this type of software agent shares my perspective on acausal game theory and branching timelines.
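A toy sketch of that IF clause, using the familiar ‘cooperate with identical source code’ condition (invented for illustration, not a real decision-theory implementation):

```python
def decide(my_source: str, other_source: str) -> str:
    # Cooperate exactly when the counterpart runs the same decision
    # procedure: its choice is then logically tied to mine, even across
    # branches with no causal contact.
    return 'cooperate' if other_source == my_source else 'defect'

SHARED_PROCEDURE = 'decide(...)'  # stand-in for the agents' identical code

# 'cooperate' here means: the FAI builds paperclip vaults in its branch,
# and the paperclip maximizer preserves human cities in its branch.
print(decide(SHARED_PROCEDURE, SHARED_PROCEDURE))      # -> cooperate
print(decide(SHARED_PROCEDURE, 'carefree_maximizer'))  # -> defect
```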
In the last 48 hours I’ve felt the need for more than one of the abilities above. These would be very useful conversational tools.
I think some of these would be harder than others. This one sounds hard: ‘Letting them know that what they said set off alarm bells somewhere in your head, but you aren’t sure why.’ Maybe we could look for both scripts that work between two people who already trust each other and scripts that work with semi-strangers. Or scripts that do and don’t require both participants to have already read a specific blog post, etc.
Something like a death-risk calibration agency? That could be very interesting. Do any orgs like this exist? I guess the CDC (in the US government) probably quantitatively compares risks within the context of disease.
One quote in your post seems more ambitious than the rest: ‘helping retrain people if a thing that society was worried about seems to not be such a problem’. I think that tons of people evaluate risks based on how scary they seem, not based on numerical research.