Your initial reasoning seems sound, but your proposal looks entirely unrealistic. You can’t control the environment enough to remove sources of status to make this stick. It’s easy to create small places and award status within them, but making that status global is basically impossible unless you can control all aspects of social reality. Otherwise there are always additional avenues along which to gain status, since every person makes their own choice about how much status to grant someone, even if that choice is heavily influenced by information from other people about how much status they think people should have. Consider, by way of counterexample, that no person today has globally maximal status along all dimensions: there are just too many dimensions and too many people. Even people with high status along many dimensions, like the Queen of England, lack high status along many others, where insiders would consider them low status, say in playing competitive DOTA 2 or solving open problems in mathematics.
I’m similarly concerned about loose talk about assessing the alignment of specific humans, given there seem to be no generally agreed-upon, precise criteria by which to assess alignment.
Reading this I was reminded of something. Now, not to say rationality or EA are exactly religions, but the two function in a lot of the same ways, especially with respect to providing shared meaning and building community. And if you look at new, not-state-sponsored religions, they typically go through an early period where they are small and geographically colocated, and only later have a chance to grow: the group needs sufficient time together, or it fractures, and then we would no longer call the growth “growth” per se so much as dispersion. Consider for example Jews in the desert, English Puritans moving to North America, and Mormons settling in Utah. Counterexamples that perhaps prove the rule (because they produced different sorts of communities) include early Christians spread through the Roman empire and various missionaries in the Americas.
To me this suggests that much of the conflict people feel today about Berkeley comes from the unhappiness of being a rationalist who isn’t living in Berkeley while the rationality movement is getting itself together in preparation for later growth. Importantly, for what I think many people are concerned about, this is a necessary period that comes prior to growth, not to the exclusion of growth (not that anyone is intentionally doing this; it’s more a natural strategy communities take up under certain conditions because it seems most likely to succeed). Being a rationalist not in Berkeley right now probably feels a lot like being a Mormon not in Utah a century ago, or a Puritan who decided to stay behind in England.
Now, if you care about existential risk you might think we don’t have time to wait for the rationality community to coalesce in this way (or to wait to see if it even does!), and that’s fair but that’s a different argument than what I’ve mostly heard. And anyway none of this is necessarily what’s actually going on, but it is an interesting parallel I noticed reading this.
I don’t have a technical answer, but for what it’s worth I’ve thought about this idea and related concerns before and believe it is probably impossible. My reasoning depends on some combo of thinking about MWI, the uncomputability of rationality (cf. Hutter’s work on AIXI and related results), and the observed evidence that probably P!=NP.
I want to throw some cold water on this notion because it’s dangerously appealing. When I was doing my PhD in graph theory I had a similar feeling that graphs were everything, but this is really a property of mathematics in general. Graphs are appealing to a certain kind of thinker, but there is nothing so special about them beyond their efficacy at modeling certain things, and many isomorphic models are possible. In particular they admit helpful visualizations, but they are ultimately no more powerful (or any less powerful!) than many equivalent mathematical models. I just worry from the tone of your post that you might be overvaluing graphs, so I’d like to pass down my wisdom that they are valuable but not especially valuable in general.
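To make the “equivalent models” point concrete, here is a toy sketch (my own illustration in Python, not anything from the post; the variable names are invented for this example) showing the same small structure encoded three interchangeable ways: as an adjacency-list “graph”, as a set of ordered pairs (a binary relation), and as a 0/1 matrix. Nothing computable from one encoding is unavailable from the others.

```python
# Toy illustration: the same structure as a graph, a relation, and a matrix.
# Each encoding carries exactly the same information; conversions are lossless.

# 1. Graph view: adjacency list
adjacency = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": [],
}

# 2. Relation view: a set of ordered pairs (u, v) meaning "u points to v"
relation = {(u, v) for u, targets in adjacency.items() for v in targets}

# 3. Matrix view: a 0/1 matrix over a fixed ordering of the vertices
vertices = sorted(adjacency)
matrix = [[int((u, v) in relation) for v in vertices] for u in vertices]

# Converting the matrix back recovers the original relation exactly, so no
# information was gained or lost by switching representations.
recovered = {
    (vertices[i], vertices[j])
    for i in range(len(vertices))
    for j in range(len(vertices))
    if matrix[i][j]
}
assert recovered == relation
```

The choice between these is about convenience for the problem at hand (visualization, linear algebra, logic), not about expressive power.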
If that’s the case we’re no longer addressing alignment and are forced to fall back on weaker safety mechanisms. People are working in this direction, but alignment remains the best path until we see evidence it’s not possible.
That’s why I say in 2 that this holds all else equal. You’re right that there are competing concerns that may make philosophical conservatism untenable, and I view it as one of the goals of AI policy to make sure it remains tenable by telling us about the race conditions that would make us unable to practice philosophical conservatism.
I agree we must make some assumptions or pre-commitments and don’t expect we can avoid them. In particular there are epistemological issues that force our hands and require we make assumptions, because complete knowledge of the universe is beyond our capacity to know. I’ve talked about this idea some and I plan to revisit it as part of this work.
I’m not entirely sure either, and my best approach has been to change what we really mean by “ethics” to make the problem tractable without forcing a move to making choices about what is normative. I’ll touch on this more when I describe the package of philosophical ideas I believe we should adopt in AI safety research, so for now I’ll leave it as an example of the kind of assumption that is affected by this line of thinking.
Thanks, that’s a useful clarification of my reasoning that I did not spell out!
For what it’s worth, this is very reminiscent of a more general pattern in personal development and learning, where there is a decrease in function before an increase due to what we might think of as “update costs”.
I agree there is a sense in which AI alignment research today is alchemy, but I think we are making progress toward turning it into chemistry. That said, that point doesn’t seem very relevant to the rest of your position, which is more about how humans stay relevant in a world with beings more powerful than unaugmented humans.
dang, everyone wants to talk about this paper!
Ha, while I was typing this up it seems a post with the same title got published! https://www.lesswrong.com/posts/h9ZWrrCBgK64pAvxC/thoughts-on-ai-safety-via-debate
I think you’ll find it useful regardless of how much it relates to MIRI’s program: epistemology is foundational, and a better understanding of it is wildly useful if you have an interest in anything that comes remotely close to touching philosophical questions. In fact, my own take on most existing AI safety research is that it doesn’t sufficiently address foundational questions of epistemology, choosing instead to make certain implicit, strong assumptions about the discoverability of truth; as a result you can add a lot of value by more carefully questioning how we know what we think we know as it relates to solving AI safety issues.
On point (e) I know people have written before about how there are many Newcomb-like problems, but do we have any sense of just how many decision problems are enough like Newcomb that this is likely to be an issue? This whole issue seems troubling to me (as you suggest) unless Newcomb-like problems are not the norm, even if they feel like the norm to people worried about solving decision problems.
Thanks for explaining about the umwelt! I was unfamiliar with the term but it does a great job of giving a way of talking about what we might otherwise think of as the intersubjective from a biological perspective and ties philosophical concepts in with the example of how they have been implemented by life on Earth.