Most of my posts and comments are about AI and alignment. Posts I’m most proud of, which also provide a good introduction to my worldview:
Without a trajectory change, the development of AGI is likely to go badly
Steering systems, and a follow-up on corrigibility.
I also created Forum Karma, and wrote a longer self-introduction here.
PMs and private feedback are always welcome.
NOTE: I am not Max Harms, author of Crystal Society. I’d prefer for now that my LW postings not be attached to my full name when people Google me for other reasons, but you can PM me here or on Discord (m4xed) if you want to know who I am.
Do “subagents” in this paragraph refer to different people, or different reasoning modes / perspectives within a single person? (I think it’s the latter, since otherwise they would just be “agents” rather than subagents.)
Either way, I think this is a neat way of modeling disagreement and reasoning processes, but for me it leads to a different conclusion on the object-level question of AI doom.
A big part of why I find Eliezer’s arguments about AI compelling is that they cohere with my own understanding of diverse subjects (economics, biology, engineering, philosophy, etc.) that are not directly related to AI—my subagents for these fields are convinced and in agreement.
Conversely, I find many of the strongest skeptical arguments about AI doom to be unconvincing precisely because they seem overly reliant on a “current-paradigm ML subagent” that their proponents feel should be dominant, or at least more heavily weighted than I think is justified.
This might be true and useful for getting some kind of initial outside-view estimate, but I think you need a weighting rule to make this work as a reasoning strategy even at a meta level. Otherwise, aren’t you vulnerable to other people inventing lots of new frames and disciplines? I think the answer in geometric rationality terms is that some subagents will perform poorly and quickly lose their Nash bargaining resources, and their contribution to future decision-making / conclusion-drawing will then be down-weighted. But I don’t think the only way for a subagent to “perform” for the purposes of deciding on a weight is by making externally legible advance predictions.
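To make the down-weighting idea concrete, here’s a minimal toy sketch, assuming each subagent reports a probability for a binary question and that “bargaining resources” are tracked as normalized weights updated multiplicatively by how much probability each subagent put on what actually happened. The subagent names, the update rule, and the scoring-by-likelihood choice are all my own illustrative assumptions, not anything taken from the geometric rationality posts:

```python
# Toy sketch: subagents as weighted forecasters whose "bargaining resources"
# shrink when they predict poorly. All names and numbers are hypothetical.

def aggregate(weights, predictions):
    """Pooled forecast: weight-averaged probability across subagents."""
    return sum(weights[name] * predictions[name] for name in weights)

def update_weights(weights, predictions, outcome):
    """Multiplicatively reweight each subagent by the likelihood it assigned
    to the realized outcome (0 or 1), then renormalize."""
    new_weights = {}
    for name, w in weights.items():
        p = predictions[name]
        likelihood = p if outcome == 1 else 1 - p
        new_weights[name] = w * likelihood
    total = sum(new_weights.values())
    return {name: w / total for name, w in new_weights.items()}

# One round on a hypothetical binary question.
weights = {"econ": 1 / 3, "biology": 1 / 3, "current-paradigm ML": 1 / 3}
predictions = {"econ": 0.7, "biology": 0.6, "current-paradigm ML": 0.2}

print(aggregate(weights, predictions))                 # pooled estimate: 0.5
weights = update_weights(weights, predictions, outcome=1)
print(weights)                                         # the ML subagent's share drops after a miss
```

The `likelihood` term here is just one possible scoring function; swapping in something other than legible advance predictions is exactly the kind of alternative I have in mind above.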