Coming across this series years later, but loving it so far!
I’m no expert on meta-ethics either (or neuroscience), but I disagree slightly with the meta-ethical takes at the end. In particular, I expect that moral reasoning processes converge a decent amount among most humans (leaving aside true sociopaths) under ideal conditions, although maybe not all the way. For example, I think I could predict with high confidence the broad strokes of morality in 2100 or 2200, conditional on a few things like humanity’s survival and the arrival of ASI. If I’m correct on this, I think that’s mild evidence for convergence. And it leads me to put more weight on something like the importance of a Long Reflection period.
I think we have different intuitions here because I put somewhat more weight on a lot of (bad, in my view) moral reasoning being either (1) reflectively inconsistent, or (2) based on a fundamentally confused understanding of the universe.
Examples of “bad” moral reasoning I expect to go away due to (1) or (2) above:
Not caring about animal welfare
Not caring about, or actively wanting harm to, certain groups of people
Most moral reasoning based on religious belief
Repulsion around a lot of transhumanist ideas
Thinking badly / demanding punishment of people who have caused harm/suffering (This last one is a bit more controversial, I’m sure, but I stand by it. Arguments against free will + a lack of instrumental reasons for justice in a well-designed Utopia seem enough to undo it for me.)
Now I don’t expect reasoning on everything to converge. I agree that different weights on different innate drives probably cause people to value things a bit differently, and these values may diverge further upon being able to self-modify. But maybe we just disagree on how much this ends up mattering for a Utopia? It seems to me that this would just cause people to seek slightly different flavors of fun.
One caveat: if people find ways to primarily use their upgraded intelligence to further rationalize their existing beliefs, then I agree that convergence is unlikely. I hope this doesn’t happen, and I think a Long Reflection period could be especially valuable for getting people not to do this. In fact, I think a Long Reflection or something similar is probably extremely important for the changes in the above list to actually happen, such that a future with a Long Reflection period would be in expectation much “more valuable” than one without such a period.
If you’re interested, I’d be willing to formalize the above prediction a bit more and make a (probably very silly and moot) bet on it. Maybe 1/1000th of my wealth (if there is such a thing) post-singularity (if we’re alive)? Could be your chance to win many star systems ;).
Thank you for the feedback. I am also confused about what exactly counts as a “bad habit” and why bad habits are bad (see footnote 1). I think one way of avoiding this confusion and still being able to reason about this argument is to just say that doing the bad habit less is less bad (or maybe “good in moderation”?), and doing the bad habit more is more bad. The point still stands: when your future actions are correlated with your present action, the single action feels more weighty.
I agree that future me is not identical to myself either. I think the correlation is high enough that it matters for me, but might not for everyone.
Agreed. A similar point that I wish I had included in the OP is that this situation can be seen through the lens of acausal trade between your past and future selves. If you run something like UDT with respect to choices that will affect your future self, you’ll find your present self in better situations because your past self also ran UDT, so you continue running UDT because of <insert favorite motivation for UDT here>. Importantly, this should work even if you have a high (or higher than you would like) discount rate.