Assuming his case is similar to mine: the altruism-sense favours wireheading—it just wants to be satisfied—while other moral intuitions say wireheading is wrong. When I imagine wireheading (like timujin imagines having a constant taste of sweetness in his mouth), I imagine still having that part of the brain which screams “THIS IS FAKE, YOU GOTTA WAKE UP, NEO”. And that part wouldn’t shut up unless I actually believed I was out (or it’s shut off, naturally).
When modeling myself as sub-agents, then in my case at least the anti-wireheading and pro-altruism parts appear to be independent agents by default: “I want to help people/be a good person” and “I want it to actually be real” are separate urges. What the OP seems to be appealing to is a system which says “I want to actually help people” in one go—sympathy, perhaps, as opposed to satisfying your altruism self-image.
Assuming his case is similar to mine: the altruism-sense favours wireheading—it just wants to be satisfied—while other moral intuitions say wireheading is wrong. When I imagine wireheading (like timujin imagines having a constant taste of sweetness in his mouth), I imagine still having that part of the brain which screams “THIS IS FAKE, YOU GOTTA WAKE UP, NEO”. And that part wouldn’t shut up unless I actually believed I was out (or it’s shut off, naturally).
When modeling myself as sub-agents, then in my case at least the anti-wireheading and pro-altruism parts appear to be independent agents by default: “I want to help people/be a good person” and “I want it to actually be real” are separate urges. What the OP seems to be appealing to is a system which says “I want to actually help people” in one go—sympathy, perhaps, as opposed to satisfying your altruism self-image.