I am verbally intelligent enough to spin up true-but-socially-plausible accounts of my thoughts unless I am pressed. It is easy to have cached a charming-but-honest way to respond to various adversarial or socially normative questions; it is difficult to come up with them in real time. Usually when I fail at this it’s because I was visibly thinking, from which the interlocutor discerned that I wanted to say something unflattering.
I do not think I am verbally intelligent enough to pick, practice, and retain a subtle meta-honesty policy. It would increase the thinking time too much.
I also do not like making decisions that constrain large numbers of my counterfactual selves. This is a decision-theoretic matter about which I Might Be Objectively Wrong Somehow due to ignorance of the math. Good thing I’m not updateless, and therefore can resolve uncertainty as logical. See first sentence.
I am capable of recursion and meta, and can even consider things that are meta-meta, but at three layers of recursion I usually lose track of what’s going on. I expect I could learn to do this given fifty heterogeneous recursion practice exercises but do not have such a list.
Note: The following is not a policy. It is a description of my existing behavior.
I sort people into basically three buckets: People, Instruments, and Dogs.
I don’t lie to People, because I don’t like lying to people.
I don’t lie to Instruments except on very specific topics on which they want me to lie. Then I sometimes lie. I try to minimize the number of people I treat as Instruments because they’re cognitively expensive. I interact with them “instrumentally.”
I lie to Dogs, because they want me to lie.
As a general rule, it’s easy to tell if someone’s a Person or a Dog, because the Dogs will tell you. If you treat someone like a Person and they don’t respond like a Dog, even if they’re not a Person they’re usually safe to treat as one. Then I do so, because lying is wrong unless you’re really sure the person wants to be lied to. Again, they usually make this obvious.
There’s also a useful trick where many socially-enforced lies are local.
E.g. “How are you doing?” “I’m fine.”
If you’re not fine, you can self-modify to feel fine for five seconds while still not lying.
“That’s rules-lawyering, and not in the spirit of literal truth.”
I disagree. “Literal” is a complicated thing which only seems simple. The correspondence between words and reality is subtle. It is easier to count than it is to define a bijection: in general it is easier to be right than to understand the underlying correspondences behind an existing instance of correctness. Does “fine” mean “not unhappy?” Or does it mean “unharmed?” Can I say “I’m fine” if I’m safe but unhappy when I’m speaking to a person too victim-brained to conflate unhappiness with someone hurting me? To me there isn’t an “obvious literal interpretation” at all. In the context of this social interaction, it barely even has a meaning and is basically a ritual object.
Occasionally another actual honest person and I have disagreed about the literal meaning of a question in real-life information exchanges; I thought he had an invalid theory of “deception,” but in reality he had different desiderata in asking the question than I thought, so his different demarcation of a dishonest answer was natural.
However, I recognize that the above logic can turn you evil, so in general I don’t use it when interacting with People, although I will still occasionally use the self-modification trick.
As a child I did not lie and was subtly punished for it. As an adult I started lying and it’s pretty cool and useful, although it’s quite skill-intensive. I might experiment with lying more because it seems very powerful. I do not think lying is like cigarettes.
I’m giving a detailed answer that goes into how I lie, but to be clear I’m exceptionally scrupulous about honesty and expect I lie extremely rarely by “normal standards.” Even now I want to edit that “extremely” to a “very” or “quite” because it’s not possible that regular people lie that often… right? I confidently expect that in the future people who are curious or curious-but-flinching can ask me about a topic and receive a true answer, unless we reach a level of normalized social violence which will make it obvious-to-both-of-us that truth is extremely rare.
I refactored my thinking similarly a while ago.
However, I feel like traditional virtue-ethical notions (e.g. “courage,” “integrity”) have the same adversarial Goodharting problem (3) as in your critique of utilitarianism. “Loyalty is when you obey the Master,” “courage is when you go to war against the Enemy without fearing death,” etc. I suspect utilitarianism is maybe only barely worse than other ethical systems with regard to (2). It’s worthwhile to compare EA against regular humans. I don’t really understand either.
I think of virtue ethics as something like “being a healthy and functional cog in the Humanity machine,” where the Humanity machine is ultimately utilitarian.
Further, I think a lot of arguments for utilitarian behaviors “pass through” to virtue ethics insofar as we think that traits like “ambition” and “scope sensitivity” are virtues. I think they are: seeing their characteristic absence has a similar sliminess to seeing a cowardly or slavish person.
(Sometimes it’s just a lack of underlying numeracy, which I would not consider a lack of virtue but rather of education. I spoke to a man who said he wouldn’t suck a dick for a billion dollars, because he just couldn’t. I walked him through the size of a billion, and he changed his mind.)