we need opacity with respect to ourselves so that we can’t edit ourselves, which then allows us to have rigid commitments, which is good for coordination.
And most of the post is downstream of that. Thanks for addressing the central claim.
But I think that last strategy [making one’s commitments rigid by being unable to consciously access them] is just one possibility, and it incentivizes irrationality.
Here’s another one: I know I need to maintain a commitment rigidly in order to coordinate with others. So I have a commitment which in some abstract sense I could edit, but I don’t, at a policy level.
The alternate strategy makes sense to me. I agree that long-term planning is real.
my roommate could kill me. But I know he won’t. Not because he’s literally incapable of it, and not because it happens to be convenient for his local incentives, but because he’s attending to more long-range incentives.
The brain does long-term credit assignment. And I think some of that happens via reflection on past and future experiences (“I don’t do X because I’m optimizing long-term Y.”). While not everyone does such a clean rationalist version, I think an approximation of that is often going on.
So my “read-only via opacity” is too simplistic. Let me do better.
I think a position I defend more strongly is: As your long-term reasoning (system 2) is iterated, practiced, and rewarded, it gets condensed into habitual responses by system 1. In the momentary situation of an interaction, the habitual action is selected (and in that moment you do not have access to edit it).
To explain, let me unpack your “I could edit, but I don’t, at a policy level.”
I would interpret “I could edit” as: “I could spend some time reflecting on my commitments and the related habits. I could change (e.g., renege on) my commitments in the sense of thinking about such changes, practicing them in hypothetical settings, or otherwise seeing the changes play out. I would cache these thoughts such that in a concrete immediate situation I would respond differently.” And “I don’t, at a policy level” then means that you do not go through these motions and leave your prosocial habits in place, because you know or estimate that in expectation that’s better over many roll-outs of the habits/cached thoughts.
Let me know if that unpacking sounds reasonable to you.
The key thing is that in the moment of a concrete interaction, we actually can’t go through all these motions of self-editing. There is not enough time. The actions come out automatically, habitually, naturally. And the other party can see that.
The habit is the interface.
That said, I do think it’s helpful to have a barrier between your social interfaces and your conscious processing, and I do think it’s helpful for coordination. It’s just that the mechanism I see goes through the fact that having that barrier lets you think more strategically.
Would you say that the system 1 caching of thoughts is the barrier?
I agree that this barrier is not one of opacity, at least neither in the sense of self-obscuring nor in the sense of factual inaccessibility. But the access that is there is too slow for an on-the-spot rewrite.
I do think that there is opacity involved in the process (see the opacity described in Parameters of Metacognition—The Anesthesia Patient), but its effect on the barrier and its enabling of commitment is indirect. And it seems more involved than I thought. Thank you for drilling down on the concept.
Tangent 1:
The rewriting process needs an entity that is doing the rewriting intentionally. As I tried to show in Between Entries, reflecting and condensing is a large part of how we build a self-model. So this strengthens the connection to the self as an interface.
Tangent 2:
With enough clarity, you can rederive why prosocial commitment devices make sense, and integrate that truth into your strategies
I agree that this is useful knowledge to gain, especially if you can make it habitual. But “with enough clarity” does a lot of work here. Such commitment devices do not work in all environments. So even if you can derive them (which may require contexts most people never encounter), you might not benefit from it.
(instead of having a bunch of potentially useful thoughts being unthinkable in invisible ways).
Agree. In the cached thought frame these useful thoughts would still be thinkable in principle. Whether a person does so or whether they are opaque to them is then a different question.
Yes, I meant that.