jimmy
The Uncertainty That Matters Isn’t Fundamental
But I don’t know whether you’re trying to change cultural or individual rationality here:
Like, I’m mostly optimistic about getting a few individuals to not do Crappy Epistemics, whereas I feel like you’re targeting groups,
If it’s from the criticisms of the rationality community for not berating people into rationality, oops. I should have put those in quotation marks. The point was “you’ll hear this from someone in this mode”, not to assert those things myself. I mostly see those exhortations as examples of the thing they’re railing against.
I’m not really trying to change anything here, just describing what is actually going on.
which seems difficult if I’m one of very few people who get what you’re saying.
Oh, for sure :)
But to be clear, I’m excited that you seem to be picking up on the same thing from another angle, separate from anything I’ve said.
I’ve gotten pretty comfortable with my ability to help people see the things needed to realign themselves on some object level issue, but I’m still new at communicating the meta thing of how to help people see how this alignment process works. Difficult, but fun/interesting, and I don’t think too difficult to learn.
If you’re saying “we need dramatically better instrumental rationality, of which short-term optimization targets are a big component” then yes, strong agree. I feel like you’re saying something else though, maybe about coordination between humans?
Yeah, I think that’s a facet of it. But also, yeah, it’s more than that. The stuff about coordinating between humans is just another facet too.
There’s a failure mode in changework where people try to “fix their irrational fear”, or “walk the client through what they need to do to fix their irrational fear”. These seem like perfectly reasonable responses which is why people do them, but watch it play out enough times and you start to notice that the stubborn attempt to control away the fear is the problem. That once you notice that you don’t actually know your fear to be irrational, you naturally turn towards noticing whether you’re actually safe, and that’s the move that conditions away inappropriate fears. That once you notice that trying to push people towards having a certain “correct” orientation to their pain is actually the thing that causes the suffering, you naturally turn towards “what’s the real problem here?” and that’s what dissolves the suffering in a word or “a few messages”.
This move of “Oh, control isn’t working. I wonder why?” turns out to be very general. Not just applied to one’s own mind, or helping others with their own minds, but also helping others learn to help others with their own minds and so on and so on. When my friend asked for help getting her four year old to take her eyedrops, I was able to play with my friend’s discomfort which led to her being able to play with her daughter’s discomfort, which led to her daughter playing with her own discomfort. “You’re a brown belt in jiu jitsu, what do you mean how do I get my child to take her eyedrops!?” → “Sigh. I just feel like I shouldn’t have to use force… and I guess this is another one of those things where my own tension is telling her to be tense, huh?” → “Mommy, can we play the eyedrop game!?”
Valentine has a good post “We are already in AI takeoff”, which I took a stab at putting into my own words in a comment there.
In short, and translating into the language we’re using here, “trying to align AGI” is itself an instance of this same exact failure to align ourselves. Because no one is looking at it like “Oh, yeah, easy peasy. I predict that I will not experience prediction error, because I got this”. It’s all pushing back in an attempt to control away prediction error because the consequences of failing are unimaginably bad, while failing to act on the uneasiness coming from predictions of this not panning out. Which turns out to be where all the most useful information is.
When I look instead towards “If this goes well, what does that look like? How did we get here?”, the answer I see is one where the people guiding the development of AI aren’t pushing away from any of the relevant information, let alone the information that they themselves perceive as most important with respect to whether what they’re doing is working and what kind of moves might actually work.
In other words, it’s one where the researchers themselves have enough embodied skill in alignment that they can approach the problem with their full faculties. Not just because “That’s what rationality is, and good instrumental rationality is necessary for succeeding at hard things”, but also “it’s literally the same skill”. In the same way that relating to one’s own mind is the same skill as helping a friend relate to theirs, which is the same as helping a friend help their kid relate to theirs.
Same skill, applied on multiple levels. The skill in “becoming rational”/“coordinating groups of humans”/“aligning AI” is all skill in alignment. Borrowing Val’s words again, “It’s fractal”.
When I think about “how to align AI”, I notice that I don’t actually know how to do this. There’s nothing I can see, where I think “Ah, this is the code I need to write” or “here’s the things I need to exhort at people to do” that will predictably yield the outcomes I want. Not through “targeting groups”, or “targeting individuals”, or “targeting code”.
And I notice that stopping to notice this is by far the most important thing I can do, since “trying” would necessarily be blind and therefore doomed to succeed by luck at best. And that one of the better “object level applications” for me right now is to highlight the nature of this move, since a big part of “Why is this control not working?” is “Because people aren’t aware of how control works”. Okay, cool. So if we change that, this part of the problem dissolves.
There’s something even more general and self referential that I’m fumbling towards though, since doing that thing is the thing I can actually expect to lead to the best possible outcomes—structurally and necessarily. But it’s a bit mindbending because “trying to generalize” is itself an instance of the thing I’d be trying to avoid (and so would “trying to not try” or concluding “we shouldn’t try”). “Generalizing is hard. I wonder why?” is the generalization. So I guess that’s the next thing to wonder, once I have some mental room for it.
Anyway, your post on BCI facilitated AI alignment looks to me like a step in the same direction. A step towards noticing that AI alignment is downstream of human alignment (in this case, because aligned and augmented humans are more competent which is instrumentally useful), and that the solutions which actually work have more competent humans more tightly integrated in the alignment process for longer—rather than keeping a stance of “I’m outside the system, aligning THAT THING is what I’m trying to do, dammit”.
I don’t think you’ve been explicitly thinking about it in the same terms I’m laying out here, but it does seem like it might be downstream of beginning to sense and act on the same thing I’m fumbling towards. Like you might be already on the same path fumbling towards the same thing that I’m trying to put a finger on (and noticing myself not fully having yet, in this sentence. Lol). Does this fit?
You’ve probably already mentioned it somewhere, but perceptual control theory relatedly posits that motivations/actions are just a way to control what sorts of things we experience.
I don’t think I have, but yes. Agreed.
If the same machinery underlies factual prediction and normative actions, we confuse them to all heck.
Yep! And there are pretty big practical consequences of this.
This is a clearer, much more precise statement of a somewhat different mechanism than I was originally proposing here. I’ll need to think for a bit about whether this changes my BCI-superintelligence stuff.
Hm. I haven’t had time to read and process your other post yet, but I do think that human alignment is important for having a hope at aligning things bigger and more intelligent/powerful than ourselves. Like, there’s a big “I’m outside the system!” type error, which systematically screws up control attempts because they don’t take into account the inside-the-systemness and attempt to align “them” instead of “us, starting with me”—two boxing AI alignment, basically. It sounds like maybe you’re on a similar track?
But even in situations like this, there’s probably confusion somewhere. Like, “why’s a fellow rationalist confidently Wrong?” is probably bubbling from some part of their mind, even when other stuff is talking over the confusion.
The problem is that this is in direct opposition to the attempt to control. Ask a thermostat why the room is too cold, and the only answer it has is “Because I haven’t added enough FIRE!”. Why is the rationalist confidently wrong? Because he’s a Bad rationalist! Why is he a bad rationalist? Because y’all haven’t called him out for his Badness! More shame! Beat him into shape! Why haven’t we done that? Because y’all are bad rationalists too! That’s why I’m yelling at y’all to fix you!
Pondering “Hmm… I dunno. Maybe he doesn’t see this piece?” requires people to relinquish the control which they’ve already decided is worth doing, so you’re gonna get “control” type answers unless you push back against their control loop hard enough that they let go. I’m not saying you can’t do it, but it’s gonna take some oomph which has to be sourced from somewhere—which makes it trickier to self apply. The question “is it working?” comes from outside the loop and points at the loop itself, which makes it a lot more widely useful and easier to self apply.
I do think we’re mostly in agreement. If I read your post as a blurry gesture at a shape of things, I think you’re getting the picture exactly right which is why I’m so excited to see another person “getting it”. So, “Yes! Strong upvote!”.
If I read your post as speaking technically and precisely, I see details that will need to be changed in order to get things to actually work in practice.
I “want” both of us to get to the truth; but I’m locally intuitively trying to get them to change their mind.
I’m saying that this is directionally correct, but that the same problem shows up with respect to “changing people’s minds”.
“I “want” to get them to change their mind (because that’s what gets both of us to the truth which I already have); but I’m locally intuitively trying to push away from experiencing wrongness”
What we’re actually optimizing for is something more specific and even more out of alignment. If we were actually optimizing for changing people’s minds, things would play out very differently because it would lead right back to truth seeking.
“What am I confused about” slowly but predictably clarifies where reality is biting your models, so long as you’re actually optimizing for understanding your confusion.
The mental state which we label “confusion” when we notice it in ourselves, is the state of feeling disoriented. If we were to put into words the implicit stance of this state, it’s “I have not been able to properly orient to this situation. I don’t know what to make of it and I’ve noticed this”. Without noticing that your models are failing you, it just feels like the world is wrong. “The problem is that the peg won’t go through the hole!”, not “I can’t figure out which hole this peg goes in”, let alone “Lol, square peg don’t go in round hole”.
In order to enter this state we label confusion, we have to notice that the wrongness we perceive is in our map, not the territory. We have to notice “Ah, I feel wrongness, that means I’m not oriented properly. I don’t know what to make of this”.
When you ask a Rationalist why they’re frustrated in a dialog that isn’t going their way on LW, they’ll tell you they’re frustrated because that other guy is wrong on the internet—and they’re supposed to be rational, dammit. Even in well respected (and otherwise respectable) rationalists, there is often-to-usually no recognition that the feeling of wrongness that they are modeling as “in the other guy” is actually in their own models, necessarily. This is where we go wrong.
And asking “What am I confused about?” won’t actually help. Because there’s no such thing to notice. An outsider may describe the struggling person as “confused” or “disoriented”, but from the inside they have no feeling of disorientation to notice. In their own model, they are oriented properly; it’s the other guy who isn’t! So far as they’re concerned the problem is the hole, not the fact that they’re trying to shove a square peg into a round hole.
The thing which is actually there to notice is our perceptions of “wrongness”. If we ask “Where am I experiencing wrongness” then we’ll find it immediately. I experience it when the peg won’t go into the hole. When the guy online doesn’t change his mind.
This is the thing that’s actually there to notice, which we can then use to take the next step of “Why is the peg not going into the hole, do you think?”. Why is that guy wrong on the internet? This wrongness I’m feeling… what’s that about?
And this is the step that updates models to better match reality. Including noticing the inner misalignment that has been screwing everything up.
It’s true that sometimes we have tiny little notes of discord. Tiny little hints of “maybe I’m confused here?” that we fail to give due weight. But by the time there’s even a hint of subjective confusion to notice, we’ve already done the hard part. Most of our failures are due to stubbornly externalizing wrongness because that’s how we try to control, and we don’t want to give that up until we see a better way.
I’m excited by this post, because I think you’re onto something big and getting it almost right. Strong upvote!
Let me summarize what I’m reading here and you can tell me if I’m understanding correctly.
We simulate forward, including reward, and this means we can drift away from reality if we don’t re-ground ourselves deliberately.
Rationality techniques often fail because they assume we’re actually optimizing for the things we think we’re optimizing for, and this is empirically false: “If you ask such a person what they’re trying to do, they’ll describe something different from what they’d feel by introspecting during the process.”
Because of this, trying to “install TAPs” is fighting the current, which takes effort, and is less effective than we’d like
Maybe we can figure out how to want truthier beliefs, so that we aim for them automatically the way a baby aims to walk? This means getting actual truth seeking rewarded.
The key is to become aware of what we’re actually optimizing for
But asking “what would it feel like if my model were wrong?” feels underspecified because you’re throwing away your whole model you use to predict
So notice confusion instead.
I’ll start with where I agree. Then give a worked example where I think your details aren’t quite right, then gesture at the general direction.
Consciously tracking what you’re actually optimizing for is one example skill which seems worth developing despite its time cost
This is quite the understatement. You’re talking about inner alignment. This is the skill for rationality IMO, and has strong connections to the big problem of the day once you start to notice it—and I see from your next post that you have.
Say I’m explaining to someone why they’re wrong about X. I “want” both of us to get to the truth; but I’m locally intuitively trying to get them to change their mind. They’re wrong, after all. I’m speaking some words I think will get them to believe not X, but Y...
[...]
So now I’m optimizing, intuitively but not explicitly, to change this person’s beliefs. What do you expect I’ll feel if they give a strong counterargument?
I’ll feel defensive. I don’t endorse this feeling, but my short-term intuitive goal is beset!
This is exciting to see because I rarely see people diagnose the core problem so clearly, but the details are where you’re going to struggle in implementation.
A bit of backstory, I found this stuff from a different angle—by using hypnosis as a starting point to understand how this all works and working up from the “bit banging” towards a more generalizable model of what’s going on. This “Trying to explain to someone they’re wrong, but it doesn’t work” is actually the opening example/challenge I give for my sequence on the topic.
You’re right that we aren’t actually optimizing for “getting both of us to the truth” and that this is why we don’t progress smoothly towards the truth. The problem is that “trying to get them to change their mind” isn’t a sufficiently complete picture either.
We’re trying to get them to change their mind conditional on us being right—in other words, trying to assert our perspective at them, expecting this to change their mind. We’re not drawn towards any possibilities that involve us even examining whether we could be wrong, even if this would be the persuasive thing to do—as evidenced by the predictable defensiveness should we be pressed in that direction.
If we were actually optimizing towards “change their mind”, we’d notice real fast that our own defensiveness gets in the way. Because we’d notice that our first impulses aren’t going to succeed at “changing their mind”, and wonder why that is and what we can do to fix it. And the obvious answer, once we want to find it, is because we’re coming off as closed minded and not open to finding out if we’re actually right.
Which makes sense, because what do we do when people assert their perspective at us, expecting it to change our mind, without even considering what they might be missing? Okay, so the way to change their mind is to demonstrate open mindedness and… shit, yeah, that feels aversive to me. Okay, what now?
What we’re locally optimizing for is even less flattering. We’re trying to push away their disagreement. We’re trying to batter it with arguments and reason until it stops reminding us that the expectations we’re placing on the other person’s beliefs are wrong. This can kinda sorta work, when social pressure is enough for suppression, but it never actually changes their beliefs let alone helps us find reality.
There are a lot of moving pieces to sort out, which is why I didn’t close the loop on the opening example until the capstone post, but the take away is that aligning our local optimization with our conception of what we’re optimizing for brings both towards “finding truth” and, as a result of bringing us both towards truth, conditional on us being as right as we thought we were, convincing the other person.
So like, expect to actually change their mind if you’re so right, and then notice what happens.
But what would it mean to run the negative query “what would it feel like if my current model of this situation were false?”
If I ask “what explains this under my current model?”, my mind has something to roll out; how’s the simulator supposed to run “what explains not(this) under my current model?”
Right, it’s a confusing question to grapple with. It seems unproductive because “If I were wrong” is assumed, not evidenced, so it could go anywhere. Then what am I using to predict? How do I choose which bits to pretend aren’t actually real, in my best guess?
Fortunately, we don’t have to assume, and we don’t even have to notice confusion.
We just have to notice wrongness. Dissatisfaction with reality. Dukkha. Prediction error. It’s everywhere.
When we’re frustrated that the other person isn’t yielding to our obvious rightness, why? We’re expecting them to believe us because we’re right, and they’re not believing us. Our expectation is being empirically falsified, and the path towards becoming less wrong is to start wondering why that is.
We’ll often end up at places like “Because they’re DumbBad, that’s why”, which, okay, fine. Maybe. But then why were we expecting them to listen to our facts and logic? Is that what DumbBad people do? Until all feels right in the world, we’re wrong about something, and that feeling of “not-rightness” is the clue as to where our predictions have diverged from reality.
We might be patting ourselves on the back when we get things that we predict should be rewarding, but unless we’re actually right reality is going to be slapping us in the face about it sooner or later. We might misinterpret the slap at first, but if we keep going it’ll lead us back to ground. And we can update on predicted slaps before they happen, even, and do a lot of the work ahead of time before we actually do anything wrong.
Confusion is what it feels like to notice that our models are insufficient. There won’t be confusion to notice until we notice that our models are insufficient, and it’s noticing “wrongness” that gets us there.
he just seems socially sharper in private emails than public comments
Interesting.
I think it makes sense though, for the reasons you gesture at. I have at times responded to charges of “You’re not wrong, Walter, you’re just an asshole” with “Except I’m also asserting that I’m not the asshole and that you are. Can you tell me I’m wrong now?”. Funny that no one can say “Yes” :P
It’s definitely a mistake to write Walters off as “missing the social signals that The Dudes see”, because sometimes Walter is the only one that remembers when rules matter.
Honoring the spirit of Said, I should note that my subjective perception of private correspondence is worthless as public evidence,
Eh, it’s pretty clear to me that this isn’t something you’d lie about. And now, to anyone who trusts my trust in you. And to anyone who has similar trust in Anna, since she did the “Agree” thing.
That kind of thing won’t reach everyone, but I don’t think it has to.
I say more in my other response, but basically, I am deeply suspicious of any “I should ban him because he’s not updating” because of how cleanly justification enables pressure to update or “self-ban” and how easily the justification serves as a shield. So like, “If I’m as justified as I think I am, and he’s as beyond reach as I think he is, maybe I should prove it”.
This can sound kinda naive at first glance, because everyone has seen first hand that some people are “unreasonable”, but I’ve pushed on this hard. Like, in my perspective pain is obviously just information and can’t ever actually be a problem… which seems absurd because of the obvious reality of chronic pain and the implication that if I’m right then engagement should lead to people with chronic pain perceiving their pain as not a problem, so… What happens when we try it?
After pushing in this direction for the last fifteen years, I’m not finding much room for valid excuses. Every time I fail to get through to people it turns out to be a skill issue on my side, and that skill is in epistemics, and structurally it kinda has to be.
So for example, the only person I’ve ever blocked on any platform is Said. To give myself a little credit, it was for one post in particular which he had already proven he couldn’t handle in a previous exchange on the topic. And it wasn’t “I need to ban him because he’s unreachable”. I would have had a good time going further with it, if it weren’t for my perception that LW didn’t have the stomach for going any further (and to Said’s credit, he did).
But still, according to my own principles, it kinda seems like that too should fall away? Like, if I actually know LW norms to be wrong, they should dissolve on contact with how I relate on this topic. But I didn’t anticipate that I could just poke fun at Said in a loving but teasing way and have the median LW voter notice “Ah, this is what it should look like!”. I anticipated that LW would flinch in the same way I anticipated Said to flinch, and motivatedly not-acknowledge the warmth and implications of their discomfort.
Running that simulation forward, as I write this comment, what I actually anticipate is provoking so much insecurity that if I keep going down that path I either get banned or write off LW myself… which I don’t want to do. So there’s my own flinch. And like “Yeah, so don’t do it, dummy”, which I didn’t. But also I hadn’t fully worked through that flinch because mourning that loss of “At least Less Wrong can be expected to get this… right? Right?” was overwhelming for me. It has been a lot of work for me to lower my standards until I no longer have contempt for most people, and meeting the people who showed up to early LW meetups was an oasis of much needed relief. So to have that threatened, to me, was a bit hard to take. Still is, I guess.
I’ve been chipping away pieces of it and am somewhat better now, but it’s still my own wrongness, and neglecting to mourn goes against the very principles that the sequence is about in the first place. I think it would have been better if I let Said make his predictably dumb comments, do my damn mourning, and then respond in a way that demonstrates the principles I was trying to gesture towards… even though it would have been super off topic for that post in particular, and it doesn’t feel like that’s a “fair” standard to hold myself to because it’s a significantly higher bar than Said is holding himself to or whatever. Oops. My bad.
So yeah, totally understandable move. I did something very similar myself, even though I probably wouldn’t have if I was sitting on the other side of the ban hammer. But I still ultimately see it as a failure to live up to ideal truth seeking to do it. And even little violations matter in big ways downstream.
Yeah, I do. The same problem exists in one on one situations, actually.
So, “communication that is wanted and received” is often the right play in a one on one thing. It’s what I’m aiming for here, and wouldn’t change if this conversation were in private. You can’t really get through to people if they don’t wanna hear you, so talking at people who aren’t interested is a big failure mode and all that. Sure.
At the same time, there is another function of private speech, which is filtering.
Say we’re interacting one on one, and I’m coming off like an arrogant jerk. This gets in the way of the interaction because arrogance makes people dumb, so you can’t take my words seriously and we can’t actually do things. To the extent that you care about being able to interact with me, you have reason to find a way to communicate that I’ll want to hear. “I’m struggling to interact with you. Would you like to help me figure out how to go about this?” rather than “You’re an arrogant idiot”, perhaps.
Sure.
But what if you don’t have reason to care that much, or I’m pushing back against your attempts to help me get along with you? What do you do? “If I have nothing nice to say, I don’t say anything?”
That’s the most common response, among people who try hard at these things, but it’s missing a lot. One of the most significant relationships for my development came from what can be caricatured as “You dumb” “no u, because X” “No u, because response to X”.
If you would have watched that conversation and the immediate aftermath, it would be easy to conclude “See, wasted time. Shouldn’t have offered what wasn’t wanted”, because neither of us changed our mind on the topic of discussion and we both agreed to go our separate ways and stop interacting.
But what ended up happening is that he reached out again because he could see that I wasn’t actually dumb (even if he thought I overreached on points) and the situation was symmetric in that respect. If either of us had gotten upset and run away, then that’s the filter working as intended. Dumb worthless people running away when shown what they are. But the fact that we didn’t dodge this conflict is what demonstrated that we were both able to face what the other person thought they saw, and either counter it legitimately or else find the humility to update.
The extra ingredient here is
“I’m gonna show you [what I see to be] the truth, including the truth of how worthwhile you are as a person, and find out if this is something you want to hear, at the level of effort I’m willing to put in to make it palatable”
It’s a filtering move. If he runs away, good. That’s what I want, given that he’s not interested in humbling himself far enough to find the truth that I can see, at the costs I’m willing to pay to make it seeable. If he doesn’t run away, good. Now I can see that he actually wanted to hear this, and that enables useful interaction. Most will fail the filter if you filter aggressively like that, but damn you’re missing a lot if you don’t give people the chance to prove you wrong about what they’re interested in hearing.
Tell me I’m being an arrogant jerk, and if my perspective is more compelling than yours here I can just share it and win you over. If it’s not, I’ll feel it. And then I’ll have to choose whether to dodge (to which you can keep hammering me, until I decide differently), or accept your points, or run away myself. No matter what happens, you win.
… unless… you are actually dodging yourself. Because inviting conflict doesn’t help you avoid recognizing your own wrongness. But you wouldn’t want to do that anyway, right? :P
The whole impulse to “be a jerk”/“Tell people they’re wrong on the internet”/“engage in status BS instead of cooperative truth tracking” comes from the perception (or perceived ability to claim) that the other person’s self concept is delusional to the point that it is interfering with cooperative truth seeking, and that pushing back is the necessary impetus to work towards a solution.
The problem, as I see it, is that we’re so starved of examples of people “fighting” in ways that prove productive—fighting playfully, with engagement rather than flinching and upsetness driving the boat—that we’ve come to believe that the rational thing is to avoid such conflict and emotions, instead of learning how to have them productively. Instead of learning how to engage in them such that they resolve, how to engage in them such that the unworthy filter themselves out, etc.
I write in depth about these dynamics in my sequence “Beneath Psychology”. The short version is that we need Security so that we can negotiate for Respect, so that we can even get started making progress on the object level. But we conflate lack of Respect with lack of Security, in part because we’re motivated to cover up our own insecurity and “disrespectful!” is one way to do it, and we miss the ways in which Security is downstream of actual clarity. So we end up doing insecurity at the process of resolving disagreements on Respect, which is the exact thing that blocks resolution of disagreements/updating on evidence.
And all of this dissolves once you see the pieces for what they are. Because, for example, instead of “He’s disrespecting me!”, it’s just “And? Am I worthy of it or not? I can show up and we’ll find out, if I want. Do I want that?”
I agree with the emotional oomph behind this, and the thesis that the ban betrays the values. That matters, and it hasn’t gotten the attention that it deserves IMO.
At the same time, I don’t think this essay will have the effect that you desire because it’s missing some things that are important to understand about why things have unfolded the way they have.
To be clear upfront, I don’t have any opinion on the ban itself, and I feel fondly towards Said because I recognize pieces of myself in him—for better and for worse. In this comment I’m speaking from a place of “here’s what it looks like from this perspective”, and not actually asserting any of the criticisms or defenses of anyone.
Anyway, quickly picking at it, line by line:
Said, no less socially perceptive,
Honoring the spirit of Said, what evidence do you have for this claim?
Yes, it is disappointing when a post doesn’t land in the comment section the way I wanted it to, but it’s obviously not because commenters are culpably withholding their illumination of my precious ideas.
On the contrary, this is not at all obvious. People often culpably withhold their illumination of ideas that challenge them. Take any political argument ever.
(As for “implying that they are superior in their dismissal of your irrational and dumb ideas”, I suppose it’s true that when someone rejects an idea, that creates a logical implication that the rejecter thinks they’re more rational than the idea-proposer on that topic. It’s a weirdly petty implication to focus on, though. Who cares?)
Approximately everyone cares. That’s who. Even those who try to not care because they think they shouldn’t.
Criticism isn’t easy for you to take, right? If someone seems to be handling it easily, what do you think is actually going on behind the beliefs they consciously claim to hold?
Handling it well doesn’t mean “not caring”. It means not being thrown off by this. If someone rejects an idea of mine, and they’re right to do it, that means they caught something I missed. It means they’re capable of seeing things that I do not. This matters! I want to notice this!
What is most striking about the second perspective’s list of reasons that critical commenters make Less Wrong unrewarding to interact with is what it does not say.
It doesn’t say the commenters are wrong!
[...]
We can infer that bad arguments and strawmanning are not seen by user reports as a pervasive problem on Less Wrong. (If they were, the second perspective would have mentioned it.)
This tells you that they can’t see the problem clearly enough to articulate it.
I know I have been criticized in ways which were factually incorrect on the face, by people who couldn’t articulate the actual problem because they were in no position to criticize… which were nonetheless gesturing at a real thing I was doing wrong.
It’s true that posting on Less Wrong can be “unrewarding”—but that’s because the rewards are real and therefore have to be earned.
The big problem is that this misses the most important type of reward altogether.
There’s the reward of the pat on the head, and then there’s the reward of becoming less wrong. That’s the one that matters, and isn’t covered here.
The process of learning to be less wrong, when done right, feels less wrong. Feels good, relative to the baseline you’re leaving.
It’s absolutely possible to foster a culture in which people enjoy being called out and corrected. For now, I’ll leave “What does this require, which is different from LW as it stands?” and “which is different from what Said does?” as an exercise for the reader, because the important part is that more is possible. Humility can feel good. How?
The reason Said needed to be banned, at least according to the perspectives that enacted this ban, is that the community norms here either failed to foster the humility in the community needed to learn from Said—or else failed to foster the humility in Said needed to learn from the community.
What is missing in the norms, and what would it look like to facilitate this learning? Why don’t people already like learning when they’re wrong, here? Why don’t people already like teaching people that they’re wrong, here?
The only way I can read it as “an implicit claim that [Hoffman] violated some social norm” “for which he deserves to be punished” is if there’s a social norm against making posts that can be criticized,
Oh, no. Said is absolutely attempting to deliver the according-to-him deserved punishment, because Hoffman violated Said’s norms.
Brief overview of the conversation, then we’ll go through it line by line:
“[quotes] hm”
“What’s your best guess as to what I meant here?”
“I assumed you meant what you wrote. It does not seem mysterious or confusing, just contradictory. (If you meant something other than what you wrote, well, I guess you’ll want to clarify.)”
“In case my point was obscure—time travel stories aren’t real, and Robinson Crusoe is also fictional.”
Said didn’t even make any substantive comment. He just juxtaposed quotes as if that’s all that’s needed. But he didn’t get the response “Ah, good point. I guess my whole post is wrong then”, or even “Ah, I see the confusion. Let me clarify” that he would have gotten if “all that’s needed” was correct.
Nor did Said show any signs of surprise. He didn’t say “Wait, you don’t see a problem here?” or “Oh, I misread?”. Instead, he doubles down on the presupposition that his interpretation is the only correct one, by conflating the meaning with the words used to convey the meaning. Said responded with “I assumed you meant what you wrote”. He didn’t respond with “I assumed you were intending for a relatively literal interpretation” or “Maybe we attach different meanings to the same words”.
Said is behaving exactly the way one would expect someone to behave if Hoffman is violating the norm of not saying obviously stupid shit. “Dude, I don’t even have to say anything for it to be obvious what’s wrong here. That’s how dumb it is. No, it’s not possible that the problem is on my side. I’m literally reading the meaning straight from your words with no interpretive layer, and if I read things wrong you should have written them right. Norm violation number 2!”
This is very different from someone who is offering an explanation that he expects the author to not have seen, or that he expects an author worthy of feeling welcome at LW to have not seen. And that communication of “You are not worthy of feeling welcome at LW” is exactly what “social punishment” is, right? The word “criticism”, as you’re using it, papers over the distinction between “here’s an error in your object level reasoning” and “here’s an error in thinking you even belong here, loser”.
If I read that same post and thought that Hoffman deserved to be in LW but couldn’t square such a seemingly obvious mistake with the idea that he shouldn’t be shunned off of LW, I’d ask. “Doesn’t this fall apart as soon as we notice you’re talking about fiction? What am I missing?”.
That’s the same information conveyed without the contempt. That does a better job teaching people like Hoffman, conditional on Hoffman being the one that’s missing something. And a better job teaching people in Said’s shoes, conditional on Said being the one that’s missing something.
The only question left is whether the implicit claims in the contempt are correct—a topic, which I’ll note, is kinda off limits to discuss explicitly in the comment section of this post. For in order for Said to speak plainly, he’d have to say “This is such a basic error that you do not belong here”. And in order for Hoffman to respond in kind, he’d have to say “No, u”.
But we can’t handle that, can we? We’re not going to resolve that one with either side saying “Okay, you were right” or “I was wrong”—are we?
What do you expect to unfold when these disagreements can’t be hashed out in the light? Is there any surprise that things have unfolded the way they have?
Achmiz is privately making a negative character judgment about the interlocutor as a person, even if Achmiz didn’t say that. Such an inference might well be correct, but it’s hard to see why it’s a moderation issue.
Well, it’s a moderation issue when correcting such incorrect judgements itself becomes a moderation issue.
Is it a moderation issue when someone says “You’re such an idiot, Said. No one respects you. You act like your judgements matter, and they don’t. You are too emotionally cowardly to face what you know yourself to be, and as a result you fail to meet the bar to be worthy of contributing here”?
If your answer is “no”, what moderation is left? What do you expect to happen to one more essentially unmoderated forum?
If your answer is “yes”, then what happens to the marketplace of ideas when you tax representation unevenly?
Punishing Achmiz and only Achmiz for frequently finding authors’ replies insufficient (because people suspect, rightly or wrongly, that Achmiz is privately judging them) would be incoherent,
That’s not the issue, at all.
No one has a problem with “finding authors’ replies insufficient”. I find your replies here insufficient. Yet you know I respect you. You can tell, ahead of time, even, that I want you here, posting this stuff. And I bet this makes it significantly easier and more pleasant to deal with even before we get into how that relates to the epistemics.
The problem is that Said jumps from “finding authors’ replies insufficient” to leaning on the status ledger that we’ve kinda sorta agreed to not lean on, and that people can’t, within moderation, respond to with seemingly-appropriate force in kind. And he leans on it in ways that don’t read as “fair”.
To the extent I make a good point here, you will learn on the margin “Oh, Jimmy knows shit”. Even if I do it respectfully, if I make the point that you were missing more than you had realized, you’ll learn “Oh, I don’t know as much as I thought I did”. To the extent that I don’t succeed in making these points, those updates don’t happen. So either way, any updates will feel fair, because you chose them.
Tell people what they’re supposed to think on the status ledger, and they’ll be inclined to push back. Ban the pushback, and pressure will build until something gives.
I’d like the dam to not fail as much as anyone. “Dam failing is bad!” is true, and I think actually does need to be said here—so strong upvote.
At the same time, water pressure gonna do what water pressure gonna do. “Dam shouldn’t fail” doesn’t have much traction on me because it’s just physics. Actual goodness is served by routing water efficiently such that pressure to abandon our values never builds in the first place. This requires modifying our values to be more internally consistent, and noticing where they are failing us.
Let us not resist the pressure to abandon our values, but instead notice the pressure, and what it says about our values as we’ve been understanding them.
Pace, for example, characterizes Achmiz as “holding extreme disdain and disrespect for interlocutors while being committed to never saying anything explicitly or even denying that it is the case.” [...] … But It’s Hard to Believe That Open Disdain Would Be Better
Right, so he’s good at noticing the norm about saying this stuff in the open and doesn’t say it in the open. Nor does he attempt to pretend his perspective is different than what it is, so people notice. This is good! He’s respecting the rules, being as honest as he’s allowed to be, and people are objecting to his actual beliefs! The ones he’s not even allowed to argue for!!!
Because they’re not allowed to argue back, sure. But man, what would you do if you were to realize that most people on LW are morons who don’t belong because they don’t even care to make sure they get basic shit right? How would you feel if no one could argue you out of your growing contempt, because instead of countering its correctness with arguments, they just got upset at you for daring to look down at them when that’s where they’re speaking from?
Once you realize that you’re in fact better than everyone else, such norms really paint you into a corner about what you can do about it.
Which is a problem both if you’re correct about being better than everyone else and if you’re incorrect. We can’t even figure out how much is column A and how much is column B, because we can’t talk about it.
Am I … somehow wrong about that? Have I been deceiving and gaslighting people this whole time by engaging with the substance of their ideas and trying to keep my thoughts about how stupid and dishonest they are to myself?
Yes, but you’re following the norms. And that’s probably good because it’s really hard to tell people that they’re stupid and dishonest in a way that they register as rewarding enough to sign up to learn from.
Engaging with the substance of the ideas is a big part of doing it skillfully, and it’s good to notice when you don’t have the skills to pull something off… but yes. There is a skill in telling people how stupid and dishonest they are in such a way that they realize you’re right and thank you for it. And furthermore it is a rationality skill. A skill in noticing what you’re really saying and believing, and whether it is as justified as you think it is. A skill in noticing how their beliefs are formed, whether they might be right in ways you’re not tracking about things you don’t realize they’re saying, and whether saying the things you think should be said actually does what you think it should.
I am not a polite person and am not committed to never saying anything explicitly nor denying that it is the case, so if it would help, I’m happy to clarify that I hold extreme disdain and disrespect for the entirety of the Less Wrong moderation team and a decent fraction of the userbase for general reasons that are neatly exemplified by the concrete case of their relentless persecution of Said Achmiz.
What a polite and respectful way to say that.
To the extent that I might qualify as part of the userbase for which you hold some amount of disdain/disrespect, thanks for respecting me enough to say it. That’s what’s important to me.
Your post here seems like a good example of the kind of straightforward aggression that is permissible, and I notice that it takes an amount of both skill and time that is prohibitive.
“Criticism will be accepted, so long as every other line rhymes, each line contains a prime number of words, and there are no typos. Response time 1 year”.
Rather, I am writing to the world in defense of a principle that I’ve spent the last decade of my life fighting for: that the truth matters, and specifically, that the truth matters more than people’s feelings
Thank you. Sincerely.
The one little addition I’d like to make is that feelings are about reality as well, and as such embody statements with truth values. Knowing what to make of these matters immensely, because otherwise we cannot orient to what the challenge actually is.
I’m not talking about “by default”. Of course that is going to fail, for the same reason as “assume by default that every scoop of dirt has a diamond in it” will fail. Most people aren’t interesting, and if you’re paying attention to how the expected payoffs evolve, it doesn’t take years to notice that you need to be interested in a lot fewer people.
The fact that you frame this as “I tried it, for years” shows that you’re taking a very low resolution long feedback loop approach to this. The things I’m talking about don’t show up at this resolution, and take too many feedback loops to develop on these timescales.
The question isn’t “Is it better to assume every scoop of dirt has a diamond in it, or give up altogether”, it’s… well, a bunch of things, but “Does this particular scoop have a diamond in it?” is one. “What am I even mining for, anyway?” is another. Who did I take an interest in, why did I think they were worth being interested in, how did I get that wrong, what is the updated set of criteria for someone to be interesting? How much can I trust my own predictions about who will prove worthwhile?
What I’m pointing at has nothing to do with the answer you have to any specific question. If you think most people are uninteresting, fine. Me too man, me too.
What I’m pointing at is that the presuppositions you’re using, and the way you frame things, heavily suggest that there are specific questions you’re not asking—or at least, not sitting with for long enough for the answers to shift. There’s a fluidity to be had, which allows for better conformance to reality, which isn’t showing up in how you’re writing about these things.
But I’m definitely coming in with an attitude of “How soon is interacting with this person going to be net positive for me?” and “How soon is she going to pull her weight in our interactions at all?”. In practice, sex is by far the most common way to get a positive answer to those questions quickly. (Other paths to a positive answer in-principle include unusually good dancing or her organizing fun outings or me learning interesting things from her. But all of those are rare, and it’s extremely rare for any of them to be as good as my typical sex.)
The interesting part of this is how you ground “net positive”, and what this does to your strategies.
To give a toy example, say you meet a woman who teaches you how to change a car tire, which comes in handy down the road when your tire blows out. It sounds like you think “Yep, value acquired” at step 2, because “this is interesting” which can serve as a proxy for “might become useful to me someday”. But you could also think “Nothing this woman has told me has been useful at all” right up until your tire blows out—at which point you think “Ah, she’s starting to pull her weight!”.
It’s just a different point at which you measure the expectation of value. Do you recognize “this is useful in expectation” because you know flat tires happen, or do you wait for it to be proven that you will get a flat tire? Or hold out even further with something like “Yeah, but maybe someone would have stopped and helped if it didn’t look like I was already taking care of it, so she still hasn’t proven value?”
This can go in the other direction too, with “Ooh! Interesting person! Value acquired!” before she tells you this fact at all, because you recognize in expectation that she’s likely to tell you things which are then likely to prove useful. If you do this, people can become positive sum much more quickly just because the accounting is changed to incorporate longer time horizons and more abstract abstractions. You don’t need her to save you from a flat tire immediately, so long as you can predict immediately that she’s likely to provide value in some way down the road.
The relational strategy you describe is very Kegan stage 2, which isn’t necessarily a bad thing, but it does suggest there’s a lot of value in understanding what stage 3 would look like and how it might outperform stage 2 for you even by your own values. For me, it was a very eye opening experience to see what it’s like taking an inherent interest in people. Before having this experience I would have written it off as “But what use is this person to me? There’s nothing interesting here, so ‘taking an interest’ would be fake and wrong”, but it turns out I was missing what it’s like to do it right. Done right, “inherent interest in people” still serves the self, just with implicit recognition of the fact that at some point trying to add every detail to the ledger costs more than it pays for.
The tricky part about this is that it’s not really about “conscious recognition of these facts”, but the lenses we use to experience the world. Sex feels “inherently good”, for example, whereas talking to some girl might not. But casual sex can lose the reward once we reflect on whether this formerly-expected-value is actually cashing out anywhere, and talking to new women can begin to feel rewarding once we realize that this one actually might. What can feel like “my terminal values, not to be fucked with” often turn out to be cached instrumental values for which we’ve lost the audit trail. The value here is in redoing the calculations so that we’re not left valuing the wrong things by our own (smarter, more reflectively coherent) values. And the trick is that until you can represent what it’d feel like to have great sex and be “meh” about the great sex, and why that might be desirable, it’s hard to even see the distinctions being updated.
I’m reminded of an AI researcher I worked with in 2024 who called his prior seven years of inner work (Vipassana, Tibetan Buddhist meditation, various retreats, IFS) “LARPing working on the problem.”
This part is interesting, since he’s been on both sides. It’s not that he “just wanted to LARP”, or he wouldn’t have done it for realsies and lost interest in those communities. At the same time, by his own description, what he did for the first 7 years was LARPing at doing the thing instead of doing the thing. Not “Genuinely trying, and failing”. LARPing. Wearing the costume.
I bet you there are a lot of people in these shoes. LARPing and playing pretendies until someone shows them the real thing, at which point they eat it up.
A big part of the problem, maybe even the only problem in a real sense, is that they don’t know how to take it seriously. Don’t really know that they aren’t taking it seriously. Because if you face that, then you have to take it seriously. And that means you have to look for reviews with actual results that you can expect to apply to you, and we all kinda know already that we’re not going to find that. Which means that we’re going to have to sit with “Oh god, they’re all useless, aren’t they. Am I hopeless?” as a real possibility. Which is a hard thing to sit with and take seriously. Especially as someone who has been flinching from “what will people think of me” into an anxiety problem, and flinching into LARPing at addressing their anxiety. What’s one more flinch? This one comes with reassurance from your therapist and support group that your therapist is doing a good job, and that you, like they, are trying your best.
Then, once you have an actual option to work with someone who does it for realsies, of course you take it. You don’t need to put on the costume anymore because you might actually become the real thing.
On the other side, I think it’s much the same. There’s still reason to flinch, only this time motivated by guilt and a need to pay the bills. I’m reminded of watching my daughter learn to play “play ‘hide and seek’” as a stepping stone to learning to play “hide and seek”. Mentors who track their results are rare. And if you never meet any older kids playing “hide and seek” for realsies, it becomes hard to even notice that you’re LARPing even without reason to flinch.
I have a lot of sympathy for people stuck in this trap, even as the whole thing is kinda frustrating.
I think we’ll get there though. In part through posts like this, which make it more salient that there *is* a real thing, and it might even be reachable.
Often the best way to understand why other people do things we view as “irrational” is to notice why we do the same things. Or at least, what the closest thing is that we do, and why we do that.
In this case, you stopped at “I had a desk job”, and labeled that the root cause. You came up with “it’s outside the body!”, and rested on that to justify the halt.
But does “I had a desk job!” not have a cause? Might not the cause dip back into the system with something like “I wanted money, and that was the best paying job available”? Maybe the root cause is greed?
Why do you count “My internal decision making used my internal muscles to orient my neck in a certain way for hours per day” as “external”, but “lack of a high fiber prebiotic diet, the way my body evolved to expect” isn’t an external root cause? Lack of vitamin C is pretty clearly the “root cause” of scurvy, so this seems like a valid type of explanation. And once you notice that “lack of input” counts, even “lack of exogenous opioids” fits your criterion for a root cause of the pain — which it clearly isn’t.
The tree of causality has many branches, and it’s possible to intervene at any of them. The fire was caused by the fuel! No, by the oxygen! No, by the ignition source! Maybe a prebiotic diet would have fixed your stomach issue and things would have been fine. Maybe it would have been insufficient because you’d still have to fix your neck issues, but this criterion couldn’t tell you.
A “root cause” is a cause from which all other symptoms stem. Technically, there is no root as we can always keep going, but we can make pragmatic decisions about where to prune our representations of this tree of causality. Your “external to the body” heuristic, when applied intuitively as you do, is a heuristic. It is not the thing itself, and the “root cause” for a lot of rationality failures has to do with a failure to track this distinction, and giving in to the temptation to stamp our pragmatic decisions with the rational seal of approval.
Or at least, noticing that node does a lot for ya. The tree keeps going.
Although I’m already familiar with reinterpreting pain as something positive, and I’ve even experienced it myself, on reading this post I stopped to do it and I was able to see it from another angle, which was beneficial for me. Thanks for the examples, too.
I’m glad you got something out of it, thanks for sharing :)
What we add is often harmful, for example when there is pain and then we are afraid of it. Therefore, it is beneficial to unlearn what we added. We could at least temporarily add a positive interpretation, or hear the body better with less of our reaction. As a result, you’d be able to experience the sensation without being hindered by it, and you’d be able to experience it better, instead of being dissociated from it.
This isn’t reliable, though
Right, it can be hard to get positive interpretations to stick, or to unstick our reactions.
Part of what I am attempting to convey with this sequence as a whole is a specific move that renders these considerations irrelevant. Or rather, not the interesting part. I wasn’t paying any attention to things like “adding positive interpretations”. It’ll happen if it needs to.
In the case where I hurt my foot, this move looks like… “Yeah it hurts. Of course it hurts of course you don’t like that, yeah yeah, got it. Why does it hurt? What’s actually wrong?”
In the case of my friend’s “irrational” fear further along the sequence, “Yeah yeah you think this fear is dumb. Gotcha. Why are you afraid? What is actually going to happen?”
In the case of helping this guy with his pain, it’s “Of course I can’t ‘just tell him pain can’t be a problem’ and have that help anything. Okay, why? What would he need to see?” – and iterating that over and over.
In each case it’s a redirection, and in a particular direction that lies outside the normal perceptual FOV.
Yes, sometimes this results in listening to our bodies more directly, before our reaction. Sometimes it results in an additional positive interpretation. On a mechanistic level, interaction with him did lead to stuff like that.
The interesting part, though, is that the mechanics worked when he wasn’t trying to make them work, and only after he stopped trying to make them work. And that the reinterpretation stuck. So it went from “too hard to do” to “effortless and stable”.
My own experience mirrors his. I find that the more I try to fix things, even by mechanisms which I know to work and be valid, the harder they are. In a way that isn’t explainable as “Selection bias. You only try hard to fix the thing that didn’t resolve easily”, because sometimes it’s the same problem and it gets easier when I stop trying.
So like.. what is that? How does that work, exactly? What would it look like to intervene at this level, and watch the rest of the puzzle self assemble?
That’s the move I’ve been trying to iterate on, to develop in myself, and to convey in this sequence through performance of the move itself.
I don’t know if you’re looking for anything more right now, but if you are, this frame-breaking move is the thing that has been genuinely transformative for me. It is the thing that is upstream of every success story in this sequence.
Then how does one tell apart the true terminal values and instrumental ones?
I don’t know.
So like, one time I was playing football with my cousins on Thanksgiving, and hurt my foot bad enough that I thought it was broken. As I dug into why reality was different than I wanted, at first there was always something underneath.
“I wish my foot wasn’t hurting. Of course I wish my foot wasn’t hurting, who wouldn’t? I acknowledge the fact that my foot is hurting. Why is it hurting? Because it’s broken, lol.”
So then “I wish my foot wasn’t broken”, why’s it broken? Oh, because I was playing football and tripped. Why’d I trip? Shit just happens, man. So, “Of course shit happens, so of course I broke my foot when I was playing football with my family, so of course it hurts.” What’s the problem? Nothing, actually. Suffering resolved.
In that case, my desire to not be in pain bottomed out at not wanting to have hurt myself unnecessarily, and “shit just happens”, but like… the fact that it was serving as a terminal in this context is still dependent on the fact that I didn’t believe I could do anything about it. If there had been some nootropic that cuts the rate of mistakes to 10%, then “Why did shit happen?” now has an answer of “Because you didn’t improve your brain function, dummy”, and we’re back to the races.
I know how to get to the practical bottom of things, for a given context. But as some sort of general case where we remove “all” practical limits… I dunno man. I’m not sure the question is well formed. I’m not sure it’s not.
Does it mean that the CEV of an individual human
I don’t know where it grounds out, just where it doesn’t. Which is useful on the margin, and maybe even large margins, but in the event of a singularity where limits are removed past the point where we know what to make of it.. it’s still tough to say what that means kinda by definition.
Friston has some insights which seem relevant to me though. He talks about his “theory of every thing” (humorously distinct from “theory of everything”), and explains that every thing that exists must necessarily resist entropic forces towards disintegration or it would cease to exist. So in that sense, everything that exists seeks to maintain its own existence -- including drops of oil in water.
Humans obviously tend to actively maintain their own existence, but also lineages of humans on longer timescales. It’s not clear to me what happens when these things conflict and what “lineage” actually means once transhuman stuff becomes possible.
This is part of why I get a lot out of the word “trust”, which cleaves the second half of that very cleanly
Yeah, that’s a good point. That word does work very well for exactly that. Thanks for reminding me.
I think the reason I don’t tend to use the word “trust” is exactly this though. It works so well that it side steps the problem, which is great if you want to help people see one specific fear differently, and gently dissolve the pattern of confusion as fast as they’re comfortable. But it obscures the issue when you’re going right for the epistemic foundation and trying to undermine the load bearing pillars right from the get go.
“Trust” is probably a better approach in many/most cases, heh. I find the more aggressive version defensible and more fun, but it’s harder to get the trust needed for people to offer their foundations like that. I originally wrote out the stuff in my sequence under the title “Downshift, stupid”—poking at myself for exactly this kind of failure mode :)
Your trust is the map that forms your actual sense of reality. I was going to say “the map that you’re steering by” but that’s incorrect; lots of people who are scared of planes still fly in them.
Yeah, that’s an interesting little wrinkle which I was intentionally avoiding in writing this post.
I think the way I’d describe that is that people trust their beliefs about what they’re supposed to believe more than they actually trust their own object level beliefs. They’re fundamentally navigating a socially constructed reality map rather than using that map to help navigate the underlying reality itself. Which is kinda a huge deal, and everywhere.
But your trust is what you can tell for yourself is so. Someone who thinks they shouldn’t be afraid of planes but still is, is attempting to take someone’s word for it that planes are safe, which doesn’t produce trust. Whereas the same words from someone else can under some circumstances cause someone to notice “huh, I know several people who’ve been in car accidents, but I don’t actually know anybody who’s been in a plane accident” at which point it starts to affect their bodily sense of things.
You can absolutely trust someone’s word, if they convey that their word is worth trusting.
People rarely put themselves on the hook for that though. It’s not just “I shouldn’t be afraid of planes” used to weasel out of having to face “Am I going to be okay?”, it’s also “You shouldn’t be afraid of planes” used to weasel out of having to say “You’re going to be fine. I promise”.
Holding the line on the object level addresses the gravity of the situation head on. You don’t get to dismiss it away and then have an excuse for why it’s not on you that they don’t believe you. You have to sit with their fear, and know “If this person gets hurt because they trusted me and jumped off this cliff, I did that. I hurt them”. No “But statistics, so it was rational” to deflect with, because who decided to trust that the statistics applied here, and what was the result?
So like… how sure are you anyway? Maybe you want them to see it for themselves rather than trusting you, and that’s often wise.
But it’s not strictly necessary.
In practice, ironically, people usually only say “I know X” when they actually don’t trust X but have some reason to think that they should. [...] Trust is what truth feels like in first-person. [...] for interpersonal situations like “Jack would never crash a car” I would still recommend “I trust” over “I know”, for reasons described in 100% confidence + 100% humility
Yep.
The last line points out a substantive difference between “trust” and “know”. When I say that I “know” 2+2=4, it means I’m not even tracking the possibility of being wrong here. When I say I “trust” Jack would never crash the car, but won’t say “know”, it shows that I’m tracking the possibility and accepting the risk anyway. “Trust” really is a good word for this move.
Thanks for the detailed response
But suppose that you are perfectly above all social pressure, and you happen to not like beer at the moment. Is there a good reason for you to choose the reprogramming anyway? Sure, if you do, you will be happy that you did. But if you won’t, you won’t miss it. (Maybe curiosity is the remaining argument for the change? But without the social pressure, there is no reason to be curious about this specific thing.) So it seems to me that there are two different reflectively stable outcomes, and it depends on history where you end up.
In practice, this path dependency thing is indeed important, but it has a lot to do with our tendency to get lost and fail to find coherence.
For example, instead of beer what about heroin? Needles are icky, but boy will that change if you give it a shot! What happens in the longer term isn’t so simple as “more happy” though, and especially when the effects come from exogenous chemicals, we can’t really trust our initial pleasure to cash out in anything real.
This can make heroin very risky because people will often fail to learn that heroin injections are yucky again, but that is where the road coheres to. I don’t have any experience with heroin, but I have tried legally prescribed opioids a couple times and went through this arc. After the first time I couldn’t stop thinking about it for a month because it felt so good. Eventually though, my brain kinda recognized that this is not actually a good thing, I don’t actually want it, and when I eventually tried it again “just because” it wasn’t even enjoyable.
Beer and coffee are a lot more subtle and have context dependent social stuff going on, but “If you do you’ll be glad you did” is far from a sure thing.
We could even go further; with superior self-modification skills I could give myself a completely arbitrary preference, for example a sexual fetish for triangles. It seems silly now, but after modifying myself, and decorating my palace with paintings of triangles, I would probably be happy that I did it. I may even feel a little curious right now about what such absurd situation would be like. So should I modify myself that way?
I mean… I kinda vote “yes”. Because curiosity is important, the things you learn from experience are important, and this is a relatively harmless example to experiment with.
In my experience though, arbitrary modifications like this aren’t very stable. If there isn’t any actual value being delivered, and you don’t try to set up a labyrinth of motivations to not-look, people tend to learn that triangles aren’t so exciting as they had tricked themselves into believing.
All new knowledge is self-modification, but not all self-modification is new (external) knowledge. You can also self-modify by resolving internal tensions. Or by changing a random connection in your brain.
A railroad spike to the brain will indeed modify you, and is not well described as “learning new things!”. But resolving internal tensions usually is. And the railroad spike to the brain/random connection change is generally well described as losing things you have learned, or learning (expected) falsehoods.
Also, if you obtain new knowledge in different order, the later information gets interpreted in the light of the former.
This is another “probably in practice, but only because we don’t reach coherence” thing. Maybe you consider the latter in terms of the former without going back to reinterpret the former in terms of the latter, but Bayes doesn’t justify this failure to propagate updates with any sort of path dependence.
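To spell out the Bayes point, here’s a rough sketch. The symbols are mine, just for illustration: H is whatever you’re forming a belief about, and A and B are two pieces of evidence that could arrive in either order.

```latex
% Update on A first, then on B:
P(H \mid A, B) \;\propto\; P(H)\, P(A \mid H)\, P(B \mid H, A)

% Update on B first, then on A:
P(H \mid B, A) \;\propto\; P(H)\, P(B \mid H)\, P(A \mid H, B)

% Both products equal the same joint likelihood, by the chain rule:
P(A \mid H)\, P(B \mid H, A) \;=\; P(A, B \mid H) \;=\; P(B \mid H)\, P(A \mid H, B)
```

So a fully propagated update lands on the same posterior either way. The order only shows up when the earlier evidence never gets reinterpreted in light of the later evidence.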
The all-knowing Hitler would know that his original reasons for hating Jews are no longer valid, but he might retain an aesthetic preference for doing so, for example because the very emotion of feeling superior to someone feels enjoyable.
My point is that what we think of as inscrutable aesthetic preferences are built upon implicit beliefs about the world, and updating the underlying structure changes the aesthetics. Hitler may like to feel superior, but his potential superiority is itself a fact about reality that he could update on.
What happens when you sit with the question “Are you superior?”?
There’s often an impulse to flinch away, and refuse to update, but when you do, things change.
You’re noticing something really important and really interesting :)
Yep :)
There’s a reason “I expect you to ____” plays both roles.
Oh no, it’s deeper than that.
I figured this stuff out in part by trying to design an optimal temperature controller and noticing that it necessarily applies to anything that tries to do anything.
Friston takes this even further, and points out that everything is trying to do something, because even drops of oil have to resist entropic forces; otherwise they’d disperse into the surrounding water and cease to be distinguishable as a thing. Therefore, it is a theory of “every thing” (not 100% sure this is one of the videos where he makes the pun).
In fact, the harder we try to be more objective, the more we import these same distortions at the meta level. “I’m not irrational dammit! You are!”
There are solutions, and you’re right that this is not one.
The tricky part is that taking it into account cuts against the ability to control, which is the exact thing we don’t want to give up. So the people who have the distortions in their way are going to distort away from what you’re saying the most.
We’re more or less not taking the uncertainty seriously, which is proof positive that we don’t have to.
Notice what comes up when we look at this though. “But there will be consequences!”. Yep.
Declaring what we “have to” do functions as another way to get us out of that uncomfortable uncertainty, huh?
You might like Valentine’s video No need for “should”