The advice and techniques from the rationality community seem to work well at avoiding a specific type of high-level mistake: they help you notice weird ideas that might otherwise get dismissed and take them seriously. Things like AI being on a trajectory to automate all intellectual labor and perhaps take over the world, animal suffering, longevity, cryonics. The list goes on.
This is a very valuable skill and causes people to do things like pivot their careers to areas that are ten times better. But once you’ve had your ~3-5 revelations, I think the value of these techniques can diminish a lot.[1]
Yet a lot of the rationality community’s techniques and culture seem oriented around this one idea, even on small scales: people pride themselves on being relentlessly truth-seeking and willing to consider possibilities they flinch away from.
On the margin, I think the rationality community should put more emphasis on skills like:
Performing simple cost-effectiveness estimates accurately
I think very few people in the community could put together an analysis like this one from Eric Neyman on the value of a particular donation opportunity (see the section “Comparison to non-AI safety opportunities”). I’m picking this example not because it’s the best analysis of its kind, but because it’s the sort of analysis I think people should be doing all the time and should be practiced at, and I think it’s very reasonable to produce things of this quality fairly regularly.
When people do practice this kind of analysis, I notice they focus on Fermi estimates where they get good at making extremely simple models and memorizing various numbers. (My friend’s Anki deck includes things like the density of typical continental crust, the dimensions of a city block next to his office, the glide ratio of a hang glider, the amount of time since the last glacial maximum, and the fraction of babies in the US that are twins).
I think being able to produce specific models over the course of a few hours (where you can look up the glide ratio of a hang glider if you need it) is more neglected but very useful (when it really counts, you can toss the back of the napkin and use a whiteboard).
Simply noticing something might be a big deal is only the first step! You need to decide if it’s worth taking action (how big a deal is it exactly?) and what action to take (what are the costs and benefits of each option?). Sometimes it’s obvious, but often it isn’t, and these analyses are the best way I know of to improve at this, other than “have good judgement magically” or “gain life experience”.
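To make this concrete, here is a minimal sketch (in Python) of the shape such an estimate takes. Every number and variable name below is made up for illustration, not taken from Eric’s analysis; the point is the structure: write each assumption down as a named quantity, then combine them so each one can be challenged separately.
```python
# Hypothetical back-of-the-envelope cost-effectiveness estimate.
# All numbers are invented for illustration; the structure is the point.

donation_usd = 100_000            # size of the donation being evaluated
p_campaign_wins = 0.35            # assumed probability the funded effort succeeds
p_win_counterfactual = 0.05       # assumed chance it would have succeeded anyway
value_if_win_usd = 20_000_000     # assumed value of success, in dollar-equivalents
total_budget_usd = 3_000_000      # assumed total money the effort will raise

# Share of the outcome credited to this donation (crude proportional attribution).
share_of_credit = donation_usd / total_budget_usd

expected_value_usd = (
    (p_campaign_wins - p_win_counterfactual) * value_if_win_usd * share_of_credit
)

print(f"Expected value: ${expected_value_usd:,.0f}")
print(f"Multiplier on the donation: {expected_value_usd / donation_usd:.1f}x")
```
The specific numbers matter much less than being forced to state them.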
Articulating all the assumptions underlying an argument
A lot of the reasoning I see on LessWrong feels “hand-wavy”: it makes many assumptions that it doesn’t spell out. That kind of reasoning can be valuable: often good arguments start as hazy intuitions. Plus many good ideas are never written up at all and I don’t want to make the standards impenetrably high. But I wish people recognized this shortcoming and tried to remedy it more often.
By “articulating assumptions” I mean outlining the core dynamics at play that seem important, the ways you think these dynamics work, and the many other complexities you’re ignoring in your simple model. I don’t mean trying to compress a bunch of Bayesian beliefs into propositional logic.
Contact with reality
It’s really really powerful to look at things directly (read data, talk to users, etc.), design and run experiments, and do things in the world to gain experience.
Everyone already knows this; empiricism is literally a virtue of rationality. But I don’t see people employing it as much as they should be. If you’re worried about AI risk, talk to the models! Read raw transcripts!
Scholarship
Another virtue of rationality. It’s in the Sequences, just not as present in the culture as you might expect. Almost nobody I know reads enough. I started a journal club at my company, and after nearly every meeting folks tell me how useful it is. I see so much work that would be much better if the authors engaged with the literature a little more. Of course YMMV depending on the field you’re in; some literature isn’t worth engaging with.
Being overall skilled and knowledgeable and able to execute on things in the real world
Maybe this doesn’t count as a rationality skill per se, but I think the meta-skill of sitting down and learning stuff and getting good at it is important. In practice, the average person reading this shortform would probably be more effective if they spent their energy developing whatever specific concrete skills and knowledge were most blocking them.
This list is far from complete.[2] I just wanted to gesture at the general dynamic.
They’re still useful. I could rattle off a half-dozen times this mindset let me notice something the people around me were missing and spring into action.
I especially think there’s some skill that separates people with great research taste from people with poor research taste that might be crucial, but I don’t really know what it is well enough to capture it here.
I think very few people in the community could put together an analysis like this one from Eric Neyman on the value of a particular donation opportunity (see the section “Comparison to non-AI safety opportunities”).
Huh, FWIW, I thought this analysis was quite a classic example of streetlighting. It succeeded at quantifying some things related to the donation opportunity at hand, but it failed to cover the ones I considered most important. This seems like the standard failure mode of this kind of estimate, and I was quite sad to see it here.
Like, the most important thing to estimate when evaluating a political candidate is their trustworthiness and integrity! It’s the thing that would flip the sign on whether supporting someone is good or bad for the world. The model is silent on this point, and indeed, when I talked to many others about it, it weirdly seemed to serve as a semantic stopsign against asking the much more important questions about the candidate.
Like, I am strongly in favor of making quick quantitative models, but I felt like this one missed the target. I mean, like, it’s fine, I don’t think it was a bad thing, but at least various aspects of how it was presented made me think that Eric and others believe this might come close to capturing the most important considerations, as opposed to being a thing that puts some numbers on some second-order considerations that maybe become relevant once the more important questions are answered.
ETA: I think this comment is missing some important things and I endorse Habryka’s reply more than I endorse this comment.
Like, the most important thing to estimate when evaluating a political candidate is their trustworthiness and integrity! It’s the thing that would flip the sign on whether supporting someone is good or bad for the world.
I agree that this is an important thing that deserved more consideration in Eric’s analysis (I wrote a note about it on Oct 22 but then I forgot to include it in my post yesterday). But I don’t think it’s too hard to put into a model (although it’s hard to find the right numbers to use). The model I wrote down in my note is:
30% chance Bores would oppose an AI pause / strong AI regulations (b/c it’s too “anti-innovation” or something)
40% chance Bores would support strong regulations
30% chance he would vote for strong regulations but not advocate for them
90% chance Bores would support weak/moderate AI regulations
My guess is that 2⁄3 of the EV comes from strong regulations and 1⁄3 from weak regulations (which I just came up with a justification for earlier today, but it’s too complicated to fit in this comment), so these considerations reduce the EV to 37% of my original estimate (i.e., they roughly divide the EV by 3).
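For concreteness, here is a minimal sketch of arithmetic that reproduces the ~37% figure, assuming opposing strong regulations counts as −1 toward the strong-regulation EV, supporting them as +1, and voting-without-advocating as 0; that scoring is an assumption on my part, not something stated above.
```python
# A guessed reconstruction of the ~37% figure; the outcome scoring below
# (oppose = -1, support = +1, vote-but-not-advocate = 0) is an assumption.

p_oppose_strong = 0.30
p_support_strong = 0.40
p_vote_not_advocate = 0.30  # scored as contributing nothing either way
p_support_weak = 0.90

weight_strong = 2 / 3  # share of total EV attributed to strong regulations
weight_weak = 1 / 3    # share attributed to weak/moderate regulations

ev_fraction = (
    weight_strong * (p_support_strong - p_oppose_strong)
    + weight_weak * p_support_weak
)
print(f"{ev_fraction:.0%}")  # prints 37%
```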
FWIW I wouldn’t say “trustworthiness” is the most important thing, more like “can be trusted to take AI risk seriously”, and my model is more about the latter. (A trustworthy politician who is honest about the fact that they don’t care about AI safety will not be getting any donations from me.)
FWIW I wouldn’t say “trustworthiness” is the most important thing, more like “can be trusted to take AI risk seriously”, and my model is more about the latter.
No. Bad. Really not what I support. Strong disagree. Bad naive consequentialism.
Yes, of course I care about whether someone takes AI risk seriously, but if someone is also untrustworthy, in my opinion this serves as a multiplier of their negative impact on the world. I do not want to create scheming and untrustworthy stakeholders that start doing sketchy stuff around AI risk. That’s how really a lot of bad stuff in the past has already happened.
I think political donations to trustworthy and reasonable politicians who are open to AI X-risk but don’t have an opinion on it are much better for the world (indeed, infinitely better due to the inverted sign) than donations to untrustworthy ones who do seem interested.
That said, I agree that you could put this in the model! I am not against quantitatively estimating integrity and trustworthiness, and think the model would be a bunch better for considering it.
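As a minimal sketch of what folding this in could look like: treat trustworthiness as a probability, and model the untrustworthy case as producing proportional harm rather than zero value, so the sign of the estimate can flip. The functional form and the numbers here are illustrative assumptions, not anyone’s actual model.
```python
# Illustrative only: trustworthiness as a sign-flipping multiplier on EV.
# The functional form and the numbers are assumptions for illustration.

def adjusted_ev(base_ev: float, p_trustworthy: float,
                harm_ratio_if_untrustworthy: float = 1.0) -> float:
    """EV after weighing the chance the candidate is untrustworthy.

    An untrustworthy actor is modeled as doing harm proportional to the
    same base EV (they wield the influence, but in sketchy ways).
    """
    return (p_trustworthy * base_ev
            - (1 - p_trustworthy) * harm_ratio_if_untrustworthy * base_ev)

print(f"{adjusted_ev(base_ev=1.0, p_trustworthy=0.9):+.2f}")  # +0.80, still worth it
print(f"{adjusted_ev(base_ev=1.0, p_trustworthy=0.4):+.2f}")  # -0.20, sign flips
```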
Yes, of course I care about whether someone takes AI risk seriously, but if someone is also untrustworthy, in my opinion this serves as a multiplier of their negative impact on the world. I do not want to create scheming and untrustworthy stakeholders that start doing sketchy stuff around AI risk. That’s how really a lot of bad stuff in the past has already happened.
No-true-Scotsman-ish counterargument: no-one who actually gets AI risk would engage in this kind of tomfoolery. This is the behavior of someone who almost got it, but then missed the last turn and stumbled into the den of the legendary Black Beast of Aaargh. In the abstract, I think “we should be willing to consider supporting literal Voldemort if we’re sure he has the correct model of AI X-risk” goes through.
The problem is that it just totally doesn’t work in practice, not even on pure consequentialist grounds:
You can never tell whether Voldemorts actually understand and believe your cause, or whether they’re just really good at picking the right things to say to get you to support them. No, not even if you’ve considered the possibility that they’re lying and you still feel sure they’re not. Your object-level evaluations just can’t be trusted. (At least, if they’re competent at their thing. And if they’re not just evil, but also bad at it, so bad you can tell when they’re being honest, why would you support them?)
Voldemorts and their plans are often more incompetent than they seem,[1] and when their evil-but-“effective” plan predictably blows up, you and your cause are going to suffer reputational damage and end up in a worse position than your starting one. (You’re not gonna find an Altman, you’ll find an SBF.)
Voldemorts are naturally predisposed to misunderstanding the AI risk in precisely the ways that later make them engage in sketchy stuff around it. They’re very tempted to view ASI as a giant pile of power they can grab. (They hallucinate the Ring when they look into the Black Beast’s den, if I’m to mix my analogies.)
In general, if you’re considering giving power to a really effective but untrustworthy person because they seem credibly aligned with your cause, despite their general untrustworthiness (they also don’t want to die to ASI!), you are almost certainly just getting exploited. These sorts of people should be avoided like wildfire. (Even in cases where you think you can keep them in check, you’re going to have to spend so much effort paranoidally looking over everything they do in search of gotchas that it almost certainly wouldn’t be worth it.)
Probably because of that thing where if a good person dramatically abandons their morals for the greater good, they feel that it’s a monumental enough sacrifice for the universe to take notice and make it worth it.
A lot of Paranoia: A Beginner’s Guide is actually trying to set up a bunch of the prerequisites for making this kind of argument more strongly. In particular, a feature of people who act in untrustworthy ways, and surround themselves with unprincipled people, is that they end up sacrificing most of their sanity on the altar of paranoia.
Like, the fictional HPMoR Voldemort happened not to have any adversaries who could disrupt his OODA loop, but that was purely a fiction. A world with two Voldemort-level competent players results in two people nuking their sanity as they try to get one over on each other, and at that point you can’t really rely on them having good takes, or sane stances on much of anything (or, if they are genuinely smart enough, on them making an actually binding alliance, which via things like unbreakable vows is surprisingly doable in the HPMoR universe, but which in reality runs into many more issues).
Tone note: I really don’t like people responding to other people’s claims with content like “No. Bad… Bad naive consequentialism” (I’m totally fine with “Really not what I support. Strong disagree.”). It reads quite strongly to me as trying to scold someone or socially punish them using social status for a claim that you disagree with; it feels continuous with some kind of frame that’s like “habryka is the arbiter of the Good”.
It sounds like scolding someone because it is! Like, IDK, sometimes that’s the thing you want to do?
I mean, I am not the “arbiter of the good”, but like, many things are distasteful and should be reacted to as such. I react similarly to people posting LLM slop on LW (usually more in the form of “wtf, come on man, please at least write a response yourself, don’t copy paste from an LLM”) and many other things I see as norm violations.
I definitely consider the thing I interpreted Michael to be saying a norm violation of LessWrong, and endorse lending my weight to norm enforcement of that (he then clarified in a way that I think largely defused the situation, but I think I was pretty justified in my initial reaction). Not all spaces I participate in are places where I feel fine participating in norm enforcement, but of course LessWrong is one such place!
Now, I think there are fine arguments to be made that norm enforcement should also happen at the explicit intellectual level and shouldn’t involve more expressive forms of speech. IDK, I am a bit sympathetic to that, but feel reasonably good about my choices here, especially given that Michael’s comment started with “I agree”, therefore implying that the things he was saying were somehow reflective of my personal opinion. It seems eminently natural that when you approach someone and say “hey, I totally agree with you that <X>”, where X is something they vehemently disagree with (like, IDK, imagine someone coming to you and saying “hey, I totally agree with you that child pornography should be legal” when you absolutely do not believe this), they respond in the kind of way I did.
Overall, feedback is still appreciated, but I think I would still write roughly the same comment in a similar situation!
Michael’s comment started with “I agree”, therefore implying that the things he was saying were somehow reflective of my personal opinion
Michael’s comment started with a specific point he agreed with you on.
I agree that this is an important thing that deserved more consideration in Eric’s analysis
He specifically phrased the part you were objecting to as his opinion, not as a shared point of view.
FWIW I wouldn’t say “trustworthiness” is the most important thing, more like “can be trusted to take AI risk seriously”, and my model is more about the latter.
I am pretty sure Michael thought he was largely agreeing with me. He wasn’t saying “I agree this thing is important, but here is this totally other thing that I actually think is more important”. He said (and meant to say) “I agree this thing is important, and here is a slightly different spin on it”. Feel free to ask him!
I claim you misread his original comment, as stated. Then you scolded him based on that misreading. I made the case you misread him via quotes, which you ignored, instead inviting me to ask him about his intentions. That’s your responsibility, not mine! I’d invite you to check in with him about his meaning yourself, and to consider doing that in the future before you scold.
I mean, I think his intention in communicating is the ground truth! I was suggesting his intentions as a way to operationalize the disagreement. Like, I am trying to check that you agree that if that was his intention, and I read it correctly, then you were wrong to say that I misread him. If that isn’t the case then we have a disagreement about the nature of communication on our hands, which, I mean, we can go into, but doesn’t sound super exciting.
I do happen to be chatting with Michael sometime in the next few days, so I can ask. Happy to bet about what he says about what he intended to communicate! Like, I am not overwhelmingly confident, but you seem to present overwhelming confidence, so presumably you would be up for offering me a bet at good odds.
FWIW I think Habryka was right to call out that some parts of my comment were bad, and the scolding got me to think more carefully about it.
I would generally agree, but a mitigating factor here is that MichaelDickens is presenting himself as agreeing with habryka. It seems more reasonable for habryka to strongly push back against statements that make claims about his own beliefs.
Yeah I pretty much agree with what you’re saying. But I think I misunderstood your comment before mine, and the thing you’re talking about was not captured by the model I wrote in my last comment; so I have some more thinking to do.
I didn’t mean “can be trusted to take AI risk seriously” as “indeterminate trustworthiness but cares about x-risk”, more like “the conjunction of trustworthy + cares about x-risk”.
Fair enough. This doesn’t seem central to my point so I don’t really want to go down a rabbit-hole here. As I said originally, “I’m picking this example not because it’s the best analysis of its kind, but because it’s the sort of analysis I think people should be doing all the time and should be practiced at, and I think it’s very reasonable to produce things of this quality fairly regularly.” I know this particular analysis surfaced some useful considerations others hadn’t thought of, and I learned things from reading it.
I also suspect you dislike the original analysis for reasons that stem from deep-seated worldview disagreements with Eric, not because the methodology is flawed.
I also suspect you dislike the original analysis for reasons that stem from deep-seated worldview disagreements with Eric, not because the methodology is flawed.
I think a large chunk of the deep-seated worldview disagreement is this very methodology: elevating cost-effectiveness estimates that thereby (usually, at least at a community level) produce lots of naive consequentialist choices!
I actually think I probably have it less with Eric than with other people, but I think the disagreement here is at least not uncorrelated with the worldview divergence.
I know this particular analysis surfaced some useful considerations others hadn’t thought of, and I learned things from reading it.
Agree! I am glad to have read it and wish more people produced things like it. It’s also not particularly high on my list of things to strongly incentivize, but it’s nice because it scales well, and lots of people doing more things like this seems like it just makes things a bit better.
My only sadness about it comes from the context in which it was produced. It seems eminently possible to me to have a culture of producing these kinds of estimates without failing to engage with the most important questions (or like, to include them in your estimates somehow), but I think it requires at least a bit of intentionality, and in the absence of that does seem like a bit of a trap.
Is there reason to think that Bores or Wiener are not trustworthy or lack integrity? Genuine question, asking because it could affect my donation choices. (I couldn’t tell from your post if there were, e.g., rumors floating around about them, or if you were just using this as an example of a key question that you thought was missed in Neyman’s analysis.)
I mean, I think there are substantial priors that trustworthiness or lack of integrity differ quite a lot between different politicians.
That said, I overall had reasonably positive impressions after talking to Bores in-person. I… did feel a bit worried he was a bit too naive consequentialist, but various other things he said made me overall think he is a good person to donate to. But I am glad I talked to him since I was pretty uncertain before I did.
For “Performing simple cost-effectiveness estimates accurately”, I would like to be better at this but I feel like I’m weak on some intermediate skills. I’d appreciate a post laying out more of the pieces.
(A thing I find hard is somewhat related to the thing habryka is saying, where the real crux is often a murky thing that’s particularly hard to operationalize. Although in the case of the Eric Neyman thing, I think I separately asked those questions, and found Eric’s BOTEC useful for the thing it was trying to do)
(1) Thanks for writing this!
(2) Mind spelling out a few more items?
(3) Consider posting this as a top-level post.