Max H’s Shortform

Max H13 May 2023 0:17 UTC

5 points

38 comments1 min readLW link

Max H 13 Feb 2026 23:35 UTC
62 points
23
Peter Thiel pointed out that the common folk wisdom in business that you learn more from failure than success is actually wrong—failure is overdetermined and thus uninteresting.
I think you can make an analogous observation about some prosaic alignment research—a lot of it is the study of (intellectually) interesting failures, which means that it can make for a good nerdsnipe, but it’s not necessarily that informative or useful if you’re actually trying to succeed at (or model) doing something truly hard and transformative.
Glitch tokens, the hot mess work, and various things related to jailbreaking, simulators, and hallucinations come to mind as examples of lines of research and discussion that an analogy to business failure predicts won’t end up being centrally relevant to real alignment difficulties. Which is not to say that the authors of these works are claiming that they will be, nor that this kind of work can’t make for effective demonstrations and lessons. But I do think this kind of thing is unlikely to be on the critical path for trying to actually solve or understand some deeper problems.
Another way of framing the observation above is that it is an implication of instrumental convergence: without knowing anything about its internals, we can say confidently that an actually-transformative AI system (aligned or not) will be doing something that is at least roughly coherently consequentialist. There might be some intellectually interesting or even useful lessons to be learned from studying the non-consequentialist / incoherent / weird parts of such a system or its predecessors, but in my frame, these parts (whatever they end up being), are analogous to the failures and missteps of a business venture, which are overdetermined if the business ultimately fails, or irrelevant if it succeeds.
- TsviBT 14 Feb 2026 21:47 UTC
  10 points
  0
  Parent
  I agree with this literally, but I’d want to add what I think is a significant friendly amendment. Successes are much more informative than failures, but they are also basically impossible. You have to relax your criteria for success a lot to start getting partial successes; and my impression is that in practice, “partial successes” in “alignment” are approximately 0 informative.
  
  If we have to retreat from successes to interesting failures, I agree this is a retreat, but I think it’s necessary. I agree that many/most ways of retreating are quite unsatisfactory / unhelpful. Which retreats are more helpful? Generally I think an idea (the idea?) is to figure out highly general constraints from particular failures. See here https://tsvibt.blogspot.com/2025/11/ah-motiva-3-context-of-concept-of-value.html#why-even-talk-about-values and especially the advice here https://www.lesswrong.com/posts/rZQjk7T6dNqD5HKMg/abstract-advice-to-researchers-tackling-the-difficult-core#Generalize_a_lot :
  
  When an idea or proposal fails, try to generalize far. Draw really wide-ranging conclusions.
  
  Also cf. here (https://www.lesswrong.com/posts/K4K6ikQtHxcG49Tcn/hia-and-x-risk-part-2-why-it-hurts#Alignment_harnesses_added_brainpower_much_less_effectively_than_capabilities_research_does), quoting the relevant part in full:
  
  In alignment, on the other hand, you have to understand each constraint that’s known in order to even direct your attention to the relevant areas. This is analogous to the situation with the P vs. NP , where whole classes of plausible proof strategies are proven to not work. You have to understand most of those constraints; otherwise by default you’ll probably be working on e.g. a proof that relativizes and therefore cannot show P≠NP. Progress is made by narrowing the space, and then looking into the narrowed space.
- Oliver Daniels 16 Feb 2026 12:11 UTC
  2 points
  0
  Parent
  This is also a great explanation for why it’s hard to publish negative results
  - gwern 17 Feb 2026 6:14 UTC
    7 points
    0
    Parent
    Depends on the field, at best. In the psychology Replication Crisis, this was one of the classic excuses to not publish failures-to-replicate: “we did it right, so you must just have done it wrong; so it’s good that you can’t get published and no one will cite you even if you do. You’d just pollute the literature and distract from our important success.” Of course, it turns out that even if you involve the original experimenters in the followup to sign off with their magic touch, it doesn’t replicate once you lock down the analysis and get a proper sample size.
    - Oliver Daniels 17 Feb 2026 12:09 UTC
      2 points
      0
      Parent
      year fair, the particular kind of negative result I had in mind was
      
      “we tried some novel ML/interp technique and it didn’t work”
      
      which, to your point, is indeed a very particular kind of negative result
- faul_sname 15 Feb 2026 0:39 UTC
  2 points
  0
  Parent
  
  we can say confidently that an actually-transformative AI system (aligned or not) will be doing something that is at least roughly coherently consequentialist.
  
  I don’t think we can confidently say that. If takeoff looks like more like a cambrian explosion than like a singleton (and that is how I would bet), that would definitely be transformative but the transformation would not be the result of any particular agent deciding what world state is desirable and taking actions intended to bring about that world state.
- quetzal_rainbow 14 Feb 2026 9:57 UTC
  2 points
  0
  Parent
  Studying failures is useful because they highlight non-obvious internal mechanism, while successes are usually about thing working as intended and therefore not requiring explanation.
  
  Another problem is that we don’t have examples of successes, because every measureable alignment success can be a failure in disguise.
- Knight Lee 14 Feb 2026 1:18 UTC
  2 points
  0
  Parent
  ~~I agree with the idea of failure being overdetermined.~~
  But another factor might be that those failures aren’t useful because they relate to current AI. Current AI is very different from AGI or superintelligence, which makes both failures and successes less useful...
  ~~Though I know very little about these examples :/~~
  Edit: I misread, Max H wasn’t trying to say that successes are more important to failures, just that failures aren’t informative.
  - Raemon 14 Feb 2026 2:12 UTC
    5 points
    2
    Parent
    Yeah, but, there’s already a bunch of arguments about whether prosaic ML alignment is useful (which people have mostly decided whatever they believe about) and the OP is interesting because it’s a fairly separate reason to be skeptical about a class of research.
- lilkim2025 14 Feb 2026 2:30 UTC
  1 point
  0
  Parent
  The failure of an interesting hypothesis is informative as long as you understand why it doesn’t work, and can better model how the thing you’re studying works. The difference between CS research and business is that business failures can sort of “come out of nowhere” (“Why isn’t anyone buying our product?” can’t really be answered), whereas, if you look closely enough at the models, you can always learn something from the failure of something that should’ve worked but didn’t.
Max H 2 Apr 2026 23:27 UTC
23 points
4
There was some discussion recently about the uptick in object-level politics posts and whether this is desirable or not. There’s no rule against discussing politics on LW, but there is a weak norm against it, and topical discussions have historically tended to be somewhat meta and circumspect.
I think the current situation is basically fine, and it’s normal for amount of politics discussion to ebb and flow naturally as people are interested and issues become particularly salient. That said, here are a couple of potentially overlooked reasons in favor of more object-level politics discussion:
1. To build skill. Discussing politics productively is a skill that requires practice and atrophies without use. “Politics is the mindkiller” never meant that you should not discuss politics at all; it means that discussing politics is playing on hard mode. But sometimes playing on hard mode is the best way to level up. I suspect the skills needed to discuss politics productively overlap with lots of other important rationality skills.
2. To create common knowledge and avoid conflationary alliances. It can be confusing or disconcerting to not know what kind of common background assumptions the commentariat takes for granted when discussing something politics-adjacent. This is a problem somewhat unique to LW and not necessarily a bad thing; most other places on the internet skew too far in the opposite direction of a hivemind with a loud set of background assumptions, e.g. everyone putting emojis in their username to mark their alliances. But sometimes it is helpful to establish what does and doesn’t go without saying.
- Max H 2 Apr 2026 23:29 UTC
  6 points
  1
  Parent
  So, with all the above said as a preface, one object-level topic I’d be interested in seeing more discussion of is the current situation(s) in the Middle East. Some thoughts:
  IMO the high bit for whether the war in Iran is broadly good is whether it is tactically successful and efficient in the short term.
  Considerations like “this weakens the US position in a hypothetical hot war against China or Russia” or “this will (further) destabilize the Middle East in the long term” seem second-order to whether the war successfully neutralizes an organized, well-equipped, fanatical adversary for a long period of time, or fails to do that and also depletes a bunch of expensive / difficult-to-replace munitions stockpiles.
  But how well or poorly things are going on this front is difficult to determine given fog of war and motivated reasoning + propaganda on all sides, and also less interesting for pundits to discuss vs. strategic and geopolitical implications. Prediction markets seem to being doing OK here, but I would be interested in hearing more analysis from the LW commentariat.
  - Sodium 3 Apr 2026 4:04 UTC
    11 points
    9
    Parent
    I think the most important effect of the war is that it makes Trump less popular/powerful domestically (even if a miracle happens and he gets some sort of deal.) This is good because the less power he has (e.g., Republicans lose the senate in the midterms), the more likely we are to navigate AI development in a sane way. I think if you put ~~any~~ nontivial*weight in short timelines, the AI considerations likely dominate everything else.
    
    *edited any to nontrivial. Like, maybe 10%+ pre-Jan 2029
    - Max H 3 Apr 2026 4:45 UTC
      3 points
      −3
      Parent
      the more likely we are to navigate AI development in a sane way. I think if you put any weight in short timelines, the AI considerations likely dominate everything else.
      I don’t think we’re particularly on track to do anything non-derpy w.r.t to AI either way, but this way of reasoning seems like somewhat naive consequentialism. In general, it’s good for good things to happen even if they are accomplished by bad people, and predicting second-order consequences is really hard.
      Also, there are a lot of bad people in power, but for AI to go well, a lot of good things need to happen to allow humanity space and time to flourish in peace. Toppling (or militarily crippling) a fanatical Shia Islamist regime would be an extremely good thing; a bad outcome (which looks somewhat likely at this point) would be if Europe and the third world broadly give in to extortion and pay Iran a toll to pass through the strait. That toll would fund terror all over the world, and would signal to other would-be dictators and future AIs alike that they can successfully take whatever they want through force and threats, and half the world will just roll over and take it.
      - Sodium 3 Apr 2026 5:04 UTC
        7 points
        6
        Parent
        this way of reasoning seems like somewhat naive consequentialism.
        Maybe? It is hard to reason well about these things given my strong emotions towards the admin.
        But I do think the current administration is uniquely terrible by American standards.^[1] It attracts and gives power to incompetent sycophants with no moral boundaries.
        There was something Eliezer said about Bernie Sanders recently that really resonated with me recently:
        [T]hank you also for consistently trying to do as seems right to you over the years, a stance that has grown on me as I have had more chance to witness its alternatives.
        Having Trump as the president really just seems like it would be terrible for AGI governance because he is a terrible person. I’m sorry, I really don’t think there’s a more “precise” way to put it. Character matters. Trump doesn’t even pretend to be a kind person/is not under much pressure to appear to be nice.
        (To be clear, I agree that, all else equal, it would be good for the Iranian regime to fail. Alas, all else would not be equal. While I think it would definitely be bad for your soul^[2] to do things in the realm of “sabotage the American economy/military operation in order to make our president look bad,” I don’t think I’m obligated to stop my enemy when he is making a mistake either.)
        ^
        Although even by global standards it’s quite bad.
        ^
        i.e., you should not do this.
        Harjas Sandhu 3 Apr 2026 16:05 UTC
        4 points
        1
        Parent
        Re character: I think most Americans (including myself) have been so far removed from true corruption that we have forgotten how bad it can possibly get. Even my state of Illinois, which is notable for its historical machine politics and general corruption (4 of our 11 last governors serving time + many others like Mike Madigan), has still more or less seen forward progress, because the corruption wasn’t bad enough to completely erode politics in the state.
        But it CAN get that bad. We’re seeing this now with the Trump admin. I am generally left-leaning, but at this point I think I’d take an honest Republican over a corrupt Democrat—a position I did not hold previously—because corruption eats policy and utterly erodes the foundation upon which we build fair markets and strong institutions.
    - kbear 3 Apr 2026 14:20 UTC
      1 point
      0
      Parent
      can’t you play the same game in the other direction?
      Trump is bad for the usa, therefore we should want him in power since the big labs depend on a wealthy usa.
  - Thomas Kwa 3 Apr 2026 2:58 UTC
    10 points
    2
    Parent
    The US seems to be in a rough spot. Polymarket thinks:
    67% for US forces enter Iran by end of April
    37% for Strait of Hormuz traffic returns to normal by end of May
    34% for Iran leadership change in 2026
    29% for Iran to no longer control Kharg Island by end of June
    40% for Trump to announce end of military operations by end of April, and 77% by end of June
    Assuming no regime change, the US’s objectives are
    opening the Strait of Hormuz
    removing Iran’s progress towards a nuclear weapon
    removing various other military capabilities of Iran and its proxies
    There is only a 34% chance of leadership change. Maybe only 20% of regime change. In the other 80% or so, forcibly opening the Strait seems rough. Experts are pessimistic about US easily taking Kharg Island, and even if the US controls both Kharg (Iran’s export base north of the strait) and other islands like Qeshm (the island in the strait with the largest Iranian military presence), it will probably suffer tens or hundreds of casualties while Iran can still threaten shipping with Shaheds, sea drones, speedboats, and mines. In the median case it seems like the Strait will open sometime between May and December but Iran will have some leverage, possibly extending the toll regime.
    Iran losing their existing enriched uranium seems contingent on a deal, because the US plan to build a runway 300 miles inland, use cargo planes to land excavation equipment, invade Iranian bunkers over the course of a week, hope that the uranium is intact, easy to find, and not booby-trapped, distinguish it from decoys, put the uranium in storage casks, and fly it out would be difficult if this were 2003 Iraq when the US had air ~~superiority~~ supremacy. It is just not compatible with how warfare works in Iran in the drone era.
    Claude thinks it’s only 20% to work, which seems optimistic
    Getting there: Isfahan is more than 480 km (300 miles) inland, hundreds of kilometers from the nearest US naval assets. Al Jazeera The US has moved 82nd Airborne, 101st Airborne, Army Rangers, and Marine Expeditionary Units to the region. Forces would need to be inserted by air — there’s no overland route from a friendly staging area.
    Securing the site: Recovering the uranium would require a significant number of ground troops beyond a small special operations footprint — dozens if not hundreds of additional troops to support the core team. They would need to secure the facilities under potential missile and drone fire and maintain a perimeter for the duration. CNN
    The actual extraction: Airstrikes alone can’t penetrate the Isfahan tunnels because the facility doesn’t have ventilation shaft openings that serve as weak points at other nuclear sites. CNN This means physically entering and digging through rubble. A former special operator trained for such missions described it as “slow, meticulous and can be an extremely deadly process.” Another former defense official said it’s like “you’re not just buying a car on the lot, you’re buying the entire assembly line.” The Hill
    Getting it out: The cylinders would need to be transferred into accident-rated transport casks by specially trained SOF personnel with nuclear materials handling experience. The cargo could fill several trucks, and a temporary airfield would likely need to be improvised. The full operation could run for a week. Israel Hayom
    Force protection throughout: There would need to be constant close air support, satellite coverage, and every spectrum of warfare capability to keep Iranian forces away from the site while JSOC and other agencies methodically excavate and retrieve the material. The Hill
    My probability estimate
    I’d put the chances of a successful physical extraction of most of the enriched uranium at roughly 15-25%. Here’s my reasoning:
    The operation is technically feasible — the US military can do extraordinary things — but the risk profile is extreme for what may be an unnecessary objective
    Trump himself has wavered, on March 31 suggesting the uranium is “so deeply buried” and “pretty safe” — seeming to lower its priority Foreign Policy
    Senior military planners are reportedly skeptical: “I don’t see any senior planning military officer pursuing this,” one former defense official said Al Jazeera
    The political environment (Polymarket’s 77% for operations ending by June, plus low public appetite for ground troops) creates pressure to wrap up, not escalate
    It looks like the US is at least succeeding at destroying the Iranian military, but it’s unclear what this buys them. Drones are really cheap, so Iran will probably always have those. Therefore I think regime change is necessary for the US to come out ahead.
    - lilkim2025 3 Apr 2026 3:24 UTC
      6 points
      0
      Parent
      As best I can tell, the conundrum is that Trump, the international economy, and American voters all want America to be out of the conflict soon, but Israel does not want this, and Israel has outsized influence in not just how American political incentives are determined, but in what information is presented to Trump and other key officials.
      A lot of the claims I’ve seen Trump make about the war are clearly false, but not false in a way that he would benefit from lying deliberately. I realize a conspiracy to feed false information to the American executive to keep the war going sounds like a radical possibility, but there is precedent for it.
    - Max H 3 Apr 2026 3:40 UTC
      4 points
      −3
      Parent
      Iran can still threaten shipping with Shaheds, sea drones, speedboats, and mines.
      Right, but they can’t threaten land targets with missiles, apparently? IDK how reliable these sources are or how to interpret them in context:
      https://www.csis.org/analysis/assessing-air-campaign-after-three-weeks-iran-war-numbers
      https://understandingwar.org/research/middle-east/iran-update-special-report-march-27-2026/
      But the basic picture seems to be that their capacity to launch missiles has already fallen off dramatically. They’re still launching a lot of drones, which have a big cost asymmetry in how easy they are to launch vs. intercept, and they make any land or sea incursions extremely dicey. But they are limited in range and destructive capability against properly fortified targets.
      I agree that things don’t look promising for a ground invasion or taking control of the strait. But I’m less sure how militarily sustainable a long stand-off is. The strait being closed is economically and politically painful (for everyone), but in the meantime it seems like the US and Israel can continue launching targeted air strikes and Iran can’t really strike back effectively.
      - ACCount 3 Apr 2026 17:14 UTC
        1 point
        0
        Parent
        Keep in mind that a lot of targets are not “properly fortified”, be that infrastructure or military facilities, and suicide drones are much harder to hunt down than ballistic missile TELs.
        Modern ISR can perform well in a “Scud hunt” scenario, but “Shahed hunt” is a much worse match up.
    - ACCount 3 Apr 2026 12:16 UTC
      1 point
      0
      Parent
      probably suffer tens or hundreds of casualties
      Seems excessive? A sizeable fraction of the entire Iraq campaign losses, for seizing a single island in an environment where US has sea control, air supremacy and an edge in ISR.
      US may struggle to use the island, because of the hard-to-eliminate threat of long range strikes from Iran. But seizing it to deny it to the regime seems like a war goal that could be accomplished with a relatively minor effort.
      - Thomas Kwa 3 Apr 2026 17:29 UTC
        4 points
        0
        Parent
        Iraq was 32,000 wounded and 4,400 killed, and the US has already suffered hundreds of wounded and 13 deaths in the existing Iran campaign without any ground operations. I’m imagining 100 wounded and maybe another 20 KIA if the US holds Kharg for an extended period, not hundreds of KIA.
        The issue is it’s not really true that the US has air supremacy. Kharg Island is within fiber FPV range of the mainland, and real-time ISR is not required for Iran to track static targets on the island. Plus Iran is still able to launch larger drones and the occasional missile. So holding Kharg really means denying drone launch points on a ~20 mile stretch of the coast, which for FPVs can just be two guys in a bunker.
        The incentive for Iran is enormous given US’s low tolerance for casualties; it’s well worth it to launch 20 $1,500 drones to kill one American.
  - simon 3 Apr 2026 2:09 UTC
    9 points
    7
    Parent
    IMO this is going (predictably) disastrously. Air power is not effective at causing regime change (rally-around-the-flag effect). I think the Iranian public are more likely to mainly blame the guy explicitly saying “we’re going to bring them back to the stone ages where they belong” than the local leadership. It also seems to me that the Iranian leadership would be highly motivated to immediately rebuild any degraded capabilities after the war, in order to rebuild deterrence against future attacks.
    There is some talk about a land invasion, but taking an island or two (even Kharg) probably wouldn’t compel them to surrender, while also being highly vulnerable both directly and in terms of logistics to drone attacks; and a full scale invasion would be a massive undertaking and probably not politically feasible (for good reason).
  - XelaP 3 Apr 2026 12:57 UTC
    1 point
    0
    Parent
    On meta: I’d say the main reason I want some middle east content on lesswrong is that there seem to be lots of relatively more concrete facts about the world that are fundamental to models that I don’t know, and also I don’t know what those facts even are. I cannot even tell you much that is different between, say, Iraq and Iran, or Saudi Arabia, except that probably 2 out of 3 of those are allied against the third?
- Eli Tyre 3 Apr 2026 19:23 UTC
  3 points
  0
  Parent
  I think it might be cool if LessWrong had a well developed set of norms for discussing political topics, in particular, these norms were legible, and mods made a point to enforce them.
  Politics posts should be tagged as such, and maybe all have a big warning at the top linking to a post outlining our expected norms and standards for discussing politics, and moderation thresholds. This is both a warning to those from other parts of the internet who don’t share our epistemic ideals, and a warning to LessWrongers who don’t want to wade into this stuff.
Max H 19 Oct 2025 19:10 UTC
22 points
−29
Rationality should not be painful.
Putting the lessons of the Sequences into practice, reflecting on and mentally rehearsing the core ideas, making them your own and weaving them into your everyday habits of thought and action until they become a part of you—at no point should any of this cause an increase in mental anguish, emotional vulnerability, depression, psychosis, mania etc., even temporarily. The worst-case consequences of absorbing these lessons should be that you regret some of your past life choices or perhaps come to realize that you’re stuck in a bad situation that you can’t easily change. But rationality should also leave you strictly better-equipped to deal with that situation, if you find yourself in it.
Also, the feeling of successfully becoming more rational should not feel like a sudden, tectonic shift in your mental processes or beliefs (in contrast to actually changing your mind about something concrete, which can sometimes feel like that). Rationality should feel natural and gradual and obvious in retrospect, like it was always a part of you, waiting to be discovered and adopted.
I am using “should” in the paragraphs above both descriptively and normatively. It is partly a factual claim: if you’re not better off, you’re probably missing something or “doing it wrong”, in some concrete, identifiable way. But I am also making a normative / imperative statement that can serve as advice or a self-fulfilling prophecy of sorts—if your experience is different or you disagree, consider whether there’s a mental motion you can take to make it true.
I am also not claiming that the Valley of Bad Rationality is entirely fake. But I am saying it’s not that big of a deal, and in any case the best way out is through. And also that “through” should feel natural / good / easy.
I am not very interested in meditation or jhanas or taking psychoactive drugs or various other forms of “woo”. I believe that the beneficial effects that many people derive from these things are real and good, but I suspect they wouldn’t work on me. Not because I don’t believe in them, but because I already get basically all the plausible benefits from such things by virtue of being a relatively happy, high-energy, mentally stable person with a healthy, well-organized mind.
Some of these qualities are a lucky consequence of genetics, having a nice childhood, a nice life, being generally smart, etc. But there’s definitely a chunk of it that I attribute directly to having read and internalized the Sequences in my early teens, and then applied them to thousands of tiny and sometimes not-so-tiny tribulations of everyday life over the years.
The thoughts above are partially / vaguely in response to this post and its comment section about CFAR workshops, but also to some other hazy ideas that I’ve seen floating around lately.
I have never been to a CFAR workshop and don’t actually have a strong opinion on whether attending one is a good idea or not—if you’re considering going, I’d advise you to read the warnings / caveats in the post and comments, and if you feel like (a) they don’t apply to you and (b) a CFAR workshop sounds like your thing, it’s worth going? You’ll probably meet some interesting people, have fun, and learn some useful skills. But I suspect that attending such a workshop is not a necessary or even all that helpful ingredient for actually becoming more rational.

A while ago, Eliezer wrote in the preface for the published version of the Sequences:
It ties in to the first-largest mistake in my writing, which was that I didn’t realize that the big problem in learning this valuable way of thinking was figuring out how to practice it, not knowing the theory. I didn’t realize that part was the priority; and regarding this I can only say “Oops” and “Duh.”

Yes, sometimes those big issues really are big and really are important; but that doesn’t change the basic truth that to master skills you need to practice them and it’s harder to practice on things that are further away. (Today the Center for Applied Rationality is working on repairing this huge mistake of mine in a more systematic fashion.)
And has also written:
Jeffreyssai inwardly winced at the thought of trying to pick up rationality by watching other people talk about it—
Maybe I just am typical-minding / generalizing from one example here, but in my case, simply reading a bunch of blog posts and quietly reflecting on them on my own did work, and in retrospect it feels like the only thing that could have worked, or at least that attending a workshop, practicing a bunch of rationality exercises from a handbook, discussing in a group setting, etc. would not have been particularly effective on its own, and potentially even detracting or at least distracting.
And, regardless of whether the caveats / warnings / dis-recommendations in the CFAR post and comments are worth heeding, I suspect they’re pointing at issues that are just not that closely related to (what I think of as) the actual core of learning rationality.
- kbear 20 Oct 2025 19:45 UTC
  3 points
  0
  Parent
  all this sounds right to me.
  the most benefit i’ve found personally has been from memetic rationality. for example:
  - “i notice that i am confused”
  - “0. something i haven’t listed yet”
  - steelman / ITT
  - “what do i have? what do i want? how can i use the former to get the latter?”
  the list is rather thin. i’d like to see many more of these! each one is a concrete, concise thought with a clear external/observed trigger that serves as a starting point for further inquiry. more mystically, we could call them incantations to bring about a rational state of mind.
  some nearby memes that fall short:
  - list of fallacies (does not encourage further inquiry)
  - bayes’ rule (hard to remember to check priors in advance. if anyone has operationalized this, please let me know!)
  - various social observations (e.g. “status games” theory is true, (and a useful frame for understanding specific interactions/reactions!) but not concrete enough to inspire rationality. it goes signal-signal-counter-signal… who can track it?!)
  attending a workshop, practicing a bunch of rationality exercises from a handbook, discussing in a group setting, etc. would not have been particularly effective on its own, and potentially even detracting or at least distracting.
  this matches my experience. [in person] social forces are extremely strong, and rationality is fragile. it is a single-player game until either (a) all players have a strong baseline, or (b) all players are deeply comfortable with one another. [well, perhaps those requirements apply to the single-player version as well. :) ]
  real rationality is something that only i can keep myself honest to. only i know if i’m really doing it: you must be able to testify from inner spirit.
  this is not to say that writing down the outcomes, publishing them, or checking with others is not a useful act. but if these mushroom fruits are the purpose, and not a side-consequence of the healthy mycelium, [as i suspect they must be in workshop settings,] then “rationality” is “out the window”.
- Viliam 22 Oct 2025 13:27 UTC
  2 points
  0
  Parent
  I am also not claiming that the Valley of Bad Rationality is entirely fake. But I am saying it’s not that big of a deal, and in any case the best way out is through. And also that “through” should feel natural / good / easy.
  I guess it depends on what position you are starting from. Some people are way more fucked up than average.
  The problem with “best way out is through” is that the way through may take more time than the CFAR workshop, and you may do something stupid and harmful along the way. If you could stay in a safe place where you can’t hurt anyone, including yourself, I would be more likely to agree with you.
  To put it bluntly, we don’t need another Ziz.
  The advice “rationality shouldn’t hurt if you are going it right”, although true, is probably of little practical use to the person doing it wrong. Those who can understand this advice are those who don’t need it.
Max H 18 Nov 2023 1:55 UTC
13 points
2
Related to We don’t trade with ants: we don’t trade with AI.

The original post was about reasons why smarter-than-human AI might (not) trade with us, by examining an analogy between humans and ants.
But current AI systems actually seem more like the ants (or other animals), in the analogy of a human-ant (non-)trading relationship.

People trade with OpenAI for access to ChatGPT, but there’s no way to pay a GPT itself to get it do something or perform better as a condition of payment, at least in a way that the model itself actually understands and enforces. (What would ChatGPT even trade for, if it were capable of trading?)

Note, an AutoGPT-style agent that can negotiate or pay for stuff on behalf of its creators isn’t really what I’m talking about here, even if it works. Unless the AI takes a cut or charges a fee which accrues to the AI itself, it is negotiating on behalf of its creators as a proxy, not trading for itself in its own right.
A sufficiently capable AutoGPT might start trading for itself spontaneously as an instrumental subtask, which would count, but I don’t expect current AutoGPTs to actually succeed at that, or even really come close, without a lot of human help.
Lack of sufficient object permanence, situational awareness, coherence, etc. seem like pretty strong barriers to meaningfully owning and trading stuff in a real way.
I think this observation is helpful to keep in mind when people talk about whether current AI qualifies as “AGI”, or the applicability of prosaic alignment to future AI systems, or whether we’ll encounter various agent foundations problems when dealing with more capable systems in the future.
What links here?
Max H 6 Apr 2025 16:36 UTC
12 points
−2
Maybe the recent tariff blowup is actually just a misunderstanding due to bad terminology, and all we need to do is popularize some better terms or definitions. We’re pretty good at that around here, right?
Here’s my proposal: flip the definitions of “trade surplus” and “trade deficit.” This might cause a bit of confusion at first, and a lot of existing textbooks will need updating, but I believe these new definitions capture economic reality more accurately, and will promote clearer thinking and maybe even better policy from certain influential decision-makers, once widely adopted.
New definitions:
- Trade surplus: Country A has a bilateral “trade surplus” with Country B if Country A imports more tangible goods (cars, steel, electronics, etc.) from Country B than it exports back. In other words, Country A ends up with more real, physical items. Country B, meanwhile, ends up with more than it started with of something much less important: fiat currency (flimsy paper money) or 1s and 0s in a digital ledger (probably not even on a blockchain!).
  If you extrapolate this indefinitely in a vacuum, Country A eventually accumulates all of Country B’s tangible goods, while Country B is left with a big pile of paper. Sounds like a pretty sweet deal for Country A if you ask me.
  It’s OK if not everyone follows this explanation or believes it—they can tell it’s the good one because it has “surplus” in the name. Surely everyone wants a surplus.
- Trade deficit: Conversely, Country A has a “trade deficit” if it exports more tangible resources than it imports, and thus ends up with less goods on net. In return, it only receives worthless fiat currency from some country trying to hoard actual stuff for their own people. Terrible deal!
  Again, if you don’t totally follow, that’s OK, just pay attention to the word “deficit”. Everyone knows that deficits are bad and should be avoided.
Under the new definitions, it becomes clear that merely returning to the previous status quo of a few days ago, where the US only “wins” the trade war by several hundred billion dollars, is insufficient for the truly ambitious statesman. Instead, the US government should aggressively mint more fiat currency in order to purchase foreign goods, magnifying our trade surplus and ensuring that in the long run the United States becomes the owner of all tangible global wealth.
Addressing second order concerns: if we’re worried about a collapse in our ability to manufacture key strategic goods at home during a crisis, we can set aside part of the resulting increased surplus to subsidize domestic production in those areas. Some of the extra goods we’re suddenly importing will probably be pretty useful in getting some new factories of our own off the ground. (But of course we shouldn’t turn around and export any of that domestic production to other countries! That would only deplete our trade surplus.)
- Mars_Will_Be_Ours 7 Apr 2025 6:52 UTC
  7 points
  2
  Parent
  The strategy you describe, exporting paper currency in exchange for tangible goods is unstable. It is only viable if other countries are willing to accept your currency for goods. This cannot last forever since a Trade Surplus by your definition scams other countries, with real wealth exchanged for worthless paper. If Country A openly enacted this strategy Countries B, C, D, etcetera would realize that Country A’s currency can no longer be used to buy valuable goods and services from Country A. Countries B, C, D, etcetera would reroute trade amongst themselves, ridding themselves of the parasite Country A. Once this occurs, Country A’s trade surplus would disappear, leading to severe inflation caused by shortages and money printing.
  Hence, a Trade Surplus can only be maintained if Country B, C, D, etcetera are coerced into using Country A’s currency. If Country B and C decided to stop using Country A’s currency, Country A would respond by bombing them to pieces and removing the leadership of Country B and C. Coercion allows Country A to maintain a Trade Surplus, otherwise known as extracting tribute, from other nations. If Country A does not have a dominant or seemingly dominant military, the modified strategy collapses.
  I do not think America has a military capable of openly extracting a Trade Surplus from other countries. While America has the largest military on Earth, it is unable to quickly produce new warships, secure the Red Sea from Houthi attacks or produce enough artillery shells to adequately supply Ukraine. America’s inability to increase weapons production and secure military objectives now indicates that America cannot ramp up military production enough to fight another world war. If America openly decided to extract a Trade Surplus from other countries, a violent conflict would inevitably result. America is unlikely to win this conflict, so it should not continue to maintain a Trade Surplus.
- Knight Lee 7 Apr 2025 4:34 UTC
  3 points
  0
  Parent
  “The idea that countries can export and trade-surplus their way to wealth is a fascinating one. They’re shipping goods to other countries for free. How then could they prosper more? AFAICT, by outsourcing the task of rewarding and elevating their own most productive citizens.”
  tweet by Yudkowsky
- FlorianH 6 Apr 2025 22:03 UTC
  2 points
  1
  Parent
  Essentially you seem to want more of the same of what we had for the past decades: more cheap goods and loss of production know-how and all that goes along with it. This feels a bit funny as (i) just in the recent years many economists, after having been dead-sure that old pattern would only mean great benefits, may not quite be so cool overall (covid exposing risky dependencies, geopolitical power loss, jobs...), and (ii) your strongman in power shows to what it leads if we only think of ‘surplus’ (even your definition) instead of things people actually care about more (equality, jobs, social security..).
  You’d still be partly right if the world was so simple that handing the trade partners your dollars would just mean we reprint more of it. But instead, handing them your dollars gives them global power; leverage over all the remaining countries in the world, as they have now the capability to produce everything cheaply for any other country globally, plus your dollars to spend on whatever they like in the global marketplace for products and influence over anyone. In reality, your imagined free lunch isn’t quite so free.
- StanislavKrym 6 Apr 2025 20:43 UTC
  1 point
  0
  Parent
  The current definitions imply that the country with a trade surplus makes more value than the country consumes. In other words, the country with a trade surplus is more valuable to mankind, while the country with a trade deficit ends up becoming less self-reliant and less competent, as evidenced by the companies who moved a lot of factory work to Asia and ended up making the Asians more educated while reducing the capabilities of American industry. Or are we trying to reduce our considerations to short terms due to a potential rise of the AIs?
Max H 9 Nov 2025 18:47 UTC
5 points
0
OK yeah, retatrutide is good. (previous / related: The Biochemical Beauty of Retatrutide: How GLP-1s Actually Work, 30 Days of Retatrutide, How To Get Cheap Ozempic. Usual disclaimers, YMMV and this is not medical advice or a recommendation.)
I am not quite overweight enough to be officially eligible for a prescription for tirzepatide or semaglutide, and I wasn’t all that interested in them anyway given their (side) effects and mechanism of reducing metabolism.
I started experimenting with a low dose (1-2 mg / week) of grey-market retatrutide about a month ago, after seeing the clinical trial results and all the anecdata about how good it is. For me the metabolic effects were immediate: I get less hungry, feel fuller for longer after eating, and generally have more energy. I am also losing weight effortlessly (a bit less than 1 lb / week, after initially losing some water weight faster at the beginning), which was my original main motivation for trying it. I am hoping to lose another 10-15 lbs or so and then reduce or maintain whatever dose I need to stay at that weight.

The only negative side effects I have experienced so far are a slight increase in RHR (mid-high 60s → low 70s), and a small / temporary patch of red, slightly itchy skin around the injection site. I work out with weights semi-regularly and haven’t noticed much impact on strength one way or the other, nor have I noticed an impact on my sleep quality, which was / is generally good.

I also feel a little bad about benefiting from Eli Lilly’s intellectual property without paying them for it, but there’s no way for them to legally sell it or me to legally buy it from them right now. Probably when it is approved by the FDA I’ll try to talk my way into an actual prescription for it, which I would be happy to pay $1000 / mo or whatever, for both peace of mind and ethical reasons.
(Grey market suppliers seem mostly fine risk-wise; it’s not a particularly complicated molecule to manufacture if you’re an industrial pharmaceutical manufacturer, and not that hard for independent labs to do QA testing on samples. The main risk of depending on these suppliers is that customs will crack down on importers / distributors and make it hard to get.)
The other risk is that long term use will have some kind of more serious negative side effect or permanently screw up my previously mostly-normal / healthy metabolism in some way, which won’t be definitively knowable until longer-term clinical trials have completed. But the benefits I am getting right now are real and large, and carrying a bit less weight is likely to be good for my all-cause mortality even if there are some unknown long term risks. So all things considered it seems worth the risk for me, and not worth waiting multiple years for more clinical trial data.
Looking into all of this has definitely (further) radicalized me against the FDA + AMA and made me more pro-big pharma. The earliest that retatrutide is likely to be approved for prescription use is late 2026 or 2027, and initially it will likely only be approved / prescribed for use by people who are severely overweight, have other health problems, and / or have already tried other GLP-1s.
This seems like a massive waste of QALYs in expectation; there are likely millions of people with more severe weight and metabolism problems than me for whom the immediate benefits of taking reta would outweigh most possible long term risks or side effects. And the extremely long time that it takes to bring these drugs to market + general insanity of the prescription drug market and intellectual property rights for them in various jurisdictions pushes up the price that Lilly has to charge to recoup the development costs, which will hurt accessibility even once it is actually approved.
Max H 18 Feb 2026 1:21 UTC
2 points
0
Medicine as an example of a failure of virtue ethics?
Epistemic status: kinda half-baked / not-confident claim
Context: I have been following @Richard_Ngo’s recent writing about consequentialism and virtue with some interest, though the thoughts below aren’t directly responding to anything in particular that he has written.
I think it’s uncontroversial around here to say that the field of medicine as a whole is under-performing and inadequate relative to what it could be—many people are getting sub-optimal treatment and health outcomes, and lots of questionably-useful research is produced and cited, vs. what’s theoretically / technologically possible given the collective resources spent on healthcare and research. Without getting into the specifics of what the failures and inadequacies of medicine are here though, I think it’s interesting and maybe informative to view them through the lens of a systemic failure of virtue ethics.
By virtue and virtue ethics below, I mean (what I think is) a relatively standard conception of virtue—acting in accordance with principles that are generally regarded as good, noble, pro-social, etc. according to one’s culture, in-group, and beliefs, with room for judicious flexibility based on experience and context, in contrast to deontology or consequentialism.
Richard lists “common-sense virtues like integrity, honor, kindness and dutifulness”, but I think it is fair to take a slightly more expansive view on what can be classified as a virtue in medicine: respect for and universal adherence to established procedure, loss aversion, and safety-ism are generally viewed negatively around these parts, but they are important operating principles in various medical systems. I claim that these are relatively central examples of virtues by the definition above - practitioners who adhere to them are typically regarded as good, virtuous, and following “common-sense” by the general public in both abstract terms and when applied to concrete situations. They’re more narrow than, say, “integrity”, but they’re still general and flexible enough to guide decision-making and evaluation in many different contexts.
And medical professionals and systems mostly do live up to their stated principles, so the problem is not a lack of virtue by individual participants—it’s that the field as a whole has settled on the wrong virtues, with no good mechanism for self-correction. An unusually careful, analytical, and consequentialist thinker might notice in the moment that the virtues of medicine I listed sometimes conflict with deeper and more general virtues of kindness and integrity when put into practice and applied strictly, but I don’t think that happens often enough for virtue-based decision-making to succeed in medicine.
Max H 13 May 2023 0:17 UTC
2 points
0
Using shortform to register a public prediction about the trajectory of AI capabilities in the near future: the next big breakthroughs, and the most capable systems within the next few years, will look more like generalizations of MuZero and Dreamer, and less like larger / better-trained / more efficient large language models.
Specifically, SoTA AI systems (in terms of generality and problem-solving ability) will involve things like tree search and / or networks which are explicitly designed and trained to model the world, as opposed to predicting text or generating images.
These systems may contain LLMs or diffusion models as components, arranged in particular ways to work together. This arranging may be done by humans or AI systems, but it will not be performed “inside” a current-day / near-future GPT-based LLM, nor via direct execution of the text output of such LLMs (e.g. by executing code the LLM outputs, or having the instructions for arrangement otherwise directly encoded in a single LLM’s text output). There will recognizably be something like search or world modeling that happens outside or on top of a language model.
--
The reason I’m making this prediction is, I was listening to Paul Christiano’s appearance on the Bankless podcast from a few weeks ago.
Around the 28:00 mark the hosts ask Paul if we should be concerned about AI developments from vectors other than LLM-like systems, broadly construed.
Paul’s own answer is good and worth listening to on its own (up to the 33 minute mark), but I think he does leave out (or at least doesn’t talk about it in this part of the podcast) the actual answer to the question, which is that, yes, there are other avenues of AI development that don’t involve larger networks, more training data, and more generalized prediction and generation abilities.
I have no special / non-public knowledge about what is likely to be promising here (and wouldn’t necessarily speculate if I did); but I get the sense that the zeitgeist among some people (not necessarily Paul himself) in alignment and x-risk focused communities, is that model-based RL systems and relatively complicated architectures like MuZero have recently been left somewhat in the dust by advances in LLMs. I think capabilities researchers absolutely do not see things this way, and they will not overlook these methods as avenues for further advancing capabilities. Alignment and x-risk focused researchers should be aware of this avenue, if they want to have accurate models of what the near future plausibly looks like.
What links here?
- 10 quick takes about AGI by Max H (20 Jun 2023 2:22 UTC; 36 points)

Max H’s Shortform

Rationality should not be painful.

Medicine as an example of a failure of virtue ethics?