Max H

Karma: 2,657

Most of my posts and comments are about AI and alignment. Posts I’m most proud of, which also provide a good introduction to my worldview:

I also created Forum Karma, and wrote a longer self-introduction here.

PMs and private feedback are always welcome.

NOTE: I am not Max Harms, author of Crystal Society. I’d prefer for now that my LW postings not be attached to my full name when people Google me for other reasons, but you can PM me here or on Discord (m4xed) if you want to know who I am.

Max H 7 Oct 2025 19:23 UTC
12 points
0
in reply to: 1a3orn’s comment on: 1a3orn’s Shortform
(By “coherent” I (vaguely) understand an entity (AI, human, etc) that does not have ‘conflicting drives’ within themself, that does not want ‘many’ things with unclear connections between those things, one that always acts for the same purposes across all time-slices, one that has rationalized their drives and made them legible like a state makes economic transactions legible.)
Coherence is mostly about not stepping on your own toes; i.e. not taking actions that get you strictly less of all the different things that you want, vs. some other available action. “What you want” is allowed to be complicated and diverse and include fuzzy time-dependent things like “enough leisure time along the way that I don’t burn out”.

This is kind of fuzzy / qualitative, but on my view, most high-agency humans act mostly coherently most of the time, especially but not only when they’re pursuing normal / well-defined goals like “make money”. Of course they make mistakes, including meta ones (e.g. misjudging how much time they should spend thinking / evaluating potential options vs. executing a chosen one), but not usually in ways that someone else in their shoes (with similar experience and g) could have easily / predictably done better without the benefit of hindsight.
Here are some things a human might stereotypically do in the pursuit of high ability-to-act in the world, as it happens in humans:
- Try to get money through some means
- Try to become close friends with powerful people
- Take courses or read books about subject-matters relevant to their actions
- Etc
Lots of people try to make money, befriend powerful / high-status people around them, upskill, etc. I would only categorize these actions as pursuing “high ability-to-act” if they actually work, on a time scale and to a degree that they actually result in the doer ending up with the result they wanted or the leverage to make it happen. And then the actual high ability-to-act actions are the more specific underlying actions and mental motions that actually worked. e.g. a lot of people try starting AGI research labs or seek venture capital funding for their startup or whatever, few of them actually succeed in creating multi-billion dollar enterprises (real or not). The top-level actions might look sort of similar, but the underlying mental motions and actions will look very different whether the company is (successful and real), (successful and fraud), or a failure. The actual pursuing-high-ability-to-act actions are mostly found in the (successful and real, successful and fraud) buckets.
And here are some things a human might stereotypically do while pursuing coherence.
- Go on a long walk or vacation reflecting on what they’ve really wanted over time
- Do a bucketload of shrooms
- Try just some very different things to see if they like them
- Etc
Taking shrooms in particular seems like a pretty good example of an action that is almost certainly not coherent, unless there is some insight that you can only have (or reach the most quickly) by taking hallucinogenic drugs. Maybe there are some insights like that but I kind of doubt it, and trying shrooms first before you’ve exhausted other ideas, in some vague pursuit of some misunderstood concept of coherence, is not the kind of thing i would expect to be common in the most successful humans or AIs. There are of course exceptions (very successful humans who have taken drugs and attribute some of their success to it), but my guess is that success is mostly in spite of the drug use, or at least that the drug use was not actually critical.
The other examples are maybe stereotypes of what some people think of as pursuing coherent behavior, but I would guess they’re also not particularly strongly correlated with actual coherence.

Max H 23 Sep 2025 23:29 UTC
9 points
2
on: Notes on fatalities from AI takeover
It’s unclear what fraction of people die due to takeover because this is expedient for the AI, but it seems like it could be the majority of people and could also be almost no one. If AIs are less powerful, this is more likely (because AIs would have a harder time securing a very high chance of takeover without killing more humans).
Yeah, this (extinction to facilitate takeover) seems like the most plausible pathway to total or near-total extinction by far. An AI that is only a little bit smarter than humanity collectively has to worry about humans making a counter-move—launching missiles or building a competing AI or various kinds of sabotage. If you’re a rogue AI, engineering a killer virus (something that smart humans can already do or almost do, if they wanted to) as soon as you or humanity has built out sufficient robotics infrastructure, makes all the subsequent parts of your takeover / expansion plan much less contingent and more straightforward to reason about. (And I think the analogy to historical relatively-bloodless coups here is a pretty weak counter / faint hope—for one, because human coup instigators generally still need humans to rule over, whereas AIs wouldn’t.)

If there are a large number of different rogue AIs, it becomes more likely that one of them would benefit from massive fatalities (e.g. due to a pandemic) making this substantially more likely.
I don’t see how the number of AIs makes a big difference here, rather than the absolute power level of the leading AI? An extinction or near-extinction event seems beneficial to just about any unaligned AI that is not all-powerful enough to not have to worry about humanity at all.
Put another way, the scenarios where extinction doesn’t happen due to takeover only feel plausible in scenarios where a single AI fooms so fast and so hard that it can leave humanity alive without really sweating it. But if I understand the landscape of the discourse / disagreement here, these fast and discontinuous takeoff scenarios are exactly the ones that you and some others find the least plausible.

Max H 20 Sep 2025 18:30 UTC
6 points
2
in reply to: Max Harms’s comment on: Max Harms’s Shortform
I think the linked tweet is possibly just misinterpreting what the authors meant by “transistor operations”? My reading is that “1000″ binds to “operations”; the actual number of transistors in each operation is unspecified. That’s how they get the 10,000x number—if a CPU runs at 1 GHz, neurons run at 100 Hz, then even if it takes 1000 clock cycles to do the work of neuron, the CPU can still do it 10,000x faster.
(IDK what the rationale was in the editorial process for using “transistor operations” instead of a more standard term like “clock cycles”, but a priori it seems defensible. Speculating, “transistors” was already introduced in the sentence immediately prior, so maybe the thinking was that the meaning and binding of “transistor operations” would be self-evident in context. Whereas if you use “clock cycles” you have to spend a sentence explaining what that means. So using “transistor operations” reduces the total number of new jargon-y / technical terms in the paragraph by one, and also saves a sentence of explanation.)
Anyway, depending on the architecture, precision, etc. a single floating point multiplication can take around 8 clock cycles. So even if a single neuron spike is doing something complicated that requires several high-precision multiply + accumulate operations in serial to replicate, that can easily fit into 1000 clock cycles on a normal CPU, and much fewer if you use specialized hardware.

As for the actual number of transistors themselves needed to do the work of a neuron spike, it again depends on exactly what the neuron spike is doing and how much precision etc. you need to capture the actual work, but “billions” seems too high by a few OOM at least. Some reference points: a single NAND gate is 4 transistors, and a general-purpose 16-bit floating point multiplier unit is ~5k NAND gates.

Max H 19 Sep 2025 17:46 UTC
50 points
19
in reply to: Max Harms’s comment on: Max Harms’s Shortform
The passage seems fine to me; I commented on Erdil’s post and other brain efficiency discussions at the time, and I still think that power consumption is a more objective way of comparing performance characteristics of the brain vs. silicon, and that various kinds of FLOP/s comparisons favored by critics of the clock speed argument in the IAB passage are much more fraught ([1], [2]).
It’s true that clock speed (and neuron firing speed) aren’t straightforwardly / directly translatable to “speed of thought”, but both of them are direct proxies for energy consumption and power density. And a very rough BOTEC shows that ~10,000x is a reasonable estimate for the difference in power density between the brain and silicon.
Essentially, the brain is massively underclocked because of design-space restrictions imposed by biology and evolution, whereas silicon-based processing has been running up against fundamental physical limits on component size, clock speed, and power density for a while now. So once AIs can run whatever cognitive algorithms that the brain implements (or algorithms that match the brain in terms of high-level quality of the actual thoughts) at any speed, the already-existing power density difference implies they’ll immediately have a much higher performance ceiling in terms of the throughput and latency that they can run those algorithms at. It’s not a coincidence that making this argument via clock speeds leads to basically the same conclusion as making the same argument via power density.

Max H 18 Sep 2025 23:31 UTC
7 points
0
on: I enjoyed most of IABIED
- Tricky hypothesis 1: ASI will in fact be developed in a world that looks very similar to today’s (e.g. because sub-ASI AIs will have negligible effect on the world; this could also be because ASI will be developed very soon).
- Tricky hypothesis 2: But the differences between the world of today and the world where ASI will be developed don’t matter for the prognosis.
Both of these hypotheses look relatively more plausible than they did 4y ago, don’t they? Looking back at this section from the 2021 takeoff speed conversation gives a sense of how people were thinking about this kind of thing at the time.

AI-related investment and market caps are exploding, but not really due to actual revenue being “in the trillions”—it’s mostly speculation and investment in compute and research.

Deployed AI systems can already provide a noticeable speed-up to software engineering and other white-collar work broadly, but it’s not clear that this is having much of an impact on AI research (and especially a differential impact on alignment research) specifically.

Maybe we will still get widely deployed / transformative robotics, biotech, research tools etc. due to AI that could make a difference in some way prior to ASI, but SoTA AIs of today are routinely blowing through tougher and tougher benchmarks before they have widespread economic effects due to actual deployment.
I think most people in 2021 would have been pretty surprised by the fact we have widely available LLMs in 2025 with gold medal-level performance on the IMO, but which aren’t yet having much larger economic effects. But in relative terms it seems like you and Christiano should be more surprised than Yudkowsky and Soares.

Max H 15 Sep 2025 15:51 UTC
2 points
0
in reply to: lc’s comment on: lc’s Shortform
The “you’re sued” part is part of what ensures that the forms get filled out honestly and comprehensively.
Depending on the kind of audit you do, the actual deliverable you give your auditor may just be a spreadsheet with a bunch of Y/N answers to hundreds of questions like “Do all workstations have endpoint protection software”, “Do all servers have intrusion detection software”, etc. with screenshots of dashboards as supporting evidence for some of them.
But regardless of how much evidence an external auditor asks for, at large companies doing important audits, every single thing you say to the auditor will be backed internally with supporting evidence and justification for each answer you give.
At a bank you might have an “internal audit” department that has lots of meetings and back-and-forth with your IT department; at an airline it might be a consulting firm that you bring in to modernize your IT and help you handle the audit, or, depending on your relationship with your auditor and the nature of the audit, it might be someone from the audit firm itself that is advising you. In each case, their purpose is to make sure that every machine across your firm really does have correctly configured EDR, fully up to date security patches, firewalled, etc. before you claim that officially to an auditor.

Maybe you have some random box used to show news headlines on TVs in the hallways—turns out these are technically in-scope for having EDR and all sorts of other endpoint controls, but they’re not compatible with or not correctly configured to run Microsoft Defender, or something. Your IT department will say that there are various compensating / mitigating controls or justifications for why they’re out of scope, e.g. the firewall blocks all network access except the one website they need to show the news, the hardware itself is in a locked IT closet, they don’t even have a mouse / keyboard plugged in, etc. These justifications will usually be accepted unless you get a real stickler (or have an obstinate “internal auditor”). But it’s a lot easier to just say “they all run CrowdStrike” than it is to keep track of all these rationales and compensating controls, and indeed ease-of-deployment is literally the first bullet in CrowdStrike’s marketing vs. Microsoft Defender:

CrowdStrike: Deploy instantly with a single, lightweight agent — no OS prerequisites, complex configuration, or fine tuning required.
Microsoft: Complicated deployment hinders security. All endpoints require the premium edition of the latest version of Windows, requiring upfront OS and hardware upgrades for full security functionality.
You wrote in a sibling reply:
Further, the larger implication of the above tweet is that companies use Crowdstrike because of regulatory failure, and this is also simply untrue. There are lots of reasons people sort of unthinkingly go with the name brand option in security, but that’s a normal enterprise software thing and not anything specific to compliance.

I agree that this has little to do with “regulatory failure” and don’t know / don’t have an opinion on whether that’s what the original tweet author was actually trying to communicate. But my point is that firms absolutely do make purchasing decisions about security software for compliance reasons, and a selling point of CrowdStrike (and Carbon Black, and SentinelOne) is that they make 100% compliance easier to achieve and demonstrate vs. alternative solutions. That’s not a regulatory failure or even necessarily problematic, but it does result in somewhat different outcomes compared to a decision process of “unthinkingly going with the name brand option” or “carefully evaluate and consider only which solutions provide the best actual security vs. which are theater”.

Max H 15 Sep 2025 3:36 UTC
2 points
0
in reply to: lc’s comment on: lc’s Shortform
The audit requirements Mark is talking about don’t exist. He just completely made them up.
The screenshotted tweet says that you’re required to install something like Crowdstrike, which is correct and also seems consistent with the ChatGPT dialogue you linked?
There are long lists of computer security practices and procedures needed to pass an audit for compliance with a standard like ISO27001, PCI DSS, SOC 2, etc. that many firms large and small are subject to (sometimes but not necessarily by law—e.g. companies often need to pass an SOC 2 audit because their customers ask for it).

As you say, none of these standards name specific software or vendors that you have to use in order to satisfy an auditor, but it’s often much less of a headache to use a “best in class” off-the-shelf product (like CrowdStrike) that is marketed specifically as satisfying specific requirements in these standards, vs. trying to cobble together a complete compliance posture using tools or products that were not designed specifically to satisfy those requirements.
A big part of the marketing for a product like CrowdStrike is that it has specific features which precisely and unambiguously satisfy more items in various auditor checklists than competitors.
So “opens up an expensive new chapter of his book” is colorful and somewhat exaggerated, but I wouldn’t describe it as “misinformation”—it’s definitely pointing at something real, which is that a lot of enterprise security software is sold and bought as an exercise in checking off specific checklist items in various kinds of audits, and how easy / convenient / comprehensive a solution makes box-checking is often a bigger selling point than how much actual security it provides, or what the end user experience is actually like.

Max H 9 Sep 2025 21:03 UTC
49 points
38
in reply to: philip_b’s comment on: Obligated to Respond
I would just evaluate your argument on my own and I would evaluate the counterargument in the comment on my own.
The precise issue is that a sizable fraction of the audience will predictably not do this, or will do it lazily or incorrectly.
On LessWrong, this shows up in voting patterns, for example, a controversial post will sometimes get some initial upvotes and then the karma / trend will swing around based on the comments and who had the last word. Or, a long back-and-forth ends up getting far fewer votes (and presumably, eyeballs) than the top-level post / comment.
My impression is that most authors aren’t that sensitive to karma per se but they are sensitive to a mental model of the audience that this swinging implies, namely that many onlookers are letting the author and their interlocutor(s) do their thinking for them, with varying levels of attention span, and where “highly upvoted” is often a proxy for “onlookers believe this is worth responding to (but won’t necessarily read the response)”. So responding often feels both high stakes and unrewarding for someone who cares about communicating something to their audience as a whole.
Anyway, I like Duncan’s post as a way of making the point about effort / implied obligation to both onlookers and interlocutors, but something else that might help is some kind of guide / reminder / explanation about principles of being a good / high-effort onlooker.

Max H 24 Aug 2025 18:42 UTC
3 points
0
in reply to: Buck’s comment on: Buck’s Shortform
What specifically do you think is obviously wrong about the village idiot <-> Einstein gap? This post from 2008 which uses the original chart makes some valid points that hold up well today, and rebuts some real misconceptions that were common at the time.
The original chart doesn’t have any kind of labels or axes, but here are two ways you could plausibly view it as “wrong” in light of recent developments with LLMs:
- Duration: the chart could be read as a claim that the gap between the development of village idiot and Einstein-level AI in wall-clock time would be more like hours or days rather than months or years.
- Size and dimensionality of mind-space below the superintelligence level. The chart could be read as a claim that the size of mindspace between village idiot and Einstein is relatively small, so it’s surprising to Eliezer-200x that there are lots of current AIs landing in between them, and staying there for a while.
I think it’s debatable how much Eliezer was actually making the stronger versions of the claims above circa 2008, and also remains to be seen how wrong they actually are, when applied to actual superintelligence instead of whatever you want to call the AI models of today.
OTOH, here are a couple of ways that the village idiot <-> Einstein post looks prescient:
- Qualitative differences between the current best AI models and second-to-third tier models are small. Most AI models today are all roughly similar to each other in terms of overall architecture and training regime, but there are various tweaks and special sauce that e.g. Opus and GPT-5 have that Llama 4 doesn’t. So you have something like: Llama 4: GPT-5 :: Village idiot : Einstein, which is predicted by:
Maybe Einstein has some minor genetic differences from the village idiot, engine tweaks. But the brain-design-distance between Einstein and the village idiot is nothing remotely like the brain-design-distance between the village idiot and a chimpanzee. A chimp couldn’t tell the difference between Einstein and the village idiot, and our descendants may not see much of a difference either.
(and something like a 4B parameter open-weights model is analogous to the chimpanzee)
Whereas I expect that e.g. Robin Hanson in 2008 would have been quite surprised by the similarity and non-specialization among different models of today.
- Implications for scaling. Here’s a claim on which I think the Eliezer-200x Einstein chart makes a prediction that is likely to outperform other mental models of 2008, as well as various contemporary predictions based on scaling “laws” or things like METR task time horizon graphs:
  
  ”The rough number of resources, in terms of GPUs, energy, wall clock time, lines of Python code, etc. needed to train and run best models today (e.g. o4, GPT-5), are sufficient (or more than sufficient) to train and run a superintelligence (without superhuman / AI-driven levels of optimization / engineering / insight).”
  
  My read of task-time-horizon and scaling law-based models of AI progress is that they more strongly predict that further AI progress will basically require more GPUs. It might be that the first Einstein+ level AGI is in fact developed mostly through scaling, but these models of progress are also more surprised than Eliezer-2008 when it turns out that (ordinary, human-developed) algorithmic improvements and optimizations allow for the training of e.g. a GPT-4-level model with many fewer resources than it took to train the original GPT-4 just a few years ago.

Max H 13 Aug 2025 13:57 UTC
2 points
1
in reply to: the gears to ascension’s comment on: Forum Karma: view stats and find highly-rated comments for any LW user
Thanks for the report, should be fixed now.

The issue was that the LW GraphQL API has changed slightly, apparently. The user query suggested here no longer works, but something like:
```
{
      GetUserBySlug(slug: "max-h") {
          _id
          slug
          displayName
          pageUrl
          postCount
          commentCount
          createdAt
      }
}
```
works fine.

Max H 9 Jul 2025 15:56 UTC
22 points
8
on: Applying right-wing frames to AGI (geo)politics
I prefer (classical / bedrock) liberalism as a frame for confronting societal issues with AGI, and am concerned by the degree to which recent right-wing populism has moved away from those tenets.
Liberalism isn’t perfect, but it’s the only framework I know of that even has a chance of resulting in a stable consensus. Other frames, left or right, have elements of coercion and / or majoritarianism that inevitably lead to legitimacy crises and instability as stakes get higher and disagreements wider.
My understanding is that a common take on both the left and right these days is that, well, liberalism actually hasn’t worked out so great for the masses recently, so everyone is looking for something else. But to me every “something else” on both the left and right just seems worse—Scott Alexander wrote a bunch of essays like 10y ago on various aspects of liberalism and why they’re good, and I’m not aware of any comprehensive rebuttal that includes an actually workable alternative.
Liberalism doesn’t imply that everyone needs to live under liberalism (especially my own preferred version / implementation of it), but it does provide a kind of framework for disagreement and settling differences in a way that is more peaceful and stable than any other proposal I’ve seen.
So for example on protectionism, I think most forms of protectionism (especially economic protectionism) are bad and counterproductive economic policy. But even well-implemented protectionism requires a justification beyond just “it actually is in the national interest to do this”, because it infringes on standard individual rights and freedoms. These freedoms aren’t necessarily absolute, but they’re important enough that it requires strong and ongoing justification for why a government is even allowed to do that kind of thing. AGI might be a pretty strong justification!
But at the least, I think anyone proposing a framework or policy position which deviates from a standard liberal position should acknowledge liberalism as a kind of starting point / default, and be able to say why the tradeoff of any individual freedom or right is worth making, each and every time it is made. (And I do not think right-wing frameworks and their standard bearers are even trying to do this, and that is very bad.)

Max H 30 Jun 2025 4:58 UTC
19 points
19
in reply to: TurnTrout’s comment on: TurnTrout’s shortform feed
I think it was fine for Nate to delete your comment and block you, and fine for you to repost it as a short form.
But my anecdote is a valid report of the historical consequences of talking with Nate – just as valid as the e/acc co-founder’s tweet.
“just as valid” [where validity here = topical] seems like an overclaim here. And at the time of your comment, Nate had already commented in other threads, which are now linked in a footnote in the OP:
By “cowardice” here I mean the content, not the tone or demeanor. I acknowledge that perceived arrogance and overconfidence can annoy people in communication, and can cause backlash. For more on what I mean by courageous vs cowardly content, see this comment. I also spell out the argument more explicitly in this thread.
So it’s a bit of a stretch to say that any AI safety-related discussion or interpersonal interaction that Nate has ever had in any context is automatically topical.
I also think your description of Nate’s decision to delete your comment as “not … allowing people to read negative truths about his own behavior” is somewhat overwrought. Both of the comment threads you linked were widely read and discussed at the time, and this shortform will probably also get lots of eyeballs and attention.
At the very least, there is an alternate interpretation, which is that the comment really was off-topic in Nate’s view, and given the history between the two of you, he chose to block + delete instead of re-litigating or engaging in a back-and-forth that both of you would probably find unpleasant and unproductive. Maybe it would have been more noble or more wise of him to simply let your comment stand without direct engagement, but that can also feel unpleasant (for Nate or others).

Max H 29 Jun 2025 11:50 UTC
6 points
1
in reply to: Stephen Martin’s comment on: Support for bedrock liberal principles seems to be in pretty bad shape these days
I gave YIMBYism as an example of a policy agenda that would benefit from more widespread support for liberalism, not as something I personally support in all cases.
A liberal argument for NIMBYism could be: people are free to choose the level of density and development that they want within their own communities. But they should generally do so deliberately and through the rule of law, rather than through opposition to individual developments (via a heckler’s veto, discretionary review processes that effectively require developers to lobby local politicians and woo random interest groups, etc.). Existing strict zoning laws are fine in places where they already exist, but new laws and restrictions should be wary of treading on the rights of existing property owners, and of creating more processes that increase discretionary power of local lawmakers and busybodies.

Max H 29 Jun 2025 3:15 UTC
5 points
2
in reply to: Gordon Seidoh Worley’s comment on: Support for bedrock liberal principles seems to be in pretty bad shape these days
Hmm, I’m not so pessimistic. I don’t think the core concepts of liberalism are so complex or unintuitive that the median civically engaged citizen can’t follow along given an amenable background culture.
And lots of policy, political philosophy, culture, big ideas, etc. are driven by elites of some form, not just liberalism. Ideas and culture among elites can change and spread very quickly. I don’t think a liberal renaissance requires “wrestling control” of any particular institutions so much as a cultural shift that is already happening to some degree (it just needs slightly better steering IMO).

Max H 28 Jun 2025 22:35 UTC
2 points
0
in reply to: the gears to ascension’s comment on: Support for bedrock liberal principles seems to be in pretty bad shape these days
I don’t personally feel screwed over, and I suspect many of the people in the coalitions I mentioned feel similarly. I am sympathetic to people who do feel that way, but I am not really asking them to unilaterally honor anything. The only thing in my post that’s a real concrete ask is for people who do already broadly support liberalism, or who have preferred policy agendas that would benefit from liberalism, be more outspoken about their support.

(To clarify, I have been using “liberalism” as a shorthand for “bedrock liberalism”, referring specifically to the principles I listed in the first paragraph—I don’t think everything that everyone usually calls “liberalism” is broadly popular with all the coalitions I listed, but most would at least pay lip service to the specific principles in the OP.)

Max H 28 Jun 2025 22:00 UTC
2 points
0
in reply to: the gears to ascension’s comment on: Support for bedrock liberal principles seems to be in pretty bad shape these days
I don’t really agree with the characterization of recent history as people realizing that “liberalism isn’t working”, and to the degree that I would advocate for any specific policy change, I support a “radical incrementalist” approach. e.g. maybe the endpoint of the ideal concept of property rights is pretty far from wherever we are right now, but to get there we should start with small, incremental changes that respect existing rights and norms as much as possible.
So for example, I think Georgism is a good idea in general, but not a panacea, and a radical and sudden implementation would be illiberal for some of the reasons articulated by @Matthew Barnett here.
I think a more realistic way to phase in Georgism that respects liberal principles would mainly take the form of more efficient property tax regimes—instead of complex rules and constant fights over property tax assessment valuations, there would be hopefully slightly less complex fights over land valuations, with phase-ins that keep the overall tax burden roughly the same. Some property owners with relatively low-value property on higher value land (e.g. an old / low density building in Manhattan) would eventually pay more on the margin, while others with relatively high-value property on lower value land (e.g. a newer / high density building in the exurbs) would pay a bit less. Lots of people in the middle of the property-vs-land value spectrum would pay about the same. But this doesn’t really get at the core philosophical objections you or others might have with current norms around the concept of property ownership in general.

Support for bedrock liberal principles seems to be in pretty bad shape these days

Max H28 Jun 2025 20:37 UTC

32 points

52 comments4 min readLW link

Max H 28 Apr 2025 23:24 UTC
4 points
0
in reply to: Wei Dai’s comment on: Wei Dai’s Shortform
I kind of doubt that leaders at big labs would self-identify as being motivated by anything like Eliezer’s notion of heroic responsibility. If any do self-identify that way though, they’re either doing it wrong or misunderstanding. Eliezer has written tons of stuff about the need to respect deontology and also think about all of the actual consequences of your actions, even (especially when) the stakes are high:
The critical question here is: what happens if the plot successfully places the two of them in an epistemic Cooperation-Defection Dilemma, where rather than the two of them just having different goals, Carissa believes that he is mistaken about what happens...

In this case, Carissa could end up believing that to play ‘Defect’ against him would be to serve even his own goals, better than her Cooperating would serve them. Betraying him might seem like a friendly act, an act of aid.
(https://glowfic.com/replies/1874768#reply-1874768)
If he commits to a drastic action he will estimate that actual victory lies at the end of it, and his desperation and sacrifice will not have figured into that estimation process as positive factors. His deontology is not for sale at the price point of failure.
(https://glowfic.com/replies/1940939#reply-1940939)
Starting an AI lab in order to join a doomed race to superintelligence, and then engaging in a bunch of mundane squabbles for corporate control, seems like exactly the opposite of the sentiment here:
For Albus Dumbledore, as for her, the rule in extremis was to decide what was the right thing to do, and do it no matter the cost to yourself. Even if it meant breaking your bounds, or changing your role, or letting go of your picture of yourself. That was the last resort of Gryffindor.
(https://hpmor.com/chapter/93)
Also, re this example:
It also seemingly justifies or obligates Sam Altman to fight back when the OpenAI board tried to fire him, if he believed the board was interfering with his mission.

In general, it seems perfectly fine and normal for a founder-CEO to fight back against a board ouster—no need to bring heroic responsibility into it. Of course, all parties including the CEO and the board should stick to legal / above-board / ethical means of “fighting back”, but if there’s a genuine disagreement between the board and the CEO on how to best serve shareholder interests (or humanity’s interests, for a non-profit), why wouldn’t both sides vigorously defend their own positions and power?

Perhaps the intended reading of your example is that heroic responsibility would obligate or justify underhanded tactics to win control, when the dispute has existential consequences. But I think that’s a misunderstanding of the actual concept. Ordinary self-confidence and agency obligate you to defend your own interests / beliefs / power, and heroic responsibility says that you’re obligated to win without stepping outside the bounds of deontology or slipping into invalid / motivated reasoning.

Max H 6 Apr 2025 16:36 UTC
12 points
−2
on: Max H’s Shortform
Maybe the recent tariff blowup is actually just a misunderstanding due to bad terminology, and all we need to do is popularize some better terms or definitions. We’re pretty good at that around here, right?
Here’s my proposal: flip the definitions of “trade surplus” and “trade deficit.” This might cause a bit of confusion at first, and a lot of existing textbooks will need updating, but I believe these new definitions capture economic reality more accurately, and will promote clearer thinking and maybe even better policy from certain influential decision-makers, once widely adopted.
New definitions:
- Trade surplus: Country A has a bilateral “trade surplus” with Country B if Country A imports more tangible goods (cars, steel, electronics, etc.) from Country B than it exports back. In other words, Country A ends up with more real, physical items. Country B, meanwhile, ends up with more than it started with of something much less important: fiat currency (flimsy paper money) or 1s and 0s in a digital ledger (probably not even on a blockchain!).
  If you extrapolate this indefinitely in a vacuum, Country A eventually accumulates all of Country B’s tangible goods, while Country B is left with a big pile of paper. Sounds like a pretty sweet deal for Country A if you ask me.
  It’s OK if not everyone follows this explanation or believes it—they can tell it’s the good one because it has “surplus” in the name. Surely everyone wants a surplus.
- Trade deficit: Conversely, Country A has a “trade deficit” if it exports more tangible resources than it imports, and thus ends up with less goods on net. In return, it only receives worthless fiat currency from some country trying to hoard actual stuff for their own people. Terrible deal!
  Again, if you don’t totally follow, that’s OK, just pay attention to the word “deficit”. Everyone knows that deficits are bad and should be avoided.
Under the new definitions, it becomes clear that merely returning to the previous status quo of a few days ago, where the US only “wins” the trade war by several hundred billion dollars, is insufficient for the truly ambitious statesman. Instead, the US government should aggressively mint more fiat currency in order to purchase foreign goods, magnifying our trade surplus and ensuring that in the long run the United States becomes the owner of all tangible global wealth.
Addressing second order concerns: if we’re worried about a collapse in our ability to manufacture key strategic goods at home during a crisis, we can set aside part of the resulting increased surplus to subsidize domestic production in those areas. Some of the extra goods we’re suddenly importing will probably be pretty useful in getting some new factories of our own off the ground. (But of course we shouldn’t turn around and export any of that domestic production to other countries! That would only deplete our trade surplus.)

Max H 4 Mar 2025 6:35 UTC
9 points
−3
on: Self-fulfilling misalignment data might be poisoning our AI models
Describing misaligned AIs as evil feels slightly off. Even “bad goals” makes me think there’s a missing mood somewhere. Separately, describing other peoples’ writing about misalignment this way is kind of straw.
Current AIs mostly can’t take any non-fake responsibility for their actions, even if they’re smart enough to understand them. An AI advising someone to e.g. hire a hitman to kill their husband is a bad outcome if there’s a real depressed person and a real husband who are actually harmed. An AI system would be responsible (descriptively / causally, not normatively) for that harm to the degree that it acts spontaneously and against its human deployers’ wishes, in a way that is differentially dependent on its actual circumstances (e.g. being monitored / in a lab vs. not).
Unlike current AIs, powerful, autonomous, situationally-aware AI could cause harm for strategic reasons or as a side effect of executing large-scale, transformative plans that are indifferent (rather than specifically opposed) to human flourishing. A misaligned AI that wipes out humanity in order to avoid shutdown is a tragedy, but unless the AI is specifically spiteful or punitive in how it goes about that, it seems kind of unfair to call the AI itself evil.

Max H

Sup­port for bedrock liberal prin­ci­ples seems to be in pretty bad shape these days

Support for bedrock liberal principles seems to be in pretty bad shape these days