I think society is very inconsistent about AI risk because the “Schelling point” is that people feel free to believe in a sizable probability of extinction from AI without looking crazy, but nobody dares argue for the massive sacrifices (spending or regulation or diplomacy) which actually fit those probabilities.
The best guess of basically every group surveyed is that there is a 2%-12% chance AI will cause a catastrophe (killing 10% of people or more). At those probabilities, AI safety should be a priority on par with the military!
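To make the comparison concrete, here is a back-of-envelope sketch (my numbers: a world population of roughly 8 billion and the survey’s definition of catastrophe; nothing here comes from the forecasts themselves beyond the 2%-12% range):

```python
# Back-of-envelope expected-value sketch. Assumptions: ~8 billion people alive,
# "catastrophe" = 10% of people killed (the survey's definition).
world_population = 8_000_000_000
catastrophe_deaths = 0.10 * world_population   # ~800 million

for p in (0.02, 0.12):                         # the 2%-12% range of forecasts
    expected_deaths = p * catastrophe_deaths
    print(f"P(catastrophe) = {p:.0%} -> expected deaths ~ {expected_deaths:,.0f}")
# 2%  -> ~16,000,000 expected deaths
# 12% -> ~96,000,000 expected deaths, on the scale of the deadliest wars in history
```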
Yet at the same time, nobody is doing anything about it, because they all observe everyone else doing nothing about it. Each person thinks the reason “everyone else” is doing nothing is that they have figured out good reasons to ignore AI risk. But the truth is that “everyone else” is doing nothing for the same reason they are doing nothing: everyone is just following everyone else.
This “everyone following everyone else” inertia is very slowly changing, as governments start giving a bit of lip service and small amounts of funding to organizations that are half-working on AI Notkilleveryoneism. But this kind of change is slow and tends to take decades, while many AGI timelines are shorter than one decade.
Have you incorporated Government Notkilleveryoneism into your model? Rational people interested in not dying ought to invest in AI safety in proportion to the likelihood that they expect to be killed by AI. Rational governments ought to invest in AI safety in proportion to the likelihood they expect to be killed by AI. But, as we see from this article, what kills governments is not what kills people.
The government of Acheampong is in more danger from a miscalculated letter than from an AI-caused catastrophe that kills 10% of people or more, so long as important people like Acheampong are not among the 10%. The government of Acheampong, as a mimetic agent, should not care about alignment as much as it cares about having a military; governments maintain standing armies for protection against threats from within and without their borders.
In any case, you do not have to hope that a government will irrationally invest in AI alignment. Private investment in AI in 2024 was about $130 billion, so there is plenty of private money that could dramatically increase funding for AI safety. Yet that has not happened. Here are some potential explanations.
1) People do not think there is AI risk because other people do not think there is AI risk (the issue you mentioned).
2) When people must put their money where their mouth is, their revealed preference is that they see no AI risk.
3) People think there is AI risk, but see no way to invest in mitigating AI risk. They would rather invest in flood insurance.
If option 3 is the case, Mr. Lee, you should create AI catastrophe insurance. You can charge a premium akin to that of high-risk life insurance. You can invest some of the revenue in assets you expect to survive an AI catastrophe, to be distributed to policyholders or their next of kin in case of catastrophe, and invest the rest in AI safety. If there is an AI catastrophe, you will be physically prepared. If there is not, you will profit handsomely from the success of your AI safety investments and from the service you would have provided in the counterfactual world. You said yourself that “nobody is doing anything about it.” This is your chance to do something. Good luck. I’m excited to hear how it goes.
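For illustration only, a minimal sketch of how such a policy might be priced; the probability, payout, and split below are hypothetical numbers of mine, not anything implied by the survey or by Mr. Lee:

```python
# Toy pricing sketch for a hypothetical AI-catastrophe policy.
# Every number here is an assumption chosen purely for illustration.
p_catastrophe = 0.05     # assumed annual probability of AI catastrophe
payout = 100_000         # assumed payout per policyholder, held in "survivable" assets
loading = 1.5            # markup over the actuarially fair premium
safety_share = 0.5       # share of the markup routed into AI safety funding

fair_premium = p_catastrophe * payout           # 5,000 per year
premium = fair_premium * loading                # 7,500 per year
to_ai_safety = (premium - fair_premium) * safety_share

print(f"annual premium:           {premium:,.0f}")
print(f"to AI safety per policy:  {to_ai_safety:,.0f}")
```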
:) thank you so much for your thoughts.
Unfortunately, my model of the world is that if AI kills “more than 10%,” it’s probably going to be everyone and everything, so the insurance won’t work according to my beliefs.
I only defined AI catastrophe as “killing more than 10%” because it’s what the survey by Karger et al. asked the participants.
I don’t believe in option 2, because if you asked people to bet against AI risk at unfavourable odds, they probably wouldn’t feel confident enough to take the bet.
I think this particular issue has less to do with public sentiment & more to do with it being the kind of problem whose solutions would inconvenience you today for a better tomorrow.
Like climate change: it is an issue everyone recognizes will massively impact the future negatively (to the point where multiple forecasts suggest trillions of dollars of losses). Still, since fixing it would cause prices of everyday goods to rise significantly and force people to switch to green alternatives en masse, hardly anyone advocates for solutions. News articles get released each year reporting record-high temperatures & natural-disaster rates; people complain that the seasons have been getting more extreme each passing year (for example, speaking from personal experience, the monsoon in Northern India has been really inconsistent for several years now). Yet changes are gradual (a carbon tax is still wildly unpopular, and even though public sentiment around electric cars has mellowed, people still don’t advocate for anti-car measures).
Compare this to Y2K: it was known that the “bug” would be massively catastrophic, and even though it was massively expensive to fix, they did fix it. Why? Because fixing it didn’t substantively affect the lives of common folks.
Though my model is certainly not all-encompassing: the problem of CFCs causing ozone depletion in the upper atmosphere was largely solved, even though the fix very much did impact people’s everyday lives & did cost a lot of money & even required global co-operation. I guess there is a tipping point on the inconvenience-caused vs. perceived-threat graph at which people start mobilizing around an issue.
PS: It’s funny how I ended up at the crux of the original article: you should now be able to apply threshold modelling to this issue, since perceived threat largely depends on how many other people (in your local sphere) are shouting that it needs everyone’s attention.
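For readers who haven’t seen the original article, here is a minimal sketch of a Granovetter-style threshold model (my own illustration with made-up thresholds, not the article’s code), showing both the cascade and how fragile it is to the threshold distribution:

```python
def cascade(thresholds):
    """Fraction of people who eventually speak up, where thresholds[i] is how
    many others must already be speaking up before person i joins in."""
    n = len(thresholds)
    active = [False] * n
    changed = True
    while changed:
        changed = False
        count = sum(active)
        for i in range(n):
            if not active[i] and count >= thresholds[i]:
                active[i] = True
                changed = True
    return sum(active) / n

# Granovetter's classic example: thresholds 0, 1, 2, ..., 99 give a full cascade.
print(cascade(list(range(100))))              # 1.0

# Nudge a single threshold (1 -> 2) and the cascade stalls after one person.
print(cascade([0, 2] + list(range(2, 100))))  # 0.01
```

Whether everyone ends up shouting about a risk can hinge on a handful of people near the bottom of the threshold distribution.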
You’re very right: in addition to people not working on AI risk because they don’t see others working on it, you also have the problem that people aren’t interested in working on future risks to begin with.
I do think that military spending is a form of addressing future risks, and people are capable of spending a lot on it.
I guess military spending doesn’t inconvenience you today, because there are already a lot of people working in the military who would lose their jobs if military spending were reduced. So politicians would actually make more people lose their jobs if they reduced military spending.
Hmm. But now that you bring up climate change, I do think that there is hope… because some countries do regulate a lot and spend a lot on climate change, at least in Europe. And it is a complex scientific topic which started with the experts.
Maybe the AI Notkilleveryone movement should study what worked and failed for the green movement.
That’s an interesting idea. The military would undoubtedly care about AI alignment — they’d want their systems to operate strictly within set parameters. But the more important question is: do we even want the military to be investing in AI at all? Because that path likely leads to AI-driven warfare. Personally, I’d rather live in a world without autonomous robotic combat or AI-based cyberwarfare.
But as always, I will pray that some institution (like the EU) leads the charge & starts instilling it into people’s heads that this is a problem we must solve.
Oops, I didn’t mean we should involve the military in AI alignment. I meant the military is an example of something working on future threats, suggesting that humans are capable of working on future threats.
I think the main thing holding back institutions is that public opinion does not believe in AI risk. I’m not sure how to change that.
To sway public opinion about AI safety, let us consider the case of nuclear warfare—a domain where long-term safety became a serious institutional concern. Nuclear technology wasn’t always surrounded by protocols, safeguards, and watchdogs. In the early days, it was a raw demonstration of power: the bombs dropped on Hiroshima and Nagasaki were enough to show the sheer magnitude of destruction possible. That spectacle shocked the global conscience. It didn’t take long before nation after nation realized that this wasn’t just a powerful new toy, but an existential threat. As more countries acquired nuclear capabilities, the world recognized the urgent need for checks, treaties, and oversight. What began as an arms race slowly transformed into a field of serious, respected research and diplomacy—nuclear safety became a field in its own right.
The point is: public concern only follows recognition of risk. AI safety, like nuclear safety, will only be taken seriously when people see it as more than sci-fi paranoia. For that shift to happen, we need respected institutions to champion the threat. Right now, it’s mostly academics raising the alarm. But the public—especially the media and politicians—won’t engage until the danger is demonstrated or convincingly explained. Unfortunately for the AI safety issue, evidence of AI misalignment causing significant trouble will probably mean it’s too late.
Adding fuel to this fire is the fact that politicians aren’t gonna campaign on AI safety if the corpos in your country don’t want them to & your enemies are already neck-and-neck in AI dev.
In my subjective opinion, we need the AI variant of Hiroshima. But I’m not too keen on this idea, for it is a rather dreadful thought.
Edit: I should clarify what I mean by “the AI variant of Hiroshima.” I don’t think a large-scale inhumane military operation is necessary (as I already said, I don’t want AI warfare). What I mean instead is something that causes significant damage & makes newspaper headlines worldwide. Examples: strong evidence that AI swayed a presidential election one way; a gigantic economic crash caused by a rogue AI (not the AI bubble bursting); millions of jobs being lost in a short timeframe to one revolutionary model, which then snaps because of misalignment; etc. These are still dreadful, but at least no human lives are lost & it gets the point across that AI safety is an existential issue.
I think different kinds of risks have different “distributions” of how much damage they do. For example, the majority of car crashes cause no injuries (only damage to the cars), a smaller number cause injuries, some cause fatalities, and the worst can cause multiple fatalities.
For other risks like structural failures (of buildings, dams, etc.), the distribution has a longer tail: in the worst case very many people can die. But the distribution still tapers off towards greater numbers of fatalities, and people sort of have a good idea of how bad it can get before the worst version happens.
For risks like war, the distribution has an even longer tail, and people are often caught by surprise by how bad it can get.
But for AI risk, the distribution of damage caused is very weird. You have one distribution for AI causing harm due to its lack of common sense, where it might harm a few people, or possibly cause one death. Yet you have another distribution for AI taking over the world, with a high probability of killing everyone, a high probability of failing (and doing zero damage), and only a tiny bit of probability in between.
It’s very very hard to learn from experience in this case. Even the biggest wars tend to surprise everyone (despite having a relatively more predictable distribution).
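A toy simulation (entirely my own construction, with made-up loss numbers) of why experience is so uninformative here: under a heavy-tailed “war-like” loss distribution, a few dozen observed incidents at least hint at the tail, while under the near-bimodal “takeover-like” distribution described above, the historical record is usually silent right up until the worst outcome.

```python
import random

random.seed(1)

def war_like_loss():
    # heavy-tailed but continuous: mostly small events, occasionally huge ones
    return random.lognormvariate(0, 2)

def takeover_like_loss():
    # near-bimodal: assumed 1% chance of total loss, otherwise nothing at all
    return 1_000_000 if random.random() < 0.01 else 0.0

# True expected losses: exp(0 + 2**2 / 2) ~ 7.4 and 0.01 * 1_000_000 = 10,000.
for name, draw, true_mean in [("war-like", war_like_loss, 7.4),
                              ("takeover-like", takeover_like_loss, 10_000)]:
    observed = [draw() for _ in range(50)]      # fifty observed "incidents"
    print(f"{name:13s}  worst observed: {max(observed):9.1f}   "
          f"true expected loss: {true_mean:9.1f}")

# With probability 0.99**50 (about 0.6) the takeover-like sample is all zeros:
# experience says "nothing ever happens" while the true expected loss is huge.
```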
That’s a cool way to frame damage risks, but I think your distribution for AI damage is for ASI, not AGI. I think it’s very reasonable that an AGI-based system may cause the type of damage that I am talking about.
Even if you believe that as soon as we achieve AGI we’ll accelerate to ASI (because AGI by definition is self-improving), it still takes time to train a model, and research is slow. I hope the window b/w AGI & ASI is large enough for such a “Hiroshima event” to occur, so humanity wakes up to the risks of misaligned AI systems.
PS: Sorry for the late response; I was offline for a couple of days.
No need to say sorry for that! On a forum, there is no expectation of receiving a reply. If every reply obligated the recipient to make another reply, comment chains would drag on forever.
You can freely wait a year before replying.
I’m worried that once a “Hiroshima event” occurs, humanity won’t have another chance. If the damage is caused by the AGI/ASI taking over places, then the more power it obtains, the easier it becomes to obtain even more, so it won’t stop at any scale.
If the damage is caused by bad actors using an AGI to invent a very deadly technology, there is a decent chance humanity can survive, but it’s very uncertain. A technology can never be uninvented, and more and more people will know about it.
Or more! (I was delighted to receive this reply.)