I agree that EY’s earlier attempts at advocating for AI safety were not the best and were often counterproductive, but I think you are underestimating just how hard it is to communicate these ideas to a lay audience. For example, I tried to discuss this topic with a few friends from my university’s IT faculty. They are NOT lay people and they have background knowledge, and yet, despite my having studied this subject for over a year, I was unable to get my point across.
Talking to people is a skill that needs training. You can’t expect someone, no matter how smart or knowledgeable they are, to come out of the gate swinging with maximum charisma; some mistakes need to be made.
EY has improved over the last year; his latest appearance on Robinson Erhardt was a major improvement.
I agree that no matter how smart or knowledgeable someone is, it’s rare to come out of the gate with perfect communication skills. And I agree these ideas are genuinely hard to convey to non-experts.
That said, my intuition is that the risk of AGI is better communicated through distilled versions of the core arguments, like instrumental convergence and the orthogonality thesis, rather than via anthropomorphic or futuristic metaphors.
For example, I recently tried to explain AGI risk to my dad. I started with the basics: the problem of misaligned AGI, current alignment limitations, and how optimisation at scale could lead to unintended consequences. But his takeaway was essentially: “Sure, it’s risky to give powerful tools to people with different values than mine, but that’s not existential.”
I realised I hadn’t made the actual point. So I clarified: the danger isn’t just in bad actors using AI, but in AIs themselves pursuing goals that are misaligned with human values, even if no humans are malicious.
I used a simple analogy: “If you’re building a highway and there’s an anthill in the way, you don’t hate the ants—you just don’t care.” That helped. It highlighted two things: (1) we already treat beings with different moral weight indifferently, and (2) AIs might do the same, not out of malice but out of indifference shaped by their goals.
My point isn’t that analogies are always bad. But they work best when they support a precise concept, not when they replace it. That’s why I think Yudkowsky’s earlier communication sometimes backfired: it leaned too hard on the metaphor and not enough on the underlying logic (which definitely exists; I hold his work in high regard). I’ll check out the Robinson Erhardt interview, though; if it marks a shift in tone, that’s good to hear.