Wei Dai comments on Legible vs. Illegible AI Safety Problems

Wei Dai 5 Nov 2025 1:06 UTC
LW: 12 AF: 4
Thanks! Assuming it is actually important, correct, and previously unexplicated, it’s crazy that I can still find a useful concept/argument this simple and obvious (in retrospect) to write about, at this late date.
- xpym 6 Nov 2025 11:05 UTC
  2 points
  I’m surprised that you’re surprised. To me you’ve always been a go-to example of someone exceptionally good at both original seeing and taking weird ideas seriously, which isn’t a well-trodden intersection.
  - Wei Dai 6 Nov 2025 23:00 UTC
    3 points
    I elaborated a bit more on what I meant by “crazy”: https://www.lesswrong.com/posts/PMc65HgRFvBimEpmJ/legible-vs-illegible-ai-safety-problems?commentId=x9yixb4zeGhJQKtHb.
    
    And yeah, I do have a tendency to take weird ideas seriously, but what’s weird about the idea here? That some kinds of safety work could actually be harmful?
    - xpym 7 Nov 2025 12:26 UTC
      1 point
      Nah, the weird idea is AI x-risk, something that almost nobody outside of the LW-sphere takes seriously, even if some labs pay lip service to it.