I think this is a very important point. Seems to be a common unstated crux, and I agree that it is (probably) correct.
Thanks! Assuming it is actually important, correct, and previously unexplicated, it’s crazy that I can still find a useful concept/argument this simple and obvious (in retrospect) to write about, at this late date.
I’m surprised that you’re surprised. To me you’ve always been a go-to example of someone exceptionally good at both original seeing and taking weird ideas seriously, which isn’t a well-trodden intersection.
I elaborated a bit more on what I meant by “crazy”: https://www.lesswrong.com/posts/PMc65HgRFvBimEpmJ/legible-vs-illegible-ai-safety-problems?commentId=x9yixb4zeGhJQKtHb.
And yeah I do have a tendency to take weird ideas seriously, but what’s weird about the idea here? That some kinds of safety work could actually be harmful?
Nah, the weird idea is AI x-risk, something that almost nobody outside of the LW-sphere takes seriously, even if some labs pay lip service to it.