Gurkenglas comments on Charbel-Raphaël’s Shortform

Gurkenglas 14 Oct 2025 7:50 UTC
9 points
6
I infer they didn’t get “The most forbidden technique”. Try again with e.g. “Never train an AI to hide its thoughts.”?
- mattmacdermott 16 Oct 2025 15:00 UTC
  2 points
  0
  Yeah, I think “training for transparency” is fine if we can figure out good ways to do it. The problem is more that training for other stuff (e.g. lack of certain types of thoughts) pushes against transparency.