dscft comments on Jay Bailey’s Shortform

dscft 1 Nov 2025 10:00 UTC
1 point
One would hope that progress in interpretability will allow us to do much more refined “mind control” than adding difference vectors.
- Jay Bailey 1 Nov 2025 11:31 UTC
  2 points
  One would hope indeed. But even so, we now know that this is likely to be the kind of action that could be detected and opposed. And since I didn’t predict in advance that this would happen, especially at this capability level, the update for me is that it’s going to be significantly harder than I had hoped.