I sympathize somewhat with this complexity point, but I’m worried that training will be extremely non-Bayesian in a way that makes complexity arguments break down. So I feel like the point about optimization power at best cuts the worry about hyperstition by about a factor of 2. Perhaps there should be research on how “sticky” biases from early in training are in the face of later optimization pressure.
Mia & co at CLR are currently doing some somewhat related research, if I understand correctly.