Bogdan Ionut Cirstea comments on Daniel Kokotajlo’s Shortform

Bogdan Ionut Cirstea 4 Feb 2025 22:53 UTC
LW: 4 AF: 2
0
AF
I also think it’s important to notice how much less scary, and how much more likely to be easy to mitigate (at least strictly when it comes to technical alignment), this story seems than the scenarios from 10 or so years ago, e.g. from Superintelligence, from before LLMs, when pure RL seemed like the dominant paradigm for getting to AGI.
- Daniel Kokotajlo 4 Feb 2025 23:30 UTC
  LW: 2 AF: 2
  0
  AF Parent
  I don’t think it’s that much better actually. It might even be worse. See this comment: