People are going to try to make LLMs engage in power-seeking, such as by setting up a loop that invokes a power-seeking simulacrum and does whatever it commands. It is currently unclear how much they will succeed. If they do succeed, a lot of the classical power-seeking discussion will apply to the resulting systems; if they don't, LLMs are presumably not the path to AGI.
They’re already trying (look up ChaosGPT, though that’s mostly a joke). But my question is more about what changes relative to the misalignment problems familiar from gradient-descent-trained systems. For example, is it easier or harder for the simulacrum to align its own copy running on a more powerful underlying model?