Gotcha, that is indeed meaningfully different. I still don’t agree with this alignment-by-induction story, and I want to say that the standard Yudkowskian model is watertight enough that your story must implicitly violate one of its assumptions. But I am still trying to pinpoint just which assumption that is.
Edit: also, this whole alignment-by-induction thing feels like a big load-bearing implicit assumption in this post.