I think there is a bit of a rhetorical issue here with the necessity argument: I agree that a powerful program aligned to a person would have an accurate internal model of that person, but I think this is true by default whenever a powerful, goal-seeking program interacts with a person. It's just one of the default instrumental subgoals, not alignment-specific.
There's a difference between building a model of a person and using that model as a core element of your decision-making algorithm. So what you're describing seems even weaker than weak necessity.
However, I agree that some of the ideas I've sketched are pretty loose. I'm only trying to provide a conceptual frame and work out some of its implications.