You’re correct, but since I define “aligned” as “tending to do what is actually best according to humanity’s value system”, and given that taking such a risk would be harmful, a totally aligned AGI would not, in fact, take that risk lol. So although your addition is important to note, there’s a sense in which it is redundant.
Both direct and transitive alignment are valuable concepts, especially for LLM AGIs, which I think are the only kind of directly aligned AGI we are likely to build, but which I suspect won’t be transitively aligned by default.
Since transitive alignment varies among humans (different humans have different inclinations towards building AGIs of uncertain alignment, given the capability to do so), it might be valuable to align LLM personalities to become people who are less likely to fail transitive alignment.