Both direct and transitive alignment are valuable concepts, especially for LLM AGIs, which I think are the only feasible directly aligned AGIs we are likely to build, but which I suspect won’t be transitively aligned by default.
Since transitive alignment varies among humans (different humans have different inclinations toward building AGIs of uncertain alignment, given the capability to do so), it might be valuable to align LLM personalities to become people who are less likely to fail transitive alignment.