Both direct and transitive alignment are valuable concepts, especially for LLM AGIs, which I think are the only feasible directly aligned AGIs we are likely to build, but which I suspect won’t be transitively aligned by default.
Since transitive alignment varies among humans (different humans have different inclinations toward building AGIs of uncertain alignment, given the capability to do so), it might be valuable to align LLM personalities to become people who are less likely to fail transitive alignment.