lberglund comments on All AGI Safety questions welcome (especially basic ones) [~monthly thread]

lberglund 5 Nov 2022 17:35 UTC
1 point
I think the issue is that the ELK head (the system responsible for eliciting another system’s latent knowledge) might itself be deceptively aligned. So if we don’t solve deceptive alignment, our ELK head won’t be reliable.