TristanTrim comments on 6 reasons why “alignment-is-hard” discourse seems alien to human intuitions, and vice-versa

TristanTrim 9 Dec 2025 20:09 UTC
6 points

I imagine that you’re using a more specific definition of it than I am here.

I might be. I might also be using a more general definition. Or just a different one. Alas, that’s natural language for you.

very far into making AI systems do a lot of valuable work for us with very low risk.

I agree, but I feel it’s important to note that the risk is only locally low. Globally, I think the risk is catastrophic.

I think the biggest difference in our POV might be that I think the systems we use to control what happens in our world (markets, governments, laws) are already misaligned and heading towards disaster. If we allow them to keep getting more capable, they will not suddenly become capable enough to get back on track, because they were never aligned to target human-friendly preferences in the first place. Rather, they target proxies, but capabilities have gone beyond the point where those proxies are articulate enough for good outcomes. We need to switch focus from capabilities to alignment.