> But the thing I’m most worried about is companies succeeding at “making solid services/products that work with high reliability” without actually solving the alignment problem, and then it becomes even more difficult to convince people there even is a problem as they further insulate themselves from anyone who disagrees with their hyper-niche worldview.
The way I see it, “making solid services/products that work with high reliability” is solving a lot of the alignment problem. As in, this can get us very far into making AI systems do a lot of valuable work for us with very low risk.
I imagine that you’re using a more specific definition of it than I am here.
> The way I see it, “making solid services/products that work with high reliability” is solving a lot of the alignment problem.
Funny, I see “high reliability” as part of the problem rather than part of the solution. If a group is planning a coup against you, then your situation is better, not worse, if the members of this group all have dementia. And you can tell whether or not they have dementia by observing whether they’re competent and cooperative and productive before any coup has started.
If the system is not the kind of thing that could plot a coup even if it wanted to, then it’s irrelevant to the alignment problem, or at least to the most important part of the alignment problem. E.g. spreadsheet software and bulldozers likewise “do a lot of valuable work for us with very low risk”.
> I imagine that you’re using a more specific definition of it than I am here.
I might be. I might also be using a more general definition. Or just a different one. Alas, that’s natural language for you.
> …very far into making AI systems do a lot of valuable work for us with very low risk.
I agree, but I feel it’s important to note that the risk is only locally low. Globally, I think the risk is catastrophic.
The biggest difference in our points of view might be that I think the systems we are using to control what happens in our world (markets, governments, laws) are already misaligned and heading towards disaster. If we allow them to continue getting more capable, they will not suddenly become capable enough to get back on track, because they were never aligned to target human-friendly preferences in the first place. Rather, they target proxies, and capabilities have gone beyond the point where those proxies are articulate enough for good outcomes. We need to switch focus from capabilities to alignment.
Thanks for the clarification.