I wrote in more depth about why I think they solve themselves in the post I linked, but I would say:
First of all, I’m assuming that if we get intent alignment, we’ll get CEV alignment shortly after. Because people want the AI aligned to their CEV, not their immediate intent. And solving CEV should be within reach for an intent-aligned ASI. (Here, by CEV alignment I mean “aligning an AI to the values of a person or collection of people that they’d endorse upon learning all the relevant facts and reflecting”.)
If you agree with this, I can state the reason I think these issues will “solve themselves” / become “merely practical/political” by asking a question: You say animal suffering will not be solved, because most people seem not to care about animal suffering. If this is the case, why should they listen to you? Why should they agree there is a problem here? If the reason they don’t care about animal suffering is that they incorrectly extrapolate their own values (are making a mistake from their own perspective), then their CEV would fix the problem, and the ASI optimizing their CEV would extinguish the factory farms. If they’re not making a mistake, and the coherent extrapolation of their own values says factory farming is fine, then they have no reason to listen to you. And if this is the case, it’s a merely political problem: you (and I) want the animals not to suffer, and the way we can ensure this is by ensuring that our (meaning Adria’s and William’s) values have strong enough weight in the value function the AI ends up optimizing for. This problem is of the same character as trying to get a certain party enough votes to pass a piece of legislation. It’s hard and complicated, but not in the way a lot of philosophical and ethical questions are hard and complicated.
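To make the “strong enough weight in the value function” framing concrete, here is a minimal, purely illustrative sketch (my own formalization, not anything from the linked post): assume the AI optimizes a weighted sum of per-person value functions, so which outcome wins is settled by the weights, i.e. by politics, rather than by any further ethical argument. All names below (aggregate_value, the example parties and outcomes) are hypothetical.

```python
# Purely illustrative: the AI optimizes a weighted sum of per-person value
# functions, so the outcome depends on the weights (a political question),
# not on any further ethical argument. All names here are hypothetical.

from typing import Callable, Dict

Outcome = str  # stand-in for a full description of a world-state


def aggregate_value(
    values: Dict[str, Callable[[Outcome], float]],  # each party's (CEV) value function
    weights: Dict[str, float],                       # how much weight each party's values get
) -> Callable[[Outcome], float]:
    """Return the value function the AI actually ends up optimizing."""
    return lambda outcome: sum(
        weights[name] * value_fn(outcome) for name, value_fn in values.items()
    )


# Toy example: whether factory farming persists is decided purely by the weights.
values = {
    "adria_and_william": lambda o: 1.0 if o == "no factory farms" else 0.0,
    "indifferent_majority": lambda o: 1.0 if o == "factory farms stay" else 0.0,
}

for majority_weight in (0.9, 0.4):
    v = aggregate_value(
        values,
        {"adria_and_william": 1.0 - majority_weight, "indifferent_majority": majority_weight},
    )
    best = max(["no factory farms", "factory farms stay"], key=v)
    print(f"majority weight {majority_weight}: AI picks '{best}'")
```

The point of the sketch is just that, on this framing, changing the outcome is a matter of shifting the weights (analogous to winning votes), not of resolving a philosophical disagreement.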
I could address the other points, but I think I’d be repeating myself. A CEV-aligned AI would not create a world where you’re sad (in the most general sense of the world not being a way you’d endorse in your wisest, most knowledgeable state, after the most thorough reflection) because you think everything is meaningless. It’d find some way to solve the meaninglessness problem. Unless the problem is genuinely unsolvable, in which case we’re just screwed and our only recourse is not building the ASI in the first place.
Because people want the AI aligned to their CEV, not their immediate intent
I think this is debatable. Many won’t want to change at all.
You say animal suffering will not be solved, because most people seem not to care about animal suffering. If this is the case, why should they listen to you?
Well, they won’t agree, but the animals would. This is tyranny of the majority (or of the oligarchy, of humans over animals). They would dispute the premise. I agree that at some point this is a pure exercise of power; some values are just incompatible (e.g. a dictatorship of each individual person).
Unless the problem is genuinely unsolvable, in which case we’re just screwed
Yeah, I guess it has this fatal inevitability to it, no? The only way around is prohibition, which won’t happen because the incentives are too great. One faction can divert the torrent of history slightly, but not stop it entirely.
I defined what I meant by CEV in a way that doesn’t entail “changing”.