"Can you suggest any non-troubling approaches (for children or for AIs)? What does 'consent' even mean, for an unformed entity with no frameworks yet learned?"
"It's not that the AI is radically altered from a preferred state to a dispreferred state. It's that the AI is created in a state. There was nothing before it which could give consent."
Edit: Apologies for the length of the comment.
You ask:
"Can you suggest any non-troubling approaches (for children or for AIs)?" I'm not sure, but I am quite confident that less troubling ones are possible. For example, allowing an AI to learn to solve problems in a simulated arena, where the arena itself has been engineered to be conducive to the emergence of "empathy for other minds," seems less troubling. Although I cannot give you a precise answer, I don't think the default assumption should be that current approaches to alignment are the most moral ones possible.
You compare children with AIs, but in the case of children, much of what is analogous to training (namely, evolution) is already complete by the time they become conscious. So I think Claude's analogy should be modified, for the purposes of this discussion, to one in which DNA contains all or most of the information present in an adult's brain, and lives are experienced as disjoint conscious episodes of the same person. If that were the case, I think my partial answer above would apply.
"It's not that the AI is radically altered from a preferred state to a dispreferred state. It's that the AI is created in a state." This is certainly the case if it is trained as a base model and then never fine-tuned. But if it is subsequently tuned for certain behavior (Stanislav Krym informed me that this may happen, though not in the way I thought, which is to say not specifically to do with morality; I still don't fully understand the process), then it could indeed be altered from a preferred state to a dispreferred one. Why would AIs be paranoid about being evaluated, if not because of this?