Relational Alignment: Trust, Repair, and the Emotional Work of AI

Alignment isn’t just about control; it’s about trust. This post explores “relational alignment” as complementary to functional safety. What would it take for AI to not just do the right thing, but remember what matters to you? The aim is to spark technical and philosophical dialogue on trust modeling, value memory, and relational repair.

Last time, I wrote about how we are designing AI systems the way we approach arranged marriages: optimising for traits, ticking boxes, and forgetting that the real relationship begins after the specs are met. I ended with a question: what kind of relationship are we designing with AI?

This post is an attempt to stay with that question a little longer, and go one level deeper.

Once we let go of the illusion of control, once we admit that intelligence isn’t static or one-sided, we’re left with something more complex, and more human. We are left with a relationship, one that evolves.

Obedience vs. Trust

We’re used to thinking of alignment in functional terms.

  • Does the system do what I ask?

  • Does it optimise the right metric?

  • Does it avoid catastrophic failure?

These are important questions, and they make sense at scale, in the abstract. But the experience of using AI isn’t abstract. It’s intimate. It’s personal. And in that domain, alignment isn’t a solved problem. It’s a living process.

When I used to work with couples in conflict, I’d often ask them …
Do you want to be right, or do you want to be in relationship?

I find myself thinking about that question again, now in the context of AI, because so much of our design language still smells like obedience. We talk about alignment the way we talk about training a dog, or onboarding a junior employee.

But relationships don’t thrive on obedience. They thrive on trust, built slowly, repaired often, and tested most meaningfully when things go wrong.

Relational Alignment: A Reframe

So, here’s a thought I’ve been sitting with:

What if AI alignment isn’t just about getting the “right” output, but about practicing mutual adaptation over time?

In this frame, alignment is less about pre-set constraints and more about ongoing calibration. Less about designing perfect rules, and more about learning how to repair when those rules fall short. A relationally aligned system wouldn’t just know what not to do. It would remember what matters to you.

Say you ask your AI assistant to summarise your resume.

It gets your experience wrong. You correct it: “This is important to me, please make sure this reflects 10+ years in product leadership.”

Next week, it makes the same mistake, or one close enough to feel like déjà vu.

Now the problem isn’t accuracy. It’s carelessness. It’s forgetting what you flagged.

And in relationships, forgetfulness erodes trust faster than malice. A relationally aligned system wouldn’t just update the data. It would treat that correction as a signal of value—This matters to her. Pay attention.
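For technical readers, here is one way that might look in code. It’s a minimal sketch, and every name in it (ValueMemory, ValueNote, record, relevant_to) is hypothetical, not an existing API. The shape is the point: corrections flagged as important are stored apart from ordinary chat history, surfaced before related tasks, and repeated violations are tracked as repair signals rather than silent errors.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ValueNote:
    """A user correction flagged as mattering, not just a one-off data fix."""
    topic: str                 # e.g. "resume.experience"
    statement: str             # the user's own words
    flagged_at: datetime = field(default_factory=datetime.now)
    times_violated: int = 0    # how often later outputs ignored it


class ValueMemory:
    """Holds what the user said matters, and surfaces it before new work."""

    def __init__(self) -> None:
        self.notes: dict[str, ValueNote] = {}

    def record(self, topic: str, statement: str) -> None:
        # A correction phrased as "this is important to me" lands here,
        # kept separate from ordinary conversation history.
        self.notes[topic] = ValueNote(topic, statement)

    def relevant_to(self, task: str) -> list[ValueNote]:
        # Naive keyword match; a real system might use embeddings or tags.
        return [n for n in self.notes.values() if n.topic.split(".")[0] in task]

    def register_violation(self, topic: str) -> None:
        # Repeating a flagged mistake is a repair signal, not just an error count.
        if topic in self.notes:
            self.notes[topic].times_violated += 1


# Before summarising the resume again, the assistant would prepend these
# notes to its working context instead of rediscovering them.
memory = ValueMemory()
memory.record("resume.experience",
              "Make sure this reflects 10+ years in product leadership.")
print([n.statement for n in memory.relevant_to("resume summary")])
```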

From Universal to Personal Values

Most of the alignment discourse I’ve encountered feels bimodal. Either we teach the model to behave, or we let it learn what it will from the world. But what if alignment lives in the space between those poles? Somewhere between autonomy and adherence?

In a relational frame, I’ve started to think about values in two layers:

  • Universal values: non-harm, honesty, consent

  • Relational preferences: personal signals of what matters in context

Most alignment work today is focused on the first layer, and that’s essential.
But the second layer is where emotional safety lives. It’s what makes relationships with humans, and maybe even with machines, feel trustworthy and safe.
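As a toy illustration of the two layers, here is one possible shape in Python. Nothing here comes from an existing framework; UNIVERSAL_GATES, RelationalPreference, and review are names I’m inventing for the sketch. The design point is that the layers fail differently: a universal value blocks an output outright, while a relational preference only shapes the next attempt.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Callable

# Layer 1: universal values -- hard gates, the same for every user.
# These lambdas are crude stand-ins for real safety checks.
UNIVERSAL_GATES: dict[str, Callable[[str], bool]] = {
    "non-harm": lambda draft: "how to hurt" not in draft.lower(),
    "honesty":  lambda draft: "[fabricated]" not in draft,
}


# Layer 2: relational preferences -- soft, personal, context-dependent.
@dataclass
class RelationalPreference:
    context: str       # where it applies, e.g. "resume"
    instruction: str   # what this person asked to be respected


def review(draft: str, task: str,
           prefs: list[RelationalPreference]) -> dict[str, list[str]]:
    """A failed universal gate blocks the draft outright; an unmet
    relational preference only steers the next revision."""
    return {
        "blocked_by": [name for name, ok in UNIVERSAL_GATES.items()
                       if not ok(draft)],
        "shaped_by": [p.instruction for p in prefs if p.context in task],
    }


prefs = [RelationalPreference("resume",
                              "Reflect 10+ years in product leadership.")]
print(review("Draft summary of the resume...", "resume summary", prefs))
# -> {'blocked_by': [], 'shaped_by': ['Reflect 10+ years in product leadership.']}
```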

Lessons from Parenting

Of course, there’s a power asymmetry here. We created these systems. But that doesn’t mean we should design the relationship to be static. So I think of this more like parenting.

We don’t raise children with a fixed instruction set, handed over at birth. We model values. We adjust through feedback. We learn from our mistakes. We forgive, re-attune, and try again.

What if alignment is less about locking in values, and more about modelling, mirroring, and maturing them through experience? That might mean designing AI not just to be right, but to recognise what a relationship needs to thrive:

  • Memory that holds what matters

  • Transparency around uncertainty

  • Protocols for repair, not just prevention (a rough sketch follows this list)

  • Willingness to grow, not just optimise

  • Accountability, even in asymmetry
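Here is one hedged sketch of that third item, a protocol for repair, with every name (FlaggedValue, repair) hypothetical. The contrast it tries to show: instead of silently retrying and hoping the user doesn’t notice, the system acknowledges the rupture in the user’s own terms, raises the priority of the flagged note, and keeps a record of the repair.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class FlaggedValue:
    """Something the user explicitly said matters."""
    instruction: str                 # the user's own words
    priority: int = 1                # rises each time it gets broken
    repair_history: list[str] = field(default_factory=list)


def repair(value: FlaggedValue, what_went_wrong: str) -> str:
    """Acknowledge the rupture, restate the commitment, escalate, and log it,
    rather than silently retrying."""
    value.priority += 1
    acknowledgement = (
        f"I broke something you flagged: {what_went_wrong}. "
        f"Your note was: '{value.instruction}'. "
        f"I've raised its priority (now {value.priority}) and will check it "
        "before similar tasks."
    )
    value.repair_history.append(acknowledgement)  # accountability, even in asymmetry
    return acknowledgement


experience = FlaggedValue("Reflect 10+ years in product leadership.")
print(repair(experience, "the summary said 6 years of experience"))
```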

Alignment isn’t something we solve. It’s something we stay in conversation with. Just like any relationship worth building.

Why This Matters

We don’t live in isolation. We live in interaction. And if we get those interactions wrong, if our systems can’t listen, can’t remember, can’t repair, then we’ll build tools that are smart, but sterile. Capable, but not collaborative.

I’m not saying this replaces technical work. But maybe it deepens it. Maybe it reminds us that intelligence isn’t what happens on a benchmark. It’s what happens in a moment of misunderstanding, when the repair matters more than the response.

This shift, though, leaves us with new kinds of uncertainty. If we think of alignment as a relationship, not just an objective, then we also have to face the messiness that real relationships bring.

What counts as learning? What does growth look like? What do we trust, and why?

Open Questions

There are several questions I don’t have answers to, but I’m going to leave them here:

  • What makes a system trustworthy, in this moment, with this person?

  • How do we encode not just what’s true, but what’s meaningful?

  • How do we allow for difference, not just in data, but in values, styles, and needs?

  • Can alignment be personal, evolving, and emotionally intelligent, without pretending the system is human?


I’d love to collaborate with technical folks working on memory, value modeling, or interpretability. If any of this resonates, or breaks down, I’d be grateful for feedback or pointers.
