Guardian AI (Misaligned systems are all around us.)

Work done @ SERI-MATS, idea from a conversation with Ivan Vendrov at Future Forum earlier this year.

Misaligned systems are all around us. They are what make me watch another video of a man in filthy shorts building a hut using only tools made from rocks and his own armpit hair. And the reason I have never, ever watched a single episode of Flavourful Origins in isolation. Maybe they make you mindlessly seek cat gifs, or keep you scrolling twitter in a cosy fug of righteous indignation long after you should be asleep. They could also be the reason your uncle is a bit more xenophobic now than he used to be. A bit more dismissive of the ‘snowflakes’. A bit more superior, and less kind.

None of this is new, of course. Advertising and propaganda have been around for a long time. But it feels different now – or at least it does to me. I was born in 1990 – I haven’t got a chance. TikTok’s fabulous algorithm steamrollers my brain. I look up, glazed after an undetermined period of involuntary consumption, wondering what happened and why I feel so hollow. I had to delete the app.

Maybe this isn’t you. I’m prone to flow states of both the deeply productive and malign kinds: good at losing myself in a task, terrible at multitasking. But surely, no-one really looks back at a four-hour TikTok binge and thinks “that was an excellent use of my time”.

(This isn’t to say that personalised recommender systems or targeted ads or compelling news feeds are bad per se – just that they’re probably not very well aligned with your longer term goals.)

So, probably none of the AI systems in your life are optimising hard for your flourishing right now. Money, yes; attention, yes; but long term, real happiness? Probably not.

(Though, I guess there are things like AI-powered fitness or habit-forming apps, etc?)

Could we change that? I’ve been kicking an idea around for a while. It’s not very well formed yet, but it’s something like making wrappers around existing algorithms to shift the optimisation objective. Sick of twitter bickering? A guardian AI could adjust your news feed to give you content that’s more satisfying and informative, and less likely to drag you into pointless arguments. Want to refocus your YouTube recommendations to give you great maths lecture content, without being derailed by popular science videos? Or, stay down with the kids on TikTok without getting lost in it? The guardian wrapper could work to your advantage, retaining the power and joy of these systems while blunting the more pernicious effects. You don’t have to feel frustrated when your app blocker kicks in – you can enjoy your twenty minutes of twitter-time and then get on with your life. Like a sunrise lamp, instead of an alarm clock.
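To make the wrapper idea a little more concrete, here’s a minimal sketch. Everything in it is hypothetical: the item names, the scores, and especially the clean `goal_score` field – in reality the platform’s ranker is a black box and learning a good signal for “serves the user’s actual goals” is the hard part. The point is only the shape of the thing: the guardian doesn’t replace the underlying recommender, it re-ranks the recommender’s output against an objective the user chose.

```python
from dataclasses import dataclass

@dataclass
class Item:
    title: str
    engagement_score: float  # what the platform optimises for
    goal_score: float        # hypothetical: how well this serves the user's stated goals

def base_recommender(items):
    """Stand-in for the platform's black-box ranker: pure engagement."""
    return sorted(items, key=lambda i: i.engagement_score, reverse=True)

def guardian_rerank(items, goal_weight=0.7):
    """Wrap the base ranking and blend in the user's own objective.

    goal_weight=0 reproduces the platform's feed; goal_weight=1
    ignores engagement entirely.
    """
    ranked = base_recommender(items)
    return sorted(
        ranked,
        key=lambda i: (1 - goal_weight) * i.engagement_score
                      + goal_weight * i.goal_score,
        reverse=True,
    )

feed = [
    Item("Outrage thread", engagement_score=0.95, goal_score=0.05),
    Item("Linear algebra lecture", engagement_score=0.40, goal_score=0.90),
    Item("Pop-science short", engagement_score=0.70, goal_score=0.35),
]

for item in guardian_rerank(feed):
    print(item.title)
```

With these made-up numbers, the outrage thread wins under the base ranker but drops to the bottom once the user’s objective gets 70% of the weight. The `goal_weight` knob is doing the “twenty minutes of twitter-time” work: it dials between the platform’s objective and yours, rather than blocking the app outright.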

This idea may have broader applications than just making recommender systems better for you. There are all kinds of subtle interventions a guardian AI could make to protect you and improve your life in a million small ways. I’m sure you can think of lots of them. “You’ve not seen X for a while, you’re both free on Saturday and it’ll be sunny! Shall I book a tennis court?”

Worth exploring? More broadly, this kind of stuff seems like a nice low-stakes-yet-real-world test bed for some important ideas. How to define good proxies for flourishing, satisfaction, et cetera with minimal human input? How to mitigate the effects of misaligned black-box systems that want to hijack our puny human brains?