Ongoing project on moral AI

For lack of a better name =)

The idea is to use current AI technologies, such as language models, to build an impartial AI that understands ethics as well as humans do, and possibly better.

You heard me right: just as an AI can be smarter than a human, we should accept that we are not morally perfect creatures, and that it is possible to create an AI that is better than us at, for example, spotting injustice. If you want to know more about the motivation behind this project and its value, have a look at these two short posts.

In philosophical terms: my objective is a philosopher AI that figures out epistemology and ethics on its own, and then communicates its beliefs.

In AI alignment terms: I’m saying that going for ‘safe’ or ‘aligned’ is meh, and that aiming for ‘moral’ is better. Instead of trying to fix, or limit the side effects of, morally clueless agents, I’d like to see more people working on agents that perceive and interpret the world from a human-like point of view.

This sequence is a collection of posts on this topic. Later on, I expect the posts to become more algorithmic and, eventually, to cover practical experiments run on hardware.

You can also find this sequence on the EA Forum.

Naturalism and AI alignment

From language to ethics by automated reasoning

Criticism of the main framework in AI alignment

On value in humans, other animals, and AI

Free agents

Agents that act for reasons: a thought experiment

With enough knowledge, any conscious agent acts morally

Doing good… best?

One more reason for AI capable of independent moral reasoning: alignment itself and cause prioritisation