The Benevolent Ruler’s Handbook (Part 2): Morality Rules
Read Part 1 for more background.
Let’s get on track
If we’re going to understand what constitutes good policy, we need to understand what constitutes a good moral theory. We can gain some insights from looking at the practices of moral philosophy. Yes, you knew damn well that when we started this, we would talk about the trolley problem.
So you’re in class, and your philosophy professor asks why you chose to pull the lever. You say “More lives are better than fewer”, and then they hit you with the Surgeon variant of the problem. Why do they do that? What are we achieving when we pose moral dilemmas?
Pictured: My partner explaining why reality TV is enjoyable The Good Place’s take on the trolley problem
Your professor is, for each dilemma, getting you to state your intuitive moral preferences over two specified timelines (one in which you pull the lever, and one in which you don’t). Then you’re supposed to state a rule that consistently represents your moral preferences over all possible dilemmas. And if your proposed rule fails to account for the next contrived dilemma your professor throws at you, you’ve done something wrong. It’s essentially the same as scientific falsification, except we use thought experiments.
People in the philosophy biz often get away with questions that no one else can ask
Now you might wonder, if moral intuitions arbitrate which moral rules are “correct”, why not just use moral intuition to evaluate these dilemmas? What is the point of a moral rule? Well, relying on intuition alone runs into several problems:
Unless we’re born with moral intuitions that are perfectly accurate, we’re going to have some priors that are wrong.
Sometimes our moral priors are outright contradictory. For example, “Violence is never moral” and “Culture determines morality” contradict each other, because a culture can be violent.
Sometimes our stated justifications for our moral intuitions are contradictory, but those intuitions could be represented with a non-contradictory moral rule.
Often, our intuitive ranking will be incomplete, which makes moral decisions difficult even if we could perfectly predict the consequences.
Finding a moral rule helps us to correct poor moral intuitions, explain our good moral intuitions with more clarity, and guide us where our intuitions don’t exist.
Thinking of moral rules as summaries of moral preferences over timelines works not just for consequentialist theories, but for any moral theory that meets some fairly general criteria (namely, completeness and transitivity).
Even the really dumb ones
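For reference, here is how those two criteria are usually written down for a weak preference relation. The symbols are notation I'm assuming for this sketch, not something defined earlier in the series: T is the set of timelines, and a ⪰ b means "a is at least as good as b".

```latex
\begin{align*}
\text{Completeness:}\quad & \forall a, b \in T:\; a \succeq b \ \text{ or } \ b \succeq a \\
\text{Transitivity:}\quad & \forall a, b, c \in T:\; (a \succeq b \ \text{ and } \ b \succeq c) \implies a \succeq c
\end{align*}
```

Completeness says any two timelines can be compared, and transitivity says the comparisons never loop back on themselves.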
How are we going to link this to goals?
Now this part is extremely important. If we’re clever about how we represent each timeline, and how we represent each goal, we can mathematically connect goals to morality. The most elegant representation, if I do say so myself, is the following.
Let each timeline be the set containing all statements that are empirically true at all times and places within that timeline. For example, “[Unique identifier for you] is reading [unique identifier for this sentence]” would not be in the representation of the timeline that we exist in, because once you begin the next sentence, it’s no longer true. Instead, the statement would have to say something like “[Unique identifier for you] read [unique identifier for that sentence] during [the time period at which you read that sentence]”. For brevity’s sake, the remaining example statements won’t contain unique identifiers or time periods, as the precise meaning of the statements should be clear from the context.
What? I see this wordy explanation as an absolute win.
Now if you’re a stickler, and I know some of you are (yes, I’m pointing at you), you’re probably thinking “But we can’t actually talk about timelines, because at best, we only make educated guesses about the future.” So let’s incorporate uncertainty by using “timeline forecasts”.
Let’s represent these forecasts as the set of all predictions that we would make about a particular timeline right now. For example, the pull-the-lever forecast contains the statement “The five live, with a credence of 1”. Note that each forecast can roughly be thought of as a probability distribution over timelines.
(Why are they effectively probability distributions? Each timeline contains all statements that are true about itself. So if “You pull the lever”, “The five live”, and “The one dies” are within a timeline, you can string these into a single, larger statement that is also in that timeline: “You pull the lever, and the five live, and the one dies”. So, each timeline contains a very large statement that uniquely identifies it within any finite set of timelines. And that combined statement, amended with an associated credence, will be within our timeline forecast.)
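If it helps to see that in code, here is a minimal sketch of the idea, under some loud assumptions: timelines are cut down to a handful of statements (real ones would be enormous), the 0.75/0.25 split is a made-up toy number, and `forecast_from_distribution` is a hypothetical helper name of mine.

```python
from typing import Dict, FrozenSet

Timeline = FrozenSet[str]      # the statements that are true in one timeline
Forecast = Dict[str, float]    # statement -> credence


def forecast_from_distribution(dist: Dict[Timeline, float]) -> Forecast:
    """Credence of a statement = total probability of the timelines containing it."""
    forecast: Forecast = {}
    for timeline, probability in dist.items():
        for statement in timeline:
            forecast[statement] = forecast.get(statement, 0.0) + probability
    return forecast


# Toy assumption: the lever works in three out of four timelines.
lever_works = frozenset({"You pull the lever", "The five live", "The one dies"})
lever_jams = frozenset({"You pull the lever", "The five die", "The one lives"})

pull_forecast = forecast_from_distribution({lever_works: 0.75, lever_jams: 0.25})
print(pull_forecast["You pull the lever"])  # true in both timelines
print(pull_forecast["The five live"])       # true only where the lever works
```

Going from a distribution to a forecast is the easy direction. The paragraph above is the argument for why a forecast also pins down the distribution.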
So, why is this particular representation so useful?
Linking the pieces together
This representation allows us to link moral rules to forecasts to “goalsets”.
Pictured: Me explaining literally anything
We’ll say that a moral rule chooses the best forecasts. I.e. it’s a function such that
the input is any set of forecasts, and
the output is the set of all forecasts in the input set that are weakly preferred to every forecast in that set.
As for goalsets, they can be “satisfied” by particular forecasts (i.e. your goalset is a subset of that forecast). For example, let’s say your goalset in the trolley problem contains only one statement: “There was no way to save more people, with a credence of 1”. Then only the pull-the-lever forecast satisfies your goalset.
The last piece of this is your “empirical model”, which tells you what your forecast would be if you attempted a given “plan”. (I.e. it’s a function that maps a plan to a forecast.)
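Here is a rough sketch of how the three pieces could fit together for the trolley problem. Everything concrete in it is an assumption for illustration: the function names, the two toy forecasts, and especially the stand-in preference ("more expected survivors is weakly better"), which is just one possible moral rule and not the one this series argues for.

```python
from typing import Callable, FrozenSet, Set, Tuple

Prediction = Tuple[str, float]       # (statement, credence)
Forecast = FrozenSet[Prediction]
Goalset = FrozenSet[Prediction]


def satisfies(forecast: Forecast, goalset: Goalset) -> bool:
    """A forecast satisfies a goalset when the goalset is a subset of it."""
    return goalset <= forecast


def make_moral_rule(weakly_prefers: Callable[[Forecast, Forecast], bool]):
    """Turn a weak preference relation into a function that picks the best forecasts."""
    def moral_rule(options: Set[Forecast]) -> Set[Forecast]:
        return {f for f in options if all(weakly_prefers(f, g) for g in options)}
    return moral_rule


# Hypothetical, heavily truncated forecasts for the two plans.
pull = frozenset({("The five live", 1.0), ("The one dies", 1.0),
                  ("There was no way to save more people", 1.0)})
stay = frozenset({("The five die", 1.0), ("The one lives", 1.0)})


def empirical_model(plan: str) -> Forecast:
    """Maps a plan to the forecast we would make if we attempted it."""
    return {"pull the lever": pull, "do nothing": stay}[plan]


# Stand-in preference (an assumption): compare expected survivors.
SURVIVORS = {"The five live": 5, "The one lives": 1}


def expected_survivors(forecast: Forecast) -> float:
    return sum(SURVIVORS.get(s, 0) * credence for s, credence in forecast)


def prefers(a: Forecast, b: Forecast) -> bool:
    return expected_survivors(a) >= expected_survivors(b)


moral_rule = make_moral_rule(prefers)
options = {empirical_model(plan) for plan in ("pull the lever", "do nothing")}
best = moral_rule(options)

goalset = frozenset({("There was no way to save more people", 1.0)})
print(best == {pull})             # the rule picks the pull-the-lever forecast
print(satisfies(pull, goalset))   # and that forecast satisfies the goalset
print(satisfies(stay, goalset))   # the do-nothing forecast does not
```

Note that `moral_rule` only ever sees forecasts, and `satisfies` only ever compares sets. The empirical model is the one place where plans enter the picture.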
Wait, why goalsets? Can’t we evaluate goals individually?
We cannot evaluate goals individually. There are two reasons for this. (1) Any issue with a goal will manifest in a goalset that contains it, so we can rephrase any criticisms of a goal in terms of goalsets. And (2) there are times when we have to look at the goalset as a whole before identifying an issue.
For example, suppose we’re stranded on an island. Suppose the best outcome occurs when one of us hunts for food and the other builds a shelter, or vice versa. So “I build a shelter” is a goal that fits in one ideal outcome, and “You build a shelter” fits in another. However, if our goalset contains both of those statements, we’ll starve (albeit in the comfort of an excellent shelter).
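The island example is small enough to check mechanically. This is a sketch under simplifying assumptions: credences are dropped (as in the example statements above), each ideal outcome is reduced to two statements, and `fits_some_ideal` is a hypothetical helper name.

```python
# The two ideal outcomes, reduced to the statements that matter here.
ideal_forecasts = [
    frozenset({"I build a shelter", "You hunt for food"}),
    frozenset({"You build a shelter", "I hunt for food"}),
]


def fits_some_ideal(goalset: frozenset) -> bool:
    """True if at least one ideal forecast satisfies (contains) the whole goalset."""
    return any(goalset <= forecast for forecast in ideal_forecasts)


print(fits_some_ideal(frozenset({"I build a shelter"})))    # fine on its own
print(fits_some_ideal(frozenset({"You build a shelter"})))  # also fine on its own
print(fits_some_ideal(frozenset({"I build a shelter", "You build a shelter"})))  # not together
```

Each goal passes when checked alone, and the problem only shows up when the goalset is checked as a whole.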
Okay? Good. Now we have everything in place.
What’s in the next part?
Remember that our main purpose, set out at the start of all this, is to show that mechanism design is the right way to make policy. Hopefully the constraints we outline will also be practical takeaways for you, just as they’ve been for me.
We’ll show that we can make perfect policy with mechanism design (which we’ll rigorously define), as well as use it to navigate towards perfect policy when we don’t have all the answers just yet.
We now have all the tools to objectively define criteria for all of this, and we can prove mathematically how these properties relate to each other. Which, in my opinion, is pretty damn cool. If it interests you too, see you in the next one.
This is Part 2 of a series. I’ll try to write a new post each week. If you want to get notified of each new post, you can subscribe to my account.