[ongoing] Thoughts on Proportional voting methods

About this post

This post is a work-in-progress. I (Jameson) am updating it over the course of weeks/months, and the mods will repost it when it gets substantial updates. If you see this on the frontpage, it's because it's had a substantial update. If you read it and have questions, please leave a comment. I'd also appreciate comments on the writing style and/or structure.


I think I'm over halfway done with the article as a whole.

Last 3 updates

V 0.7.2: A terminology change. New terms: Retroactive Power, Effective Voting Equality, Effective Choice, Average Voter Effectiveness. (The term "effective" is a nod to Catherine Helen Spence.) The math is the same except for some ultimately-inconsequential changes in when you subtract from 1. Also, started to add a closed-list example from Israel; not done yet.

V 0.7.1: Added a digression on dimensionality, in italics, to the "Measuring 'Representation quality', separate from power" section. Finished converting the existing examples from RF to VW.

V 0.7.0: Switched from "Representational Fairness" to the more-interpretable "Vote Wastage". Wrote enough so that it's possible to understand what I mean by VW, but this still needs revision. Also pending: change my calculations for specific methods from RF to VW.


I have been thinking about proportional voting methods, and have some half-baked thoughts. "Half-baked" means that I think the basic ingredients of some worthwhile new ideas are in there, but they're not ready for general consumption yet. By the time you read these words, the thoughts below may well be more than half-baked.

I think it's worthwhile to begin writing these thoughts out with a real-world audience in mind. I'm choosing to do that here. The readership of this site is not the ideal audience, but it's close enough.

Here are some assumptions/standpoints I'm going to take:

  • My readers are as well-informed as I could reasonably imagine them to be. That is, they've read and understood the three posts of my prior sequence here on voting theory, and are familiar with some basic concepts of statistics such as variance, expectation, distribution, sampling, consistency, estimators and estimands, and Bayes' theorem.

  • "Good democracy" is a normative goal. A big part of this project is trying to define that more rigorously, but I won't be focusing too much on defending it philosophically.

  • I may at times, especially later on, veer into the territory of politics and "theory of change"; that is, how it might be possible to actually put the ideas here into practice. At such times, I won't hide the fact that I myself hold generally leftist/progressive views. Still, whenever I'm not explicitly discussing means, I believe that the democratic ends I focus on are defensible from many points of view. Obviously, democracy isn't compatible with something like dogmatic anarchism or dogmatic monarchism, but I think that even a soft-edged anarchist or monarchist might learn something from these ideas.


Terminology note

I will discuss both democratic "systems" and voting "methods", and these may appear to be interchangeable terms. They are not. "Methods" are specific algorithms which define an "abstract ballot" (the set of valid vote types) and map from a tally of such votes to an outcome. "Systems" are broader; they include "methods", but also rules for when and how to hold elections, who gets to vote, etc.

Here's some simple notation/terminology I'll use:

  • V, the total number of voters / ballots

  • S, the total number of seats to be elected in this election. (In other sources, this is often referred to as M, for "district (M)agnitude", in cases where large elections are split up by district into various smaller elections, typically with only 3-5 seats per district.)

  • Hare Quota: V/S, the number of votes each winning representative would have if they all had exactly equal numbers and no votes were wasted.

  • Droop Quota: V/(S+1), the number of votes each winning representative would have if they all had an equal number but one quota of votes was wasted on losing candidates.
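To make the two quotas concrete, here is a minimal sketch with hypothetical numbers (100,000 ballots, 4 seats):

```python
# Quota sizes for a hypothetical election: V = 100,000 ballots, S = 4 seats.
V = 100_000  # total ballots
S = 4        # seats to fill

hare_quota = V / S         # 25,000 votes per winner if no votes are wasted
droop_quota = V / (S + 1)  # 20,000 votes per winner if one quota is wasted on losers
```

Note that S winners each holding a Droop quota accounts for S/(S+1) of the votes, leaving exactly one quota over.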


Goal/endpoint of this piece

My goal here is to begin to put the comparative study of proportional (multi-winner) voting methods on a footing that's as solid as that of single-winner methods. In the case of single-winner methods, that solid footing is provided primarily by having a clear, unified metric: Voter Satisfaction Efficiency (VSE), which I'll explain briefly below. Although this metric is in principle sensitive to assumptions about the distribution of electorates, it is in practice robust enough to provide at least some useful guidance as to which methods are better or worse, and as to which pathologies in given methods are likely to be common or rare. That makes it much more possible to have a useful discussion about the tradeoffs involved in voting method design.

This is not to say that any metric will ever be the be-all-and-end-all of mechanism design. In the single-winner case, even though we have VSE, we still have to answer questions like: "Is it worth advocating for a method with better VSE that's more complex and/or harder to audit, or should we stick with something with a slightly worse VSE that's simpler?" But without a metric, that debate would very quickly devolve into self-serving bias on both sides; everybody would tend to weight the scenarios where their preferred method performed well more heavily, and discount those where it performed poorly. In other words, without a clearly-defined metric, it's almost impossible to properly "shut up and multiply".

Defining a good metric for proportional methods is harder than it is for single-winner methods. I am not the first person to attempt this, but I find prior attempts fall short of what is needed. As I write these words, I'm not finished defining my metric, but I do have a clear enough idea that I think I can do better than what's gone before. I also have a clear enough idea to know that, even when I'm done, my multi-winner metric will not be as "mathematically clean" as VSE is. In fact, I suspect it may be provably impossible to create a mathematically clean metric with all the characteristics you'd like it to have; I suspect it will be necessary to make some simplifying assumptions/approximations even though they are clearly violated. I hope it will be possible to do so in a way such that the measured quality of practical voting methods is robust to those assumptions, even if there might be methods that could intentionally exploit the assumptions to get unrealistically good quality measures.

Once I've defined a metric, I'll discuss practicalities: which proportional methods I believe may be better in which real-world contexts, and how one might get those adopted.

TL;DR (Summary) (empty)

What does it mean to say a system is "democratic"?

The archetypical case of democracy is a referendum. A new law is proposed; "all" the people get to vote "yes" or "no"; and "majority rules". In practice, "all" tends to be limited by some eligibility criteria, but I'd say that most such criteria make the system less democratic, other than maybe age and residency. The normative case that this kind of democracy is a good idea tends to rest on the Condorcet jury theorem: that is, if the average voter is more likely to be "right" than "wrong" (for instance, more likely to choose the option which will maximize global utility), then the chances that the referendum outcome will be "right" quickly converge towards 100% as the number of voters increases.

But of course, not every collective decision naturally reduces to just two options, and not "all" the people have time to vote on all the collective decisions. (Which decisions should be made collectively and which should be made individually is out of scope for these musings, but I believe there are at least some cases that fall clearly on each side.) So not all "democracy" will be like that archetypal case. Still, it's pretty clear that any more-complex democratic system must be "similar" to the archetypical case in some sense.

When you're choosing between a finite number of options that are known at the time of voting, you need a single-winner voting method. It's pretty easy to define a way to measure "how good" such a method is: we can just use utilitarianism. (That doesn't mean you have to entirely buy in to utilitarianism as an ethical/philosophical system; even if you have reservations about it in a broader sense, I think it's pretty clear that it's a useful model for the purposes of comparing voting methods in the abstract, in a way that other ethical systems aren't.) A voting method is good insofar as it maximizes expected utility. To actually evaluate this expected utility, you'd need a distribution of possible electorates, where each possible electorate has a set of options and a set of voters. Each voter has some utility for each option, some knowledge about the other voters, and some rule for turning their utility and knowledge into a vote. As you might imagine, actually working this all out can get complex, but the basic idea is simple.

Choosing a single leader can be seen as a special case of choosing between various options. But electing a legislature is something different. How do you define the utility of a legislature that has a 60-40 split between two parties, versus one that has a 70-30 split? There is (to me at least) an obvious intuition that a legislature is more democratic if it's more representative, and more representative if it's more proportional. With some additional assumptions, the drive for proportionality might be made more rigorous: if voters would be 60-40 on X vs. Y but legislators are 70-30, there must probably be some X′ and Y′ such that voters would be 49-51 but legislators would be at least 51-49.

Still, "representative" and "proportional" are just words; it turns out that, outside of simple special cases where there are a few pre-defined parties and voters care only about partisanship, political scientists don't tend to define these ideas rigorously. But we can do better. One way we might define "representative" comes from statistical sampling. Imagine that the legislature's job is merely to reproduce, as faithfully as possible, direct democracy, without ordinary voters having to waste all that time choosing and voting. In a statistical sense, that would mean the loss function would be something like: the expected squared difference between the percent a proposal gets in the legislature, and the percent it would have gotten in a direct referendum, given that the proposal is drawn from some predefined distribution over possible proposals. Note that we can still use the Condorcet jury theorem to argue that this would maximize utility, but now not only are we piling assumption on top of assumption, we're also making approximations of approximations. The legislative outcome is an approximation of the popular outcome, which is an approximation of the ideal outcome; if the first step of approximating goes wrong, there's no reason to believe the second step will fix it.

(This loss function is effectively [the expectation of] a metric between distributions: measuring a kind of distance between the voter distribution and the legislator distribution, using the proposal distribution as an underlying probability measure. To help get an intuition for what this means, imagine that ideology is a single dimension, with the voters somehow distributed along it, and each proposal is a cut point. Assuming the proposal distribution is continuous, we could without loss of generality rescale the ideology so that the proposal distribution is Uniform(0,1), and then graph those proposals on the x axis, with the proportion of voters to the left of the cut on the y axis. We could superimpose a graph with the proportion of legislators to the left of the cut on the y axis; this loss function would then be the integral of the square of the vertical distance between those graphs.)
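The one-dimensional picture above can be computed directly. Here is a minimal numerical sketch, with entirely hypothetical voter and legislator positions (the two "support" curves are just the empirical CDFs of voters and legislators over the rescaled proposal axis):

```python
import numpy as np

# Toy 1-D setup: proposals are cut points, rescaled so they're Uniform(0,1);
# "support" for a proposal is the share of voters (or legislators) left of the cut.
rng = np.random.default_rng(0)
voters = np.sort(rng.uniform(0.0, 1.0, size=10_000))         # hypothetical voter ideologies
legislators = np.sort(np.array([0.2, 0.4, 0.5, 0.6, 0.8]))   # hypothetical 5-seat legislature

cuts = np.linspace(0.0, 1.0, 1001)                  # grid of possible proposals
voter_support = np.searchsorted(voters, cuts) / voters.size
leg_support = np.searchsorted(legislators, cuts) / legislators.size

# Loss: average (over the uniform proposal distribution) of the squared
# vertical distance between the two support curves.
loss = np.mean((voter_support - leg_support) ** 2)
```

A perfectly placed legislature would put its members at the 1/10, 3/10, ..., 9/10 quantiles of the voter distribution, which minimizes this loss for a uniform voter cloud.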

A sketch to help visualize the above

This is a rigorous definition of "representative", but a deeply unsatisfying one. By this definition, one of the "best" possible voting methods would be sortition: just choose representatives uniformly at random from the population. We know from statistical sampling theory that in this case, the loss function stated above, for literally any possible question, is asymptotically consistent; it falls off as the inverse square root of the size of the legislature. (Such a simple random sample isn't actually the best; you can improve it by doing variance-minimizing tricks such as stratifying on self-declared party or other salient political variables. But that's a minor detail which changes the error by a roughly constant factor, not by changing the basic limiting behavior.)
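That inverse-square-root rate is easy to see in simulation. A quick sketch, with a hypothetical proposal supported by exactly half the population (only the rate matters, not the constants):

```python
import numpy as np

rng = np.random.default_rng(1)
population_support = 0.5   # true fraction of voters supporting some proposal

def rms_error(S, trials=2000):
    """RMS gap between a size-S sortition legislature's support and the truth."""
    samples = rng.binomial(S, population_support, size=trials) / S
    return np.sqrt(np.mean((samples - population_support) ** 2))

e_small, e_large = rms_error(100), rms_error(400)
# Quadrupling the legislature (100 -> 400 seats) roughly halves the error.
```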

But in the single-winner case, sortition (even stratified sortition) just means randomly choosing a dictator, which is clearly nowhere close to being among the most-democratic single-winner methods. Clearly, there's some democratic ideal or ideals missing from the above loss function.

There are at least two characteristics that good representatives should have, besides having views that are collectively similar to those of the people they represent: they should be exemplary, and they should be accountable.

"Exemplary" means that in some regards, you want them to be systematically different from the average of the people they represent: smarter, less ignorant, better at positive-sum negotiation, etc. For instance, a linguistic minority would want to be represented by somebody who was fluent in the language of business in the legislature, even if they themselves mostly were not. Sortition is not a good solution here.

"Accountable" means that there is at least some mechanism to help better align incentives between representatives (agents) and their "constituents" (principals). This generally means that voting rules work to aggregate preferences in a way that considers all votes equally. Again, sortition fails this badly.

Note that by adding "exemplary" and "accountable" to our list of goals, we've opened up the possibility for legislative outcomes to be not just as-good-as direct democracy, but actually better, in several ways. First, it may be that there is some optimum size for reasoned debate and group rationality: too small, and there is not enough diversity of initial opinions; too large, and there is not enough bandwidth for consensus-seeking discussion. Second, exemplary representatives may be better than ordinary voters at understanding the impact of each decision, or at negotiating to find win-win compromises. It's difficult to imagine formalizing this without making huge arbitrary assumptions, but it's similar to the idea of "coherent extrapolated volition"; that is, exemplary representatives may be able to extrapolate voters' volition in a way that the voters might not exactly agree with a priori but would see in retrospect to be better. Which brings us to the third way better outcomes might be possible: by making representatives accountable not on a moment-by-moment basis, but on a once-per-election time cycle, we might nudge them to think from a slightly more future-oriented perspective.

This idea — that representative democracy doesn't just save voters' time; that it can, in principle, actually give better outcomes than direct democracy — is so important that it already has a name: republicanism. (I'll be using that word as I continue this article; to avoid confusion, I'll refer to the main US political parties as "GOP" and "Dems".)

It's clear that all three of these ideals — representativity, exemplariness, and accountability — have to do with democracy. One can imagine pairs of situations that differ on only one of these three axes, and still, one member of the pair will seem (in my strong intuition) more democratic than the other. For example, imagine the fictional planet mentioned in The Hitchhiker's Guide to the Galaxy, where people vote to elect lizards. Though the complete lack of representativity makes this a poor democracy, it would clearly be even less democratic if the losing lizard committed electoral fraud.


Digression: dimensionality and representation

Above, we've defined "representativeness" as some kind of difference-in-distributions between the voters (that is, their support for any given proposal drawn randomly from the distribution of possible proposals) and the legislature (that is, the expected support for any given proposal, where the expectation is taken across any randomness in the voting method). ((Note that in some cases, when considering the legislature's "expected" support for proposals, it may be convenient to condition not on the ballots, but only on the voters' views on all proposals; that is, to consider the possible variability in voters' strategic behavior as both "random" and "part of the voting method".))

Since the number of voters is usually quite large compared to the size of the legislature, we might imagine/approximate the voter distribution as a continuous "cloud", and any given legislature as a set of points intended to represent that cloud. Since we're interested in how well those points can possibly represent that cloud, one key variable is the effective dimension of that cloud.

To be specific: we know that it's possible to represent the cloud spatially, with any possible proposal as one of the dimensions. This is potentially an infinite-dimensional space, but in practice, it will have some dimensions with high variance, and some dimensions with negligible variance.

You've probably seen "political compass" diagrams which project political ideologies onto two dimensions:

But of course, those aren't the only two dimensions you could use to categorize voters:

If we use the dimensions that tend to align well with how voters would divide up on high-probability legislative proposals (the principal components), how many dimensions would we need? ((OK, "effective dimension" isn't exactly that; it measures not only how many "relatively big" dimensions there are, but also how "relatively small" the rest are. I'm being deliberately vague about how precisely I'd define "effective dimension" because I suspect that unless you ignore variation below a certain noise threshold, the ED is actually infinite in the limit of infinite voters.))

If the "effective dimension" is low, then a relatively small legislature can do quite a good job of representing even an infinite population. If the effective dimension is high, though, then the ideological distance between a randomly-chosen voter and the closest member of the legislature will tend to be almost as high as the distance between any two randomly-chosen voters. In other words, there will be so many ways for any two people to differ that the very idea that one person could "represent" another well begins to break down (unless the size of the legislature grows exponentially in the number of effective dimensions).

What is the effective dimension of political ideology in reality? It depends how you measure it. If you look at US incumbent politicians' voting records using a methodology like DW-NOMINATE, you get a number between 1 and 2. If you look at all people's opinions on all possible issues, you'd surely get a much higher number. I personally think that healthy political debate would probably settle at about 𝓮 effective dimensions — that is, a bit less than 3. To get a sense of this: 5^𝓮≈80, 10^𝓮≈500, 20^𝓮≈3500; so, if I'm right, a well-chosen, reasonably-sized legislature could do a reasonably good job representing an arbitrarily-large population.
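Checking the arithmetic behind those approximations (𝓮 here is just Euler's number; "k points per effective dimension" is the rough intuition being sized):

```python
import math

# If representing voters well takes roughly k grid points per effective
# dimension, a legislature needs about k**e members at e ≈ 2.718 dimensions.
e = math.e
sizes = {k: k ** e for k in (5, 10, 20)}
# 5**e ≈ 79, 10**e ≈ 523, 20**e ≈ 3440 — matching the post's ≈80, ≈500, ≈3500.
```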

But if I'm wrong, and the variation in political opinions/interests that would be politically salient in an ideal world is much higher, AND it's impossible to deal well with "one issue at a time" (that is, tradeoffs due to high-dimensional structure are meaningful), then the very "republican idea" of representative democracy is problematic. A large-enough legislature could still serve as a random polling sample, reproducing the opinions of the larger population on any arbitrary single question with low error. But as the legislature debated multiple issues, considering the balance of possible tradeoffs between several related choices, the chance that there would be some combination of points of view that exists in the population but not in the legislature would go to 100%.

Understanding dimensionality is especially important for analyzing voting methods that involve delegation. If politically-salient ideologies are relatively low-dimensional, then your favorite candidate's second-choice candidate is likely to be someone you'd like. If they're high-dimensional, that two-steps-from-you candidate is not significantly better, from your point of view, than a random choice among candidates. I'll talk more about delegation methods, which can help simplify ballots and thus allow broader overall choice, later.

God's proportional voting method

Say that an omniscient being with infinite processing power wants to select a legislature of size S to represent a population 𝒱 of V chosen ones. God knows the distribution of proposals that this legislature will vote on (and for simplicity, that distribution is independent of who is in the legislature), and she knows the distribution of utility that each outcome option will give to each person v in 𝒱. (For technical reasons, let's assume that each chosen one's distribution of utility for each option is absolutely continuous with regard to some common interval. If you don't know what that means, you can safely ignore it.) Her goal is to use a process for choosing a legislature that will:

  • Have zero bias. For any option the legislature might consider, the expected distribution of utilities of the legislature is (effectively) the same as the expected distribution of utilities of the voters. (That is to say, the first moments of the two distributions are identical, and their higher moments are at-least-similar in a way that would be tedious to write out.)

  • Tend to choose more-pious legislators, when that doesn't cause bias. Because she's putting a higher priority on unbiasedness than on piety, she'll have to focus on that component of a chosen one's piety that is independent of the utility they'd get from all options. (In the real world, of course, we'd substitute some other qualifications for piety. Note that these qualifications could be subjective; for instance, "how much voters who are similar to this candidate want to be represented by them".)

  • Minimize the variance of the distribution of distributions of utilities of legislators, as long as that doesn't cause bias or reduce piety. In ordinary terms, minimizing the variance of this meta-distribution means she's trying to get a legislature that represents the people well concretely, not just in expectation. In other words, if 90% of chosen ones like ice cream, around 90% of the legislators should like ice cream; it's not good enough if it's 100% half the time and 80% half the time, for an average of 90%.

Of course, since this is God we're talking about, she could come up with selection rules that are better than anything I could come up with. But if I had to advise her, here's the procedure I'd suggest:

1. Divide the population up into equal-sized groups, which we'll call "constituencies", in a way that minimizes the total expected within-group variance for an option that's randomly chosen from the distribution of votable proposals. In essence: draw maximally-compact, equal-population districts in ideology space.

2. Within each constituency, give each voter a score that is their piety, divided by the expected piety of a person in that group with their utilities. (I'm intentionally being a bit vague about defining that expectation; there are tighter or looser definitions that would both "work". Tighter definitions would do less to optimize piety but would be unbiased in a stricter sense; vice versa for looser definitions.)

3. Choose one representative from each group, using probabilities weighted by score.

In other words: stratified sortition, using probabilities weighted by piety and "inverse piety propensity".
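The three steps can be sketched in code. This is a toy version with hypothetical data: ideology is 1-D (so "maximally-compact equal-population districts" reduces to sort-and-slice), piety is a made-up score, and I skip step 2's inverse-propensity correction:

```python
import numpy as np

rng = np.random.default_rng(2)
V, S = 10_000, 5
ideology = rng.normal(size=V)     # each voter's position in 1-D ideology space
piety = rng.exponential(size=V)   # each voter's hypothetical "exemplariness"

# Step 1: S equal-sized, ideologically-compact constituencies (sort and slice).
constituencies = np.array_split(np.argsort(ideology), S)

# Steps 2-3: one representative per constituency, chosen with probability
# proportional to a (simplified) piety score.
legislature = [rng.choice(group, p=piety[group] / piety[group].sum())
               for group in constituencies]
```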

What's the point of even discussing this case? Obviously, if God were choosing, she wouldn't need a legislature at all; she could just choose the best option each time. But I bring this up to illustrate several points:

  • Having individual representation, where each voter can point to the representative they helped elect, and each representative was elected by an equal number of voters, is good from several perspectives. Even without this example, it would have been obvious that individual representation ensures a form of proportionality, and that it is good for accountability. But this example shows that it can also be seen as a variance-reduction technique: if our God had skipped step 1, she'd still have gotten a legislature that was unbiased and pious, but it would have had higher variance.

  • It's possible (at least, if you're omniscient) to design a method that optimizes for several different priorities. For instance, in this case, we could optimize for piety while also ensuring unbiasedness and minimizing variance.

God's MSE decomposition?

Let's say God's busy, so she asks the archangel Uriel to choose the legislature. Uriel isn't omniscient, so he doesn't know everyone's utility for each proposal option, and even if he did, he couldn't perfectly integrate that over the high-dimensional distribution of proposals the legislature might vote on. So he merely asks each chosen one to write down how they feel about each candidate on a little piece of paper, and then, using the papers, does his best to choose a representative body. In other words, he holds an election.

When God returns, how will she judge Uriel's election? First off, let's say that she doesn't just judge the particular outcome of one election, but the expected outcome of the method across some multiversal distribution of possibilities. She's omniscient, so she directly knows the expected mean squared error (MSE) between the percentage supporting each proposal in an idealized version of direct democracy (correcting towards coherent extrapolated volition, as much as republicanism allows), and the percentage supporting it within the chosen legislature. But to help Uriel understand how he's being judged, I expect she'd decompose this MSE into various components, following the same step-by-step logic as I outlined in my suggestion above for her own selection method.

Classically, MSE can be decomposed into squared bias plus variance. In many cases, each of these can be decomposed further:

  • Variance can be decomposed using "Eve's law", and in particular, breaking it down into the sum of variances of *independent* summed components. When components are not independent, we can still break it down into two separate terms plus one cross term.

  • Squared bias can be decomposed in a similar way: as the sum of squares of orthogonal components, or as the sum of squares of near-orthogonal components plus cross-term corrections for non-orthogonality.
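The top-level identity is easy to verify numerically. A sketch using a hypothetical method that estimates a proposal's support with both a systematic bias and some noise:

```python
import numpy as np

rng = np.random.default_rng(3)
truth = 0.60   # true popular support for some proposal
# Hypothetical method: overestimates support by 5 points, plus random noise.
estimates = truth + 0.05 + rng.normal(0.0, 0.10, size=100_000)

mse = np.mean((estimates - truth) ** 2)
bias_sq = (np.mean(estimates) - truth) ** 2
variance = np.var(estimates)
# MSE = bias^2 + variance, exactly (up to floating point), when variance
# is the population variance of the estimates.
```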

So let's go through the steps of God's process; suggest a way to establish a correspondence between that process and an arbitrary voting method; and see what components of MSE might have arisen at each stage.

((I bet you missed the big reveal! That is to say, I think the most novel and philosophically-meaty idea in this essay was buried in clause 2 of the sentence just above. Establishing a step-by-step correspondence between an arbitrary voting method and a single unrealizable "perfect" method is, I believe, an important idea, as I'll discuss further below. I would have said so earlier, but I didn't see a way to get there directly without laying all the groundwork above.))

God's first step was to divide voters into S equal-sized "constituency" groups, minimizing the variability within each constituency. Multi-winner voting methods do not necessarily divide voters into S groups. However, they do, by definition, elect S winners. So if we want to assign voters to S (possibly-overlapping) constituencies, we have merely to trace the chain of responsibility backwards: for each of the winners, which voters were responsible for helping them to win?

In more quantitative terms: for each candidate c and voter v, we want to assign a quantity of responsibility r(c,v), such that the sum over v for any given c is always 1. I'll be saying a LOT more about how this might work below.
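As a concrete (and entirely hypothetical) data-structure sketch: r can be an S×V matrix whose rows each sum to 1; its column sums then give each voter's "total voting power", which matters for the error types below:

```python
import numpy as np

# r[c, v] = voter v's share of the responsibility for electing winner c.
# Toy numbers: 3 winners, 6 voters.
r = np.array([
    [0.5, 0.5, 0.0, 0.0, 0.0, 0.0],   # winner 0: elected by voters 0 and 1
    [0.0, 0.0, 0.5, 0.5, 0.0, 0.0],   # winner 1: elected by voters 2 and 3
    [0.0, 0.0, 0.0, 0.5, 0.5, 0.0],   # winner 2: voter 3 helped here too
])

assert np.allclose(r.sum(axis=1), 1.0)  # each winner's responsibility sums to 1
total_power = r.sum(axis=0)             # each voter's total voting power
# Here voter 3 has power 1.0, most voters have 0.5, and voter 5's
# ballot did nothing at all: a wasted vote.
```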

There are sev­eral kinds of er­rors that could hap­pen at this step:

  • Inequal­ity: some vot­ers could have higher “to­tal vot­ing power” — that is, sum over c of r(c,v) — than oth­ers. In fact, there are already vot­ing meth­ods built around min­i­miz­ing the var­i­ance of one ver­sion of quan­tity. This would usu­ally tend to lead to bias, as the views of vot­ers with lower to­tal vot­ing power would tend to be un­der­rep­re­sented in the leg­is­la­ture.

    • An im­por­tant spe­cial case of this is that some votes are, in prac­tice, wasted, with zero vot­ing power; they did noth­ing to help elect a can­di­date. In prac­tice, it’s im­pos­si­ble to cre­ate a vot­ing method that elects S equally-weighted can­di­dates with­out wast­ing some votes; typ­i­cally at least some­where be­tween V/​2S and V/​(S+1) of them. Count­ing wasted/​in­effec­tive votes is one sim­ple way to rate vot­ing meth­ods; in fact, a sim­ple for­mula for es­ti­mat­ing wasted votes, known as the “effi­ciency gap”, was at the heart of the plain­tiff’s ar­gu­ments in the US Supreme Court case Gill v. Whit­ford.

  • Imprecision: instead of dividing voters into S separate constituencies with one representative per group, the method might divide them into fewer groups with more than one representative each. This "overlap" would usually tend to lead to higher variance at the next step, as representatives selected from larger overlapping "ideological districts" would be ideologically farther from the constituents they represent.

  • Inap­pro­pri­ate­ness: The S con­stituen­cies could be equal and non-over­lap­ping (in ide­olog­i­cal space), but still less com­pact than they could have been. That is, the vot­ing method could have done a poor job of group­ing similar vot­ers to­gether, so that no mem­ber from each group could do a good job rep­re­sent­ing that whole group. This could lead to bias, var­i­ance, or both.

  • (I should also men­tion that turn­ing a com­plex vot­ing method out­come into r(c,v) mea­sure­ments in­evitably loses some in­for­ma­tion; if it’s done poorly, any re­sult­ing mea­sure of vot­ing method qual­ity may be mis­lead­ing.)
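As a concrete illustration of the wasted-vote idea, here is a minimal sketch of the two-party efficiency gap. The vote counts are hypothetical, and conventions vary on details such as the exact winning threshold; this is just one reading:

```python
def efficiency_gap(district_results):
    """Two-party efficiency gap: (A's wasted votes - B's wasted votes)
    divided by total votes.  A vote is "wasted" if cast for the loser,
    or cast for the winner beyond the bare majority needed to win."""
    wasted_a = wasted_b = total = 0
    for a, b in district_results:      # (votes for A, votes for B) per district
        total += a + b
        needed = (a + b) // 2 + 1      # one convention for the winning threshold
        if a > b:
            wasted_a += a - needed
            wasted_b += b
        else:
            wasted_b += b - needed
            wasted_a += a
    return (wasted_a - wasted_b) / total

# Hypothetical 3-district example: A wins one district in a landslide,
# B wins two narrowly, so far more of A's votes are wasted.
print(efficiency_gap([(90, 10), (45, 55), (45, 55)]))  # → 0.37
```

A perfectly symmetric outcome gives a gap of zero; a large positive gap here means party A's supporters had much less effective voting power.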

Rough pic­ture of the above is­sues. Not perfect; don’t @ me. Note that the space of the di­a­gram is in­tended to be ide­olog­i­cal, not ge­o­graph­i­cal.

God’s sec­ond step was to choose one rep­re­sen­ta­tive for each of the equal-sized groups, us­ing an al­gorithm that bal­anced the goals of ex­em­plary “piety” and ide­olog­i­cal un­bi­ased­ness. Note that in rat­ing a vot­ing method, we’re do­ing this “back­wards”: we con­structed the “set of re­spon­si­ble vot­ers” from the can­di­date, not vice versa. Still, we can use this “back­wards” pro­ce­dure to con­sider the in­her­ent bi­ases and var­i­ances of what­ever ar­bi­trary “for­wards” vot­ing method that Uriel ac­tu­ally used. Th­ese might in­clude:

  • Pos­si­ble bias in choos­ing from each con­stituency: Some­thing about the vot­ing method, or about the in­ter­ac­tion be­tween the vot­ing method and voter be­hav­ior, could cause bias.

    • Representatives might be systematically drawn from one side (e.g. "the north") of their constituency, across the board.

    • Rep­re­sen­ta­tives might be sys­tem­at­i­cally drawn from the cen­troid of their con­stituency, but the voter sets might be cho­sen in a way such that these “cen­troids” are bi­ased in cer­tain di­men­sions.

    • Rep­re­sen­ta­tives might be cho­sen in a way that in­ap­pro­pri­ately ig­nores ex­em­plary “piety” (that is, those di­men­sions on which they *should* be bi­ased com­pared to the voter pop­u­la­tion).

  • Pos­si­ble var­i­ance in choos­ing from each con­stituency: Rep­re­sen­ta­tives might be sys­tem­at­i­cally far from the cen­ter of their con­stituency. Even in cases where this doesn’t cause bias, it’s still a prob­lem.

I’ve listed var­i­ous differ­ent pos­si­ble prob­lems, and sug­gested that they could be the ba­sis of an MSE de­com­po­si­tion. That is to say, God could be­gin grad­ing the out­come by us­ing the above prob­lems as a rubric; sub­tract­ing points sep­a­rately for prob­lems of each type. But of course, that kind of MSE de­com­po­si­tion only works if the differ­ent kinds of prob­lem are or­thog­o­nal/​in­de­pen­dent. So in prac­tice, God might end up adding or sub­tract­ing a less-ef­fable “fudge fac­tor” at the end.

Allo­cat­ing re­spon­si­bil­ity: a tricky (and im­por­tant) problem

A key step in “God’s MSE de­com­po­si­tion” for eval­u­at­ing a multi-win­ner vot­ing method is as­sign­ing re­spon­si­bil­ity; de­cid­ing, us­ing the bal­lots, which vot­ers should be con­sid­ered “con­stituents” of which win­ning rep­re­sen­ta­tive. That’s not at all as sim­ple as it might seem. Even if you know the bal­lots, the vot­ing method, and the win­ners, it’s not nec­es­sar­ily easy to make a func­tion r(c,v) to say which bal­lots helped elect each of the win­ners and by how much. You know not only ex­actly the out­come in re­al­ity, but also the coun­ter­fac­tual out­comes for ev­ery other pos­si­ble com­bi­na­tion of bal­lots; still, how do you turn all that into the num­bers you need?

Be­fore go­ing on, we should note that al­though we are in­ter­ested in us­ing the an­swer to this prob­lem as one sub-step in eval­u­at­ing multi-win­ner vot­ing meth­ods, the prob­lem it­self ap­plies equally-well to sin­gle-win­ner vot­ing meth­ods: given that can­di­date X won, which vot­ers are re­spon­si­ble, and by how much? Thus, un­til fur­ther no­tice, most of the con­crete cases we’ll be con­sid­er­ing come from the sim­pler do­main of sin­gle-win­ner meth­ods.

With sin­gle-win­ner plu­ral­ity, it’s easy to al­lo­cate re­spon­si­bil­ity us­ing sym­me­try: all the N peo­ple who voted for the win­ner did so equally, so each of them has 1/​N re­spon­si­bil­ity. This is in­de­pen­dent of how many vot­ers there are or how many voted for other can­di­dates; N could be 99%, or 50.01% in a two-way race, or even 1% in a 101-way race.

But that kind of sym­me­try ar­gu­ment doesn’t work for com­plex vot­ing meth­ods with many more valid ways that each voter could vote than there are can­di­dates. In such cases, even briefly con­sid­er­ing edge cases such as Con­dorcet cy­cles or im­pos­si­bil­ity re­sults such as Ar­row’s or Gib­bard-and-Sat­terth­waite’s the­o­rems sug­gests that al­lo­cat­ing re­spon­si­bil­ity will be no sim­ple task.

The prob­lem of al­lo­cat­ing re­spon­si­bil­ity has been con­sid­ered be­fore in the con­text of sin­gle-win­ner vot­ing meth­ods; there are ex­ist­ing for­mu­las for “voter power” in­dices, and my even­tual an­swer is similar to one of them. But this is­sue ac­tu­ally goes well be­yond vot­ing. There are many situ­a­tions when an out­come has oc­curred, but even though the in­puts and coun­ter­fac­tu­als are more-or-less agreed-upon, peo­ple can still ar­gue as to which of those in­puts are re­spon­si­ble.

Here’s my plan for dis­cussing this ques­tion:

  • Start with a story of how judges and lawyers have failed to re­solve this ques­tion over thou­sands of years.

  • Give an ex­am­ple or two of why it’s difficult to defini­tively al­lo­cate re­spon­si­bil­ity, even when you know all the rele­vant facts and coun­ter­fac­tu­als.

  • Tell an in­ter­est­ing story of how math­e­mat­i­ci­ans strug­gled and ul­ti­mately suc­ceeded in cleanly solv­ing a prob­lem that’s similar but differ­ent in key ways. I think that the solu­tion they even­tu­ally came to for that prob­lem is ex­actly wrong for this one, but the kind of cog­ni­tive leaps they had to make may help sug­gest ap­proaches.

  • Sketch out what an ideal solu­tion to al­lo­cat­ing re­spon­si­bil­ity should look like.

  • Con­struct a solu­tion that I think is as ideal as pos­si­ble, even though it’s not en­tirely perfect or math­e­mat­i­cally clean.

  • Ar­gue that it may be im­pos­si­ble to do much bet­ter.

Consider the problem of legal liability. There are many laws where one of the criteria for a person being liable is that their actions "cause" a certain outcome. One traditional legal definition for "cause" in this sense is "sine qua non", Latin for "without this, not"; in other words, X causes Y iff Y wouldn't have happened without X. But what if two people independently set separate fires on the same day, and as a result my house burns down? Under a "sine qua non" standard, person A isn't liable because even without them fire B would have burned my house, and vice versa.

This merely shows that the traditional legal standard of "sine qua non" is silly, not that finding a good standard is hard. I know: the fact that lawyers and judges haven't figured out how to allocate responsibility doesn't prove that doing so is truly hard. But it does at least show that it's not trivial.

If all you care about is whether or not someone was partially responsible for an outcome, and not how much responsibility they deserve, there's a more-or-less simple, satisfying answer (and the lawyers are just beginning to discuss it, though it's not a settled matter): a factor is causal if it's a necessary element of a sufficient set (NESS). (Actually, in the legal situation, there's an additional complication needed for a good definition of responsible causes. If something was "established/known" "after" a sufficient set already existed, it is not eligible to be a cause; and defining those terms in a way that gives satisfying answers is tricky. For instance, a fire set after my house burned down cannot be a cause, even if it is a NESS. But discussing this issue is out of scope here, because in the case of voting methods, only ballots matter, and all ballots are considered to be cast simultaneously.)

NESS answers the question of who shares responsibility, but it doesn't answer that of how much responsibility they have. For instance, imagine that a group of people made stone soup with 1 stone, 1 pot, water, fire, and ingredients; and that in order to be a soup, it needed a pot, water, fire, and at least 3 ingredients. NESS tells us that the person who brought the stone was not responsible for the soup, and that everyone else was; but how do we divide responsibility among the others? Simple symmetry shows that each ingredient gets the same responsibility, but there are many facially-valid ways you could set relative responsibility weights between an ingredient and the pot.

There’s also an­other flaw with NESS: similar out­comes. Con­sider a trol­ley prob­lem: you just pul­led the lever, sav­ing 5 peo­ple by di­vert­ing the trol­ley to crush 1. By NESS, you are now a mur­derer; a nec­es­sary el­e­ment of the set of ac­tors lead­ing to that one guy’s death. But, un­like the fools who let the trol­ley run away in the first place, you are not in a NESS for the event “some­body dies here to­day”. In pro­por­tional vot­ing terms, we will need to cre­ate such com­pos­ite events — “at least one of this sub­set of can­di­dates is elected” — be­cause oth­er­wise, fre­quently, a suffi­cient set will not be com­plete un­til there are so few vot­ers out­side the set that they couldn’t tip the bal­ance be­tween two similar can­di­dates.

Pas­cal’s Other Wager? (The Prob­lem of Points)

This is beginning to look a lot like another famous problem from the history of mathematics: the "Problem of Points". This question was originally posed by Luca Pacioli (most famed for codifying double-entry bookkeeping): say that two people are gambling, and they have agreed that a pot of money will go to the first player to win a total of N games. But for some reason they are interrupted before either player has reached N wins; say that one of them has won P games and the other Q, with P>Q. How should the pot be divided fairly? Clearly, player P should get more, because they were closer to winning; but how much more?

Like the retroactive allocation of responsibility that we're considering, this is a case of allocating a fixed resource between various agents in a well-defined situation (one without any "unknown unknowns"). And similarly to our case, solving it is deceptively hard. Several smart mathematicians of the time—such as Tartaglia and Cardano, rival co-discoverers of the cubic formula—proposed "solutions", but there was something unsatisfying about all of them. For instance, Pacioli suggested that the two players should split the pot in proportion to the number of games they'd won; but this would give the entire pot to somebody who'd won just 1 game to 0 in a tournament to 100, a tiny lead in context.

The problem was definitively solved in letters between Fermat and Pascal. Fermat got the right answer, and Pascal refined the argument to support it; Pascal's letter of August 24, 1654 is considered one of the founding documents of a whole branch of math. I'll give that answer in a moment, but in case you don't already know it, take a moment to pause and see how you'd solve the problem.


The an­swer is: the pot should be split in pro­por­tion to the prob­a­bil­ity that each player will win it.

This an­swer may seem al­most ob­vi­ous and triv­ial to us, but to Fer­mat and Pas­cal, it re­quired a leap of in­sight. This was ar­guably the first time peo­ple used math (prob­a­bil­ity) to rea­son about events in an ex­plic­itly coun­ter­fac­tual fu­ture, as op­posed to sim­ply gen­er­al­iz­ing about fu­ture events that closely re­sem­ble oft-re­peated past ones. This key idea — that in or­der to bet­ter de­scribe the uni­verse as it is, we must some­times rigor­ously an­a­lyze states that are not and will never be — is the foun­da­tion of risk anal­y­sis, which is it­self key to mod­ern so­ciety. (Here’s a video which tells this story, and makes this ar­gu­ment for its im­por­tance, at length.)

In that case, Pas­cal and Fer­mat were con­sid­er­ing how to “justly” al­lo­cate a fixed re­source: a pot of money. The prob­lem had been posed with­out a clear defi­ni­tion of what “justly” would mean, but the solu­tion they came up with was so math­e­mat­i­cally beau­tiful that once you’ve un­der­stood it, it’s hard to imag­ine any other defi­ni­tion of jus­tice in that case.

It’s clear that this is an in­ter­est­ing story, but why do I spend so much time on it, when the prob­lem at hand is al­lo­cat­ing ret­ro­spec­tive re­spon­si­bil­ity, not prospec­tive odds? Am I about to ar­gue that odds and risk anal­y­sis are key to an­swer­ing this prob­lem, too?

No! In fact, quite the opposite. I think this example is relevant in part because the very arguments that pushed Pascal and Fermat to think probabilistically should push us to avoid basing our answer primarily on probabilities. A key argument they used against earlier proposed "solutions" such as Pacioli's was that Pacioli's answer would always give the same share for given values of P and Q, independent of N; but clearly a player who leads by 7 games to 4 is in a much better position if the threshold to win the pot is 8 games than if the threshold is 100 games. For retroactive responsibility, by contrast, we already saw from symmetry arguments that in the case of plurality, responsibility depends only (inversely) on the winning vote total, independent of the threshold it would have taken to win.
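For concreteness, the Fermat/Pascal split can be computed directly; this sketch (with hypothetical scores) also reproduces the 7-to-4 argument above:

```python
from fractions import Fraction
from math import comb

def pot_share(p_wins, q_wins, target):
    """Fermat/Pascal split: player P's fair share of the pot equals the
    probability that P reaches `target` wins first, treating each
    remaining game as an independent fair coin flip."""
    a = target - p_wins          # games P still needs
    b = target - q_wins          # games Q still needs
    n = a + b - 1                # after n more games, someone must have won
    # P takes the pot iff P wins at least `a` of those n games.
    return sum(Fraction(comb(n, j), 2**n) for j in range(a, n + 1))

# Leading 7 games to 4 is nearly decisive when the target is 8 games...
print(pot_share(7, 4, 8))            # → 15/16
# ...but worth much less when the target is 100 games (a bit over 1/2).
print(float(pot_share(7, 4, 100)))
```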

In­ter­lude: Am I bark­ing up the wrong tree? Is it even clear to you what tree I’m bark­ing up?

So far in this es­say, I’ve been ex­plain­ing things I be­lieve I un­der­stand pretty well. That is, most of what I’ve said so far, I think that if you asked me to, I could ex­plain at greater length and with more rigor. This is about to change. This sec­tion, in par­tic­u­lar, deals with ideas that I am not yet done chew­ing over. So I’ll make some claims here that I can’t en­tirely back up with con­crete ar­gu­ments.

But that doesn’t mean this is pure spec­u­la­tion, ei­ther. I’ve spent liter­ally decades of my life think­ing about this stuff, and that pur­suit has in­cluded get­ting a doc­torate in statis­tics. So if you, read­ing this, think you see a fatal flaw in my ideas, you may well be right; but if you think that the flaw is ob­vi­ous and that I haven’t at least con­sid­ered it, you’re very prob­a­bly wrong. My counter-counter-ar­gu­ment may be hand­wavy and ul­ti­mately in­cor­rect, but it at least ex­ists.

If you look at pre­vi­ous at­tempts to solve retroac­tive re­spon­si­bil­ity prob­lems, most rely im­plic­itly or ex­plic­itly on the kind of prob­a­bil­ity-based ar­gu­ments that you use for proac­tive prob­lems like the prob­lem of points. For in­stance, voter power in­dices might be based on the prob­a­bil­ity that a given (set of) ac­tor(s) is pivotal, given some dis­tri­bu­tion of ac­tions for the other ac­tors. I be­lieve that this logic is ul­ti­mately a dead end. Once you have a retroac­tive sce­nario, where ac­tual vot­ers each cast an ac­tual bal­lot, rea­son­ing that’s based on “what was the prob­a­bil­ity that those peo­ple would have acted differ­ently” will in­evitably be sen­si­tive to as­sump­tions in a way that un­der­mines it. Fur­ther­more, this prob­lem is ex­po­nen­tially com­pounded by the need to con­sider more than two pos­si­ble out­comes, and the is­sues that re­sult from that — that is, Con­dorcet cy­cles, Ar­row’s the­o­rem, the Gib­bard-Sat­terth­waite the­o­rem, etc.

But I didn’t tell the story of the Prob­lem of Points just to say what the an­swer to retroac­tively al­lo­cat­ing re­spon­si­bil­ity isn’t. It was also a good ex­am­ple of a prob­lem speci­fied in loose prac­ti­cal terms (who “de­serves” what share of the pot), where just trans­lat­ing into rigor­ous math terms turned out to be more than half the work of find­ing a solu­tion that pretty much any­one can agree is “cor­rect”. So, if the goal is to solve a similar loosely-defined prob­lem, our first step should be to restate why we want a solu­tion and what char­ac­ter­is­tics we know that solu­tion should have.

We’re try­ing to es­ti­mate the ex­pected mean squared er­ror (squared bias plus var­i­ance) of a vot­ing method, by look­ing at an out­come of that vot­ing method. All we nec­es­sar­ily have is the bal­lots and the win­ners; we imag­ine that came from some kind of de­ci­sion pro­cess that might be rep­re­sented as vot­ers and can­di­dates lo­cated in an ide­olog­i­cal space where vot­ers tend to pre­fer can­di­dates who are closer, but we don’t have enough in­for­ma­tion to re­li­ably place vot­ers and can­di­dates in that space. So we model the vot­ing sys­tem as hav­ing had two steps: group­ing vot­ers into (pos­si­bly-over­lap­ping) weighted con­stituen­cies, and choos­ing a win­ner in each con­stituency. As­sign­ing re­spon­si­bil­ity means giv­ing each voter a weight in each con­stituency — how much power they ex­er­cised to help elect that win­ner. If we do a good job as­sign­ing re­spon­si­bil­ity, then it should be easy to find coun­ter­fac­tual out­comes where chang­ing re­spon­si­ble votes changed the win­ner, but harder to find coun­ter­fac­tu­als where chang­ing non-re­spon­si­ble votes did so.

When I put it like that, it may sound as if I'm solving the wrong problem. I'm trying to find an "allocated responsibility" measure as a one-dimension-per-candidate approximation to the more-complex causal structure of the actual voting method; so that I can then combine that with some approximation of how candidates are selected within the constituencies; so I can use that to approximate the expected MSE between the voters' (exemplariness-weighted) ideological distribution and that of the legislators; so I can use that to approximate the utility lost between those two things; when, to start out with, the voters' ideological distribution was itself an approximation to the utility-maximizing outcome, justified by the Condorcet Jury Theorem. There are already 5 different levels/types of approximation in there, and I'm not finished yet; surely, there must be a better way?

I be­lieve there prob­a­bly isn’t. That is to say: I be­lieve that each of the ap­prox­i­ma­tion steps so far is plau­si­bly the low­est-er­ror way to ac­com­plish its task. Speci­fi­cally, all of them are “neu­tral” in terms of ide­olog­i­cal bias in a way that al­ter­nate ap­prox­i­ma­tion schemes of­ten are not; and all of them are, plau­si­bly, among the low­est-var­i­ance pos­si­ble ap­prox­i­ma­tion strate­gies that have such neu­tral­ity. Just as in Churchill’s fa­mous dic­tum that “democ­racy [that is, the last 2-3 ap­prox­i­ma­tion steps in that chain] is the worst form of Govern­ment ex­cept for all those other forms that have been tried from time to time”, I be­lieve that re­spon­si­bil­ity-al­lo­ca­tion may be the worst way to eval­u­ate mul­ti­win­ner vot­ing meth­ods ex­cept for all the other forms that have been tried (or have oc­curred to me over my years of think­ing about this) from time to time.

Fur­ther­more, while this con­cept of nu­mer­i­cally al­lo­cat­ing re­spon­si­bil­ity is used in this case as one step in a larger strat­egy of eval­u­at­ing vot­ing meth­ods, I think that nu­mer­i­cally-al­lo­cated re­spon­si­bil­ity could be use­ful in other con­texts. I’ve already men­tioned that it re­lates to le­gal li­a­bil­ity. I also think it could be use­ful to in­di­vi­d­ual vot­ers in con­sid­er­ing vot­ing tac­tics or even the de­ci­sion of whether it’s worth vot­ing in the first place. I’ll say a bit more about that lat­ter is­sue later.

Ten­ta­tive answer

Thanks to Thomas Sepul­chre’s com­ment be­low, I have a solu­tion to this prob­lem that I con­sider “cor­rect”. I’ll copy his ex­pla­na­tion here:

I think there’s ac­tu­ally one way to set rel­a­tive re­spon­si­bil­ity weight which makes more math­e­mat­i­cal sense than the oth­ers. But, first, let’s slightly change the prob­lem, and as­sume that the mem­bers of the group ar­rived one by one, that we can or­der the mem­bers by time of ar­rival. If this is the case, I’d ar­gue that the com­plete re­spon­si­bil­ity for the soup goes to the one who brought the last nec­es­sary el­e­ment.
Now, back to the main prob­lem, where the mem­bers aren’t or­dered. We can set the re­spon­si­bil­ity weight of an el­e­ment to be the num­ber of per­mu­ta­tions in which this par­tic­u­lar el­e­ment is the last nec­es­sary el­e­ment, di­vided by the to­tal num­ber of per­mu­ta­tions.
This method has several qualities: the sum of responsibilities is exactly one; each useful element (each Necessary Element of some Sufficient Set) has a positive responsibility weight, while each useless element has 0 responsibility weight. It also respects the symmetries of the problem (in our example, the responsibility of the pot, the fire and the water is the same, and the responsibility of each ingredient is the same).
In a subtle way, it also takes into account the scarcity of each resource. For example, let's compare the situation [1 pot, 1 fire, 1 water, 3 ingredients] with the situation [2 pots, 1 fire, 1 water, 3 ingredients]. In the first one, in any order, the responsibility goes to the last element, therefore the final responsibility weight is 1/7 for each element. The second situation is a bit trickier, we must consider two cases. First case, the last element is a pot (1/4 of the time). In this case, the responsibility goes to the seventh element, which gives 1/7 responsibility weight to everything but the pots, and 1/14 responsibility weight to each pot. Second case, the last element is not a pot (3/4 of the time), in which case the responsibility goes to the last element, which gives 1/6 responsibility weight to everything but the pots, and 0 to each pot. In total, the responsibilities are 1/56 for each pot, and 9/56 for each other element. We see that the responsibility has been divided by 8 for each pot, basically because the pot is no longer a scarce resource.
Anyway, my point is, this method seems to be the good way to generalize the idea of NESS: instead of just checking whether an element is a Necessary Element of some Sufficient Set, one must count in how many permutations this element is the last necessary element.

I be­lieve that TS prob­a­bly in­de­pen­dently “in­vented” this idea (as a com­pet­i­tive pro­gram­mer, I ex­pect he’d have seen lots of prob­lems and have a good toolbox of solu­tions and strate­gies for re-adapt­ing those solu­tions to new prob­lems). But this solu­tion is ac­tu­ally a ver­sion of the clas­sic (1954) Shap­ley-Shu­bik voter power in­dex.
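Sepulchre's permutation rule can be checked by brute force. The sketch below models his second scenario as two interchangeable pots plus six individually necessary elements — my reading of the comment, since that is the element count which reproduces its 1/56 and 9/56 figures:

```python
from fractions import Fraction
from itertools import permutations

# Two interchangeable pots plus six individually necessary elements
# (an assumed reading of the comment's second scenario).
elements = ["pot1", "pot2"] + [f"need{i}" for i in range(1, 7)]

def sufficient(coalition):
    """The soup is possible iff we have at least one pot and
    all six necessary elements."""
    has_pot = "pot1" in coalition or "pot2" in coalition
    return has_pot and all(f"need{i}" in coalition for i in range(1, 7))

def permutation_responsibility(elements, sufficient):
    """Sepulchre's rule (equivalently, Shapley-Shubik): an element's
    responsibility is the fraction of arrival orders in which its own
    arrival is what first makes the accumulated set sufficient."""
    counts = {e: 0 for e in elements}
    total = 0
    for order in permutations(elements):
        so_far = set()
        for e in order:
            so_far.add(e)
            if sufficient(so_far):
                counts[e] += 1  # e was the last necessary element here
                break
        total += 1
    return {e: Fraction(c, total) for e, c in counts.items()}

resp = permutation_responsibility(elements, sufficient)
print(resp["pot1"], resp["need1"])  # → 1/56 9/56
```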

((Tech­ni­cal note, feel free to skip: the key differ­ence is that origi­nally S-S was made for mea­sur­ing prospec­tive voter power in sin­gle-win­ner elec­tions; in the retroac­tive re­spon­si­bil­ity, multi-win­ner case, we only con­sider coun­ter­fac­tu­als in which bal­lots sup­port a given can­di­date “less” than they did in re­al­ity. This in­volves defin­ing this con­cept of “less sup­port”; ideally this would be done in such a way that the vot­ing method was mono­tonic over bal­lots, but in cases of com­plex vot­ing meth­ods, it may be worth­while to ac­cept some non-mono­ton­ic­ity in or­der to get stric­ter and more-use­ful defi­ni­tions of “less” sup­port. ))

In­ter­est­ingly, the or­di­nary Shap­ley-Shu­bik in­dex can be in­ter­preted as a Bayesian prob­a­bil­ity, by us­ing the right prior (Straf­fin, P. D. The Shap­ley-Shu­bik and Banzhaf power in­dices as prob­a­bil­ities. in The Shap­ley value: es­says in honor of Lloyd S. Shap­ley). From an in­di­vi­d­ual voter’s per­spec­tive, what Straf­fin calls the “ho­mo­gene­ity as­sump­tion” can be stated as a Polya(1,1) prior; in prac­ti­cal terms, this could mean that the voter ex­pects that their own de­ci­sion is acausally re­lated to all other vot­ers’ de­ci­sions, and is us­ing some acausal de­ci­sion the­ory. (For more dis­cus­sion and links on vot­ing and acausal de­ci­sion the­o­ries, see here.)

((The Polya parameters 1,1 there mean a relatively high correlation between the ego voter's decision and that of other voters, though it is slightly weaker than with the Jeffreys prior of Polya(.5, .5). A strictly functional-decision-theory agent would probably expect a weaker correlation with other voters, as they would only consider correlations that came from other voters using very similar decision procedures. That would probably lead to priors more like Polya(total_polled_yesses_across_reputable_polls, total_polled_nos_across_reputable_polls), which would assign much lower power to most voters — more similar to the Banzhaf power index. Personally, I find Polya(1,1) to be a reasonable prior; aside from having an easy-to-calculate algorithm as given by Sepulchre, it feels "about right" for my personal decision theory.))

For the mod­ified, retroac­tive Shap­ley-Shu­bik in­dex we’re us­ing here, I’m not en­tirely sure what prior it’s equiv­a­lent to, or even if there is such a prior. How­ever, in­tu­itively, it seems to me that it’s likely to be some­thing like “you’re cor­re­lated to all the vot­ers who are at least as sup­port­ive as you are of any can­di­date you help elect, us­ing a Polya(1,1) prior”. This idea — that for pur­poses of de­ci­sion the­ory, you should give more con­sid­er­a­tion to your acausal links to other vot­ers if those other vot­ers hold similar un­der­ly­ing prefer­ences to you — is in­tu­itively ap­peal­ing to me.

Shorter “solu­tion” statement

Say you have a full set of bal­lots ℬ and an out­come O. For each in­di­vi­d­ual bal­lot b_i in ℬ, and each set of can­di­dates 𝒸 con­tain­ing ex­actly one win­ner c_j for O, take ℒ(b_i, 𝒸) to be the set of pos­si­ble le­gal bal­lots which sup­port all can­di­dates in 𝒸 “less” than or equal to what bal­lot b_i does. Take k_1...k_V to be a ran­dom per­mu­ta­tion of the in­te­gers 1...V; this is the only ran­dom­ness here so all ex­pec­ta­tions are with re­spect to it. Take K(i) to be that in­te­ger n such that k_n = i. E(...) de­notes ex­pec­ta­tion and I(...) de­notes an in­di­ca­tor func­tion of an event; that is, 1 if that event is true and 0 if it is false.

Voter i’s “ret­ro­spec­tive power” (RP) to elect a mem­ber of the set 𝒸 is:

(V/S) * E( I(some candidate in 𝒸 wins in all elections such that b_{k_1} to b_{k_{K(i)}} are held constant, and b_{k_{K(i)+1}} to b_{k_V} may each be replaced by ballots from ℒ(..., 𝒸)) − I(some candidate in 𝒸 wins in all elections such that b_{k_1} to b_{k_{K(i)-1}} are held constant, and b_{k_{K(i)}} to b_{k_V} may each be replaced by ballots from ℒ(..., 𝒸)) ).

(To do: define a bit more no­ta­tion up front so I can write that in clean LaTeX.)

When calculating the RP to elect an individual candidate c_j, use the power to elect a set, choosing 𝒸 as follows: of all the possible sets 𝒸 which include c_j, choose the one with the smallest measure among those which assign RP to the fewest voters.

The con­stant V/​S (num­ber of vot­ers di­vided by num­ber of seats) is added up front so that the av­er­age RP for each voter will, by con­struc­tion, always be 1. Since the av­er­age is con­stant, min­i­miz­ing the stan­dard de­vi­a­tion will be the same as min­i­miz­ing the mean of squares.
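Here's a brute-force sketch of this definition for the simplest possible case: single-winner plurality with five hypothetical ballots. "Less support" for A is modeled as a vote for the rival, and a tie is treated as not a guaranteed win:

```python
from fractions import Fraction
from itertools import permutations

ballots = ["A", "A", "A", "B", "B"]   # hypothetical plurality election; A wins
V, S = len(ballots), 1

def guaranteed_win(fixed_idx):
    """Is A certain to win if the fixed ballots stay as cast, while every
    other ballot may be replaced by one supporting A as little as
    possible (here: a vote for B)?  A tie is not a guaranteed win."""
    a = sum(1 for i in fixed_idx if ballots[i] == "A")
    return a > V - a

power = [Fraction(0) for _ in range(V)]
orders = list(permutations(range(V)))
for order in orders:
    fixed = set()
    for i in order:
        before = guaranteed_win(fixed)
        fixed.add(i)
        if guaranteed_win(fixed) and not before:
            power[i] += 1       # voter i was pivotal in this order
            break

rp = [Fraction(V, S) * p / len(orders) for p in power]
print(rp)  # A-voters get 5/3 each, B-voters 0; the average RP is 1
```

Reassuringly, this matches the symmetry argument for plurality: each of A's 3 voters gets equal retrospective power, B's voters get none, and the average over all voters is exactly 1.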

How­ever, this mean of squares, while it is nat­u­rally rele­vant as an es­ti­ma­tor of the util­i­tar­ian loss, is hard to ex­plain as-is; for in­stance, its nat­u­ral units are com­pletely ab­stract. Thus, in the over­all “Effec­tive Vot­ing” met­ric be­low, I’ll find a (mono­tone) trans­for­ma­tion from this into a more-in­ter­pretable ver­sion.

Equal­iz­ing voter power (RP)

Now that we have a definition of retrospective voter power, we can use it as an ingredient in a metric of multi-winner voting methods. That is, all else equal, we might want to avoid the following:

  • Unequal voter power (high stan­dard de­vi­a­tion of RP across vot­ers), be­cause this would lead to “loss” di­rectly.

  • Too much over­lap be­tween the sets of vot­ers re­spon­si­ble for elect­ing differ­ent can­di­dates, be­cause this would tend to be in­di­rectly as­so­ci­ated with lower fidelity of rep­re­sen­ta­tion. To vi­su­al­ize this in ge­o­graphic terms, two rep­re­sen­ta­tives from a large dis­trict would tend to live farther from their vot­ers than one rep­re­sen­ta­tive from a dis­trict half the size.

I’m go­ing to fo­cus for now on just the first of these two. We’ll deal with “effec­tive choice” (which al­lows vot­ers to en­sure their reps are “close” to them ide­olog­i­cally) sep­a­rately. In par­tic­u­lar, it’s pos­si­ble that “cake-cut­ting-like” al­gorithms could si­mul­ta­neously im­prove ex­em­plari­ness, have lit­tle cost in terms of effec­tive choice, and cause high over­lap in mea­sured voter power; so di­rectly avoid­ing over­lap is not nec­es­sar­ily a good thing.

Phrag­mén and Thiele

In fact, there is already a proportional voting method that is designed to ensure voters have roughly equal power, and that defines that power in a way that's entirely compatible with, though less generally-applicable than, our definition above. It was first considered in the 1890s by two of the earliest theorists of proportional voting, the Scandinavian mathematicians Edvard Phragmén and Thorvald Thiele. However, since neither of them ultimately settled on this method as their favorite, it fell into obscurity, occasionally being reconsidered or reinvented (by Sainte-Laguë in 1910, Ebert in 2003, and Mora in 2016).

The method was in­vented in a Swedish con­text of weak par­ties and ap­proval-style bal­lots. In the case of such ap­proval bal­lots, there’s an ob­vi­ous way to calcu­late each voter’s re­spon­si­bil­ity for elect­ing each win­ner: 1/​p, where p is the pro­por­tion of bal­lots that ap­proved that win­ner. Thus, one of the meth­ods Phrag­mén con­sid­ered was to ex­plic­itly try to min­i­mize the var­i­ance/​stan­dard de­vi­a­tion of vot­ers’ to­tal power. Since the sum of to­tal power is con­stant, this is equiv­a­lent to min­i­miz­ing the sum of each vot­ers’ squared power; known as the “squared load”.

Note that this is an optimization problem, and a naive algorithm for finding the optimum might have to check a combinatorial explosion of possible winner sets. Thus, in practice, the method would probably be done sequentially: at each step, elect one new candidate, minimizing the sum of squared loads.
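A sketch of that sequential procedure, with hypothetical approval ballots. Note this is the equal-load-sharing (Ebert-style) variant; Phragmén's own methods allow loads to be redistributed unequally:

```python
def sequential_squared_load(ballots, candidates, seats):
    """Sequential sketch of the squared-load idea: at each step, elect
    the candidate minimizing the resulting sum of squared voter loads,
    where electing c adds a load of 1/(number of c's approvers) to each
    voter who approved c."""
    loads = [0.0] * len(ballots)
    elected = []
    for _ in range(seats):
        best, best_cost, best_loads = None, None, None
        for c in candidates:
            if c in elected:
                continue
            approvers = [i for i, b in enumerate(ballots) if c in b]
            if not approvers:
                continue
            trial = loads[:]
            for i in approvers:
                trial[i] += 1 / len(approvers)
            cost = sum(l * l for l in trial)
            if best_cost is None or cost < best_cost:
                best, best_cost, best_loads = c, cost, trial
        elected.append(best)
        loads = best_loads
    return elected

# Hypothetical approval election: 4 voters approve bloc {A1, A2},
# 2 voters approve B1; with 3 seats the big bloc gets 2, the small 1.
ballots = [{"A1", "A2"}] * 4 + [{"B1"}] * 2
print(sequential_squared_load(ballots, ["A1", "A2", "B1"], 3))  # → ['A1', 'B1', 'A2']
```

Note how the method elects B1 before the bloc's second candidate: giving the A-voters a second representative right away would have made their loads, and hence the sum of squared loads, too unequal.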

I men­tion this method not be­cause I think it’s bet­ter than other pro­por­tional meth­ods, but be­cause it shows that the idea of “min­i­miz­ing var­i­ance in voter power” goes all the way back to the ori­gins of pro­por­tional vot­ing meth­ods. Over­all, I think that ap­proval-bal­lot-based pro­por­tional meth­ods tend to lead to too much “over­lap” and thus to poor effec­tive choice, un­less voter strat­egy is im­plau­si­bly pre­cise. Also, as Phrag­mén him­self pointed out, it of­ten isn’t “op­ti­mal” us­ing this rule to pick a can­di­date who is ap­proved by liter­ally all the vot­ers, as it leads to ap­par­ent “im­bal­ance” in voter power.

Mea­sur­ing “Effec­tive Choice”, sep­a­rate from power

I’ve already men­tioned the idea of “effec­tive choice” sev­eral times, in con­nec­tion with voter power over­lap.

Imagine two raw pizzas, where all the curls of shredded cheese are voting on which 3 pepperonis should represent them. On one pizza, the voting method gives each of three equal slices the power to elect one pepperoni from around the center of the slice. The cheese curls all have equal voting power, and the average distance between a cheese curl and its representative pepperoni is about as low as it can possibly be.

Now, on the other pizza, one of the three slices gets to elect its pepperoni, as before. But the other two slices combine to elect two pepperonis, both close to their combined center of gravity. Since those two pepperonis are both farther from the centers of the individual slices, the average distance between a cheese curl and its representative pepperoni is higher. Clearly, this is a worse outcome, even though the retrospective power (RP) is still perfectly even.

Clearly, what I’m driv­ing at is that effec­tive choice is some­how re­lated to the av­er­age “dis­tance” be­tween a voter and their rep­re­sen­ta­tive. But just us­ing av­er­age dis­tance would mean that the qual­ity of elec­tions would vary be­tween a small pizza and a large one, even if the out­come was ex­actly ho­molo­gous. So in­stead, we’ll define the effec­tive choice for one voter i us­ing a nor­mal­ized/​stan­dard­ized quan­tity: the por­tion rho(v_i, r_j) of other vot­ers who are farther from i’s rep­re­sen­ta­tive than i is.

That’s all very well and good for ar­tifi­cial ex­am­ples like the pizza, where we can some­how know ex­actly where each voter and can­di­date is lo­cated in ide­ol­ogy space. In the real world, though, that’s im­pos­si­ble, for nu­mer­ous rea­sons. So we’ll have to use the bal­lots to in­fer dis­tances, or at least, to in­fer the ex­pected/​av­er­age dis­tance.

In or­der to make such in­fer­ences, we will have to make as­sump­tions about the voter dis­tri­bu­tion. Per­haps the most fun­da­men­tal of those as­sump­tions, the one which will in­form all the oth­ers, is our be­lief about the effec­tive di­men­sion­al­ity of the voter space. In this case, I be­lieve that it’s best to as­sume one-di­men­sion­al­ity; this is not nec­es­sar­ily re­al­is­tic, but un­like any other di­men­sion­al­ity, it leads to rea­son­ably-ob­vi­ous and easy-to-vi­su­al­ize ways to make the rest of the as­sump­tions. Also, it tends to be “friendly” to vot­ing meth­ods, giv­ing a rough “best-case” anal­y­sis; I think this is fine.

Note that we've defined effective choice as the portion of voters that are farther from the candidate, not closer, so that it's 1 at best and 0 at worst.

Ad­ding a di­gres­sion: The defi­ni­tion we’ve given, in terms of “pro­por­tion of vot­ers who are farther from the can­di­date”, can be thought of as can­di­date-cen­tric; that is, as­sum­ing af­finity be­tween can­di­dates and vot­ers is sym­met­ric, it pri­ori­tizes situ­a­tions where ev­ery can­di­date gets “their fa­vorite vot­ers”. But that is not pre­cisely what we’d want, in an ideal world. For in­stance, in a one-di­men­sional ide­olog­i­cal space with vot­ers dis­tributed uniformly from 0 to 100, it does not dis­t­in­guish be­tween a two-win­ner elec­tion where the win­ners are at 0 and 100, from one where they’re at 25 and 75. Note that in this ex­am­ple, the prob­lem arises when the can­di­dates al­ign with the most-ex­treme frac­tion of vot­ers. With more win­ners, the num­ber of win­ners in this po­ten­tially-ex­treme po­si­tion would be lower.

There are two ob­vi­ous ways we might try to ad­dress this, but ei­ther one has is­sues, so by de­fault we’ll use nei­ther.

The first way would be to make some rough “para­met­ric” di­men­sional as­sump­tions to back out from “num­ber of vot­ers that are farther from the given can­di­date than the given voter is” to “dis­tance be­tween the given can­di­date and the given voter”. For in­stance, if ide­ol­ogy space is 2-di­men­sional, we might as­sume the dis­tance is (1-rho(v_i,r_j))^(1/​2), be­cause the num­ber of vot­ers who can fit in the space closer to the can­di­date is pro­por­tional to the square of the dis­tance. In gen­eral, then, for d di­men­sions, and sub­tract­ing from 1 so that 100% is good, this would be 1-(1-rho(v_i,r_j))^(1/​d). We could also use d=0.5, not to rep­re­sent “half of a di­men­sion”, but merely to get a kind of squared-dis­tance met­ric that best fa­vors find­ing a rep­re­sen­ta­tive at the av­er­age ide­ol­ogy for their con­stituents. Still, all of this re­lies on some strong as­sump­tions, and so we’ll use d=1 un­less oth­er­wise stated.

The second way would be to try to get a voter-centric measure; instead of "proportion of voters who are farther from the given candidate", to measure "proportion of candidates who are farther from the given voter". The problem with this is, it depends on the distribution of candidates. This makes it completely unworkable for real-world elections; but for in silico Monte Carlo experiments, we can simply create a candidate for each voter, ensure that all those candidates are ideal in terms of any non-ideological dimensions that all voters agree on, and then count. Note that while this counting itself will not be affected by the dimensionality of the ideological space, in order to do it, we will have had to create an entire in silico election, with its own ideology space of some arbitrary dimension. Also note that higher-dimensional ideological spaces will require exponentially more seats in the legislature to get the effective choice above a given fixed threshold, even using an ideal voting method; this is not true for the candidate-centric view, as Voronoi diagrams are possible in any dimensionality of space.

So, define EC_d(v_i, r_j) = 1-(1-rho(v_i, r_j))^(1/d). (For d=1 this reduces to rho itself, the proportion of voters farther away.)

For ex­am­ple, say we had an IRV (aka sin­gle-win­ner ranked choice vot­ing) elec­tion with 4 can­di­dates and the fol­low­ing votes:

30 bal­lots: A>B>C>D

10 bal­lots: B>A>C>D

5 bal­lots: B>C>A>D

5 bal­lots: B>C>D>A

9 bal­lots: C>B>A>D

11 bal­lots: D>C>A>B

30 bal­lots: D>C>B>A

The outcome is that A wins, after C and B are eliminated. To find the average EC_1 of each group, we'd begin by lining up the groups from "closest to A" to "farthest from A". Using the simple assumption that voters are closer to A the higher preference they give A, and on average the same distance when the preference level is the same, gives the following order:

A>B>C>D (30 ballots total; so the group has an average EC_1 of 85%)

B>A>C>D (10 total, so group average EC_1 of 65%)

B>C>A>D, C>B>A>D, and D>C>A>B (25 total, so cross-group average EC_1 of 47.5%)

B>C>D>A and D>C>B>A (35 total, average EC_1 of 17.5%)
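These group averages follow mechanically from the one-dimensional line-up; a small sketch (mine, not from the original) that reproduces them:

```python
def group_avg_ec1(group_sizes):
    """Average EC_1 for contiguous voter groups, ordered closest-to-farthest
    from the winner: a voter at position k (out of n) has EC_1 = 1 - k/n,
    so a group's average is 1 minus its midpoint position over n."""
    n = sum(group_sizes)
    averages, start = [], 0
    for size in group_sizes:
        averages.append(1 - (start + size / 2) / n)  # midpoint of the group
        start += size
    return averages
```

Running `group_avg_ec1([30, 10, 25, 35])` on the four groups above recovers 85%, 65%, 47.5%, and 17.5%.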

Aver­age Voter Effec­tive­ness: a com­bined pro­por­tion­al­ity metric

We now have all the in­gre­di­ents nec­es­sary for mak­ing an over­all met­ric for multi-win­ner out­comes, which I’m dub­bing “Aver­age Voter Effec­tive­ness” (AVE).

To be­gin with, note that “stan­dard de­vi­a­tion of RP” is mono­ton­i­cally re­lated to “av­er­age squared load”, which in turn could be viewed as “weighted av­er­age lob­by­ing power”. That is to say, av­er­age vot­ing power, where the av­er­age is it­self weighted by vot­ing power.

Imagine a representative who was unsure how to vote on some question, so she decided to "ask a constituent". She draws a voter at random, with probability proportional to that voter's retrospective power to elect her, then listens to them. Furthermore, say that the legislature as a whole let a randomly-chosen representative have dictatorial say over this issue. So ultimately, one voter will be deciding, and the question is: what is the expected total retrospective voting power of that voter?

For instance, if half the voters split the voting power evenly, then the "average squared load"/"weighted average lobbying power" is 2²/2=2. The people with power have, on average, twice as much power as if everybody were equal. And in this simple case, one minus the reciprocal of that (1 - 1/2 = 50%) is the proportion of wasted votes.

What I'm saying is: we can think of the reciprocal of the average squared load as the "Effective Voter Equality" (EVE), and one minus that as the "Effective Voter Inequality". That inequality is one form of vote wastage: votes that have little or no impact on the outcome.

We can then combine this with another form of vote wastage: for those votes that did have an impact, how much choice did they really have? How much of the ideology space were they able to effectively screen out? This is EC. To avoid double-counting wasted votes, we can take the average EC, weighted by total voting power, and multiply the unwasted votes by that number.

So in a single-winner plurality election won by a bare majority, half the voters split the voting power equally, so vote wastage is almost 50%; and those whose votes do matter are, on average, ideologically closer to the winner than 75% of the other voters, so that's another 25% deduction to effective votes. In other words, the overall vote wastage would be 62.5%, or equivalently, 37.5% average voter effectiveness.
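The combination just described can be written as a small helper (a sketch of the definitions above, not an official formula; each group is a (voter fraction, retrospective power, EC) triple):

```python
def average_voter_effectiveness(groups):
    """groups: list of (voter_fraction, retrospective_power, effective_choice).
    Returns (EVE, power-weighted average EC, AVE = EVE * weighted EC)."""
    avg_sq_load = sum(f * p ** 2 for f, p, _ in groups)
    eve = 1 / avg_sq_load  # Effective Voter Equality
    # EC averaged with weights proportional to squared power, to avoid
    # double-counting votes already discounted as wasted.
    weighted_ec = sum(f * p ** 2 * ec for f, p, ec in groups) / avg_sq_load
    return eve, weighted_ec, eve * weighted_ec
```

For the bare-majority plurality example, `average_voter_effectiveness([(0.5, 2.0, 0.75), (0.5, 0.0, 0.0)])` gives EVE of 50%, weighted EC of 75%, and AVE of 37.5%.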

Where to, next?

So far, I’ve spent about 10,000 words to (mostly) define a fea­si­ble met­ric for eval­u­at­ing the qual­ity of vot­ing meth­ods. Next, I’m go­ing to look at some ac­tual multi-win­ner vot­ing meth­ods and try to es­ti­mate how they’d do by that met­ric. Note that this will in­volve some rough-and-ready ap­prox­i­ma­tions, but I be­lieve it will at least be pos­si­ble to class meth­ods into at least two qual­ity lev­els (with ger­ry­man­dered first-past-the-post far be­hind most al­ter­na­tives). I also hope it will be pos­si­ble to give num­bers that un­der­mine some spe­cific bad ar­gu­ments that peo­ple use in this area.

Vote wastage in sin­gle-win­ner plurality

To warm up, let’s calcu­late effec­tive voter equal­ity and effec­tive choice for some sim­ple sce­nar­ios un­der sin­gle-win­ner plu­ral­ity.

First off, the simplest possible case: a 2-candidate election. The winner will have X>.5 (that is, over 50%) of the votes. Retrospective voting power will of course be 0 for the losing votes. For each of the winning ones it will be 1/X; that is, between 1 (for a unanimous outcome) and just under 2 (for the barest majority). Average effective choice for represented voters will be between 50% (unanimous) and 75% (bare majority). All told, average voter effectiveness will be between 37.5% (bare majority) and 50% (unanimous).
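For this two-candidate case, the whole calculation collapses to a one-line closed form (my derivation from the definitions above): EVE works out to X, the winners' average EC is 1 - X/2, so AVE = X*(1 - X/2).

```python
def plurality_ave(x):
    """AVE for a two-candidate plurality race won with vote share x.
    Winning voters have power 1/x and losers 0, so the average squared
    load is x * (1/x)**2 = 1/x, giving EVE = x; the winners' effective
    choice runs uniformly from 1.0 down to 1-x, averaging 1 - x/2."""
    return x * (1 - x / 2)
```

At x = .5 this gives the 37.5% bare-majority figure, and at x = 1 the 50% unanimous figure.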

Now, consider a 3-way race in which no candidate has a majority, and the margin between 2nd and 3rd place is more than twice the difference between 1st place and 50%. For an example, you could imagine a simplified version of the US 2000 presidential election, with national popular vote and Gore 49%, Bush 48%, and Nader 3% (thus, Gore wins). Imagine all the votes start out "face down", and we turn them "face up" one at a time in a random order. As we do so, set variables G/B/N to be the face-up Gore/Bush/Nader votes as a percent of the total, and D to be the face-down votes. Gore is guaranteed to win iff G>B+D. Turning over a Gore vote increases the left-hand side of that inequality by 1 and decreases the right-hand side by 1 (LHS+1, RHS-1). Turning over a Nader vote leaves the LHS unchanged and decreases the RHS by 1 (LHS=, RHS-1). Turning over a Bush vote changes neither side (LHS=, RHS=).

So, obviously, the Bush voters have no retrospective power to elect Gore. I still have to do the math more carefully, but I'm pretty sure that as the total number of voters grows, the retrospective power of the Nader voters converges to precisely one half that of the Gore voters, just as their power to move the difference between the LHS and RHS totals is half as big. (I've worked out one simple example fully but I still don't have an inductive proof of this.) Thus a Gore voter would have retrospective power 1/.505, and a Nader voter, 1/1.01. This leads to an effective voter equality of 51.26% (1/(.49 * (1/.505)^2 + .03 * (1/1.01)^2)).

The effective choice depends on how often a Bush voter is closer to Gore than a Gore voter is. In the "median voter theorem" case, where Gore and Bush were both ideologically close to the threshold between them, the average EC_1 would be .53 for Gore voters and .06 for Nader voters. If Gore were ideologically like his median supporter, it would be .755 and .48. And if Nader voters were all "left wing" and Gore was left of his median supporter, it would be .765 and .505. Using the second (middle) assumptions, the overall AVE would be 38.49%: (1/(.49 * (1/.505)^2 + .03 * (1/1.01)^2)) * (.49 * (1/.505)^2 * .755 + .03 * (1/1.01)^2 * .48)/(.49 * (1/.505)^2 + .03 * (1/1.01)^2).
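Those figures can be verified numerically; a quick script (mine, plugging in the powers and the middle-assumption EC values above):

```python
# Check of the Gore/Bush/Nader example, "middle" ideological assumption.
gore_share, nader_share = 0.49, 0.03
gore_power = 1 / 0.505            # retrospective power of a Gore voter
nader_power = 1 / 1.01            # half as much, per the argument above

# Total retrospective power should sum to 1 (one seat being filled).
assert abs(gore_share * gore_power + nader_share * nader_power - 1) < 1e-9

avg_sq_load = gore_share * gore_power ** 2 + nader_share * nader_power ** 2
eve = 1 / avg_sq_load             # effective voter equality
ec = (gore_share * gore_power ** 2 * 0.755
      + nader_share * nader_power ** 2 * 0.48) / avg_sq_load
ave = eve * ec
```

This yields an EVE of about 51.26% and an AVE of about 38.49%, matching the text.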

None of this is at all sur­pris­ing, but it’s com­fort­ing to be able to put it on a rigor­ous foot­ing.

And some of the conclusions even for single-winner plurality might be a bit more surprising than that. For instance, in a 3-way election with totals 48%, 26%, 26%, voters for both of the losing candidates count as having had retrospective power to elect the winner; 1/2.44, or 1/4 as much as the winning voters, to be exact. (When the losers are nearly but not precisely tied, the power of those voters depends on the overall size of the electorate in a way that makes calculating exact powers tedious. The larger losing group will range somewhere between 1/4 and 1/2 the power per voter of the winning group.)

Real-world multi-win­ner vot­ing meth­ods (un­der con­struc­tion)

I’m go­ing to come back to all the ideas above. But it’s fi­nally time to dis­cuss some spe­cific multi-win­ner vot­ing meth­ods.

Sin­gle-seat plu­ral­ity (aka sep­a­rate dis­tricts, as in US House, or par­li­a­ment in Canada/​UK/​In­dia):

This is not a proportional method, but we can still approximate its average voter effectiveness.

Let's say that the two-way vote share by district is drawn from a beta(3,3) distribution; this is a rough by-eye approximation to a few years of recent US data. That means that the average winning two-way margin is about 30 points, with a standard deviation of about 10 points; that is, most districts are safe, with winning vote totals of around 65% or more. That would mean that an average district has 65% of voters with 1/.65=1.54 retrospective power, and 35% wasted votes with 0 voting power. At best, those 65% have an average 67.5% effective choice, distributed uniformly from 100% down to 35%. This multiplies out to an overall AVE of 44%. (Note that the main steps in this example are nearly linear, so the AVE of the average district is close to the average AVE across all districts.)
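That back-of-envelope figure can be sanity-checked with a quick simulation (a sketch under the stated beta(3,3) assumption, folding each draw to the winner's side; the simulated average lands within a point or two of 44%, since the per-district AVE, share*(1 - share/2), is only approximately linear in the winner's share):

```python
import random

def district_ave(share):
    """Single-seat district AVE: the winner's voters (fraction `share`)
    each have power 1/share, so EVE = share; their effective choice runs
    uniformly from 1.0 down to 1 - share, averaging 1 - share/2."""
    return share * (1 - share / 2)

random.seed(0)
winner_shares = [max(s, 1 - s)
                 for s in (random.betavariate(3, 3) for _ in range(100_000))]
avg_share = sum(winner_shares) / len(winner_shares)   # roughly 0.65
avg_ave = sum(map(district_ave, winner_shares)) / len(winner_shares)
```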

Sin­gle trans­ferrable vote (STV):

Each voter ranks the can­di­dates in prefer­ence or­der. We define a quota, the num­ber of votes you must beat to win; this may be ei­ther the Hare quota (tends to break ties in fa­vor of smaller par­ties, be­cause larger par­ties use up more votes on their early seats), or the Droop quota (op­po­site of above). Votes are always tal­lied for their top non-elimi­nated choice.

If a can­di­date has at least one quota of votes, they are elected (and thus “elimi­nated” from fur­ther con­sid­er­a­tion), and 1 quota worth of their tal­lied votes are “ex­hausted” (re­moved from all later tal­lies). This has the effect of trans­fer­ring their ex­cess votes.

If no can­di­date has a quota, and there are still seats left to fill, then the can­di­date with the low­est tally is elimi­nated (and so their votes are trans­ferred).

To understand how the components of Average Voter Effectiveness would work with STV, let's take a simple scenario, with S=9 seats, and using the Droop quota (10% in this case) as is most customary. Say the candidates are the capital letters, and we have the following ballots:

55%: A>B>C>D>E>F>G>H>.… (alpha­bet­i­cal)

36%: Z>Y>X>W>V>U>… (re­verse alpha­bet­i­cal)

9%: E>T>A>O>I>N>.… (in or­der of the Lino­type key­board, based on let­ter fre­quency in English)

STV elects A and Z im­me­di­ately, since both have more than a quota. Ex­cess votes spill over to B and Y, also elected; then C and X, also elected; then W and D, of whom only D has a full quota; then E; and fi­nally F. At this point, ABCDE and ZYX have been elected (5 seats from the be­gin­ning of the alpha­bet and 3 from the end), and F, W, and T have 5%, 6%, and 9%, re­spec­tively. At this point, all other let­ters are elimi­nated be­cause their tal­lies are 0. Then F is elimi­nated, and those votes trans­fer to the next sur­viv­ing can­di­date in alpha­bet­i­cal or­der, T, who then has a quota and gets the last seat. So the fi­nal win­ner set is ABCDETXYZ.
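That walkthrough can be reproduced with a small simulation of the simplified STV rules above (my own sketch, under one consistent reading of the transfer rules: whole blocs of identical ballots, one candidate resolved per round, and a winner's surplus quota exhausted from its supporting blocs in their listed order):

```python
import string

def stv(blocs, seats, quota):
    """blocs: list of (weight, preference list) pairs.  Each round: if the
    top tally reaches the quota, elect that candidate and exhaust one quota
    of its votes; otherwise eliminate the lowest-tallying candidate."""
    blocs = [[w, prefs] for w, prefs in blocs]
    eliminated, elected = set(), []
    while len(elected) < seats:
        tallies, backers = {}, {}
        for i, (w, prefs) in enumerate(blocs):
            for c in prefs:  # tally each bloc for its top surviving choice
                if c not in eliminated and c not in elected:
                    tallies[c] = tallies.get(c, 0) + w
                    backers.setdefault(c, []).append(i)
                    break
        top = max(tallies, key=tallies.get)
        if tallies[top] >= quota:
            elected.append(top)
            remaining = quota
            for i in backers[top]:        # exhaust in listed bloc order
                used = min(blocs[i][0], remaining)
                blocs[i][0] -= used
                remaining -= used
        else:
            eliminated.add(min(tallies, key=tallies.get))
    return elected

alpha = list(string.ascii_uppercase)
blocs = [(55, alpha),                                 # alphabetical
         (36, alpha[::-1]),                           # reverse alphabetical
         (9, list("ETAOINSHRDLUCMFWYPVBGKQJXZ"))]     # Linotype order
winners = stv(blocs, seats=9, quota=10)
```

The election order differs slightly from the narrative (surpluses are processed highest-tally-first), but the winner set matches: ABCDETXYZ.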

For candidate A, any 10% from the alphabetical voters would be a sufficient set, so those voters all had 1/(9*.55) power to elect. For candidate B, any 20% of that group would be a sufficient set, so those voters had the same 1/(9*.55) power to elect. Similarly for candidates C and D (with 30% and 40% being the SS magnitude). So far, those voters have 4/9 * 1/.55 RP.

For can­di­date E, any set of 50% from the com­bined alpha­bet­i­cal and fre­quen­tist vot­ers would be suffi­cient, so those vot­ers each get 1/​(9 * .64).

For can­di­date T, char­ac­ter­iz­ing suffi­cient sets is a bit more com­plex. To elect T alone, a suffi­cient set would be si­mul­ta­neously 60% from the com­bined alpha­bet­i­cal and fre­quen­tist vot­ers, and 40% from the com­bined anti-alpha­bet­i­cal and fre­quen­tist vot­ers; for in­stance, 52% alpha, 32% anti-alpha, and 8% freq.

But those sets are at least 91% in to­tal mag­ni­tude. Can we get a smaller num­ber of suffi­cient vot­ers by ex­pand­ing the set of “similar” can­di­dates? Yes! To elect some mem­ber of the set {F...U} (that is, to en­sure W doesn’t win the last seat), it suffices to have one quota (10%) of votes from the alpha­bet­i­cal or fre­quen­tist camps re­main­ing af­ter 5 alpha­bet­i­cal can­di­dates have been elected. That means a to­tal of 60% from that 64% of vot­ers. Similarly, to elect some mem­ber of {G...V}, it would suffice to have 40% from the 45% com­bined anti-alphas and freqs. Since those are the small­est “suffi­cient” voter sets for a can­di­date set in­clud­ing T, we use that for our power calcu­la­tion.

Assigning RP to the anti-alphabetical voters for ZYX is as straightforward as for ABCD; those voters get a total of 3/9 * 1/.36.

This leaves the fol­low­ing RPs:

55% Alphabetical: 4/9 * 1/.55 + 1/9 * 1/.64 = 0.98

36% Anti-alphabetical: 3/9 * 1/.36 + 1/9 * 1/.45 = 1.17

9% Frequentist: 1/9 * 1/.64 + 1/9 * 1/.45 = 0.42
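As a sanity check on these totals (my arithmetic, using the per-seat powers assigned above), the bloc-size-weighted powers should sum to exactly 1:

```python
# Retrospective power per voter in each bloc (9 seats; each seat's power
# totals 1/9, split over the sufficient set's share of the electorate).
rp_alpha = 4/9 * 1/0.55 + 1/9 * 1/0.64   # A,B,C,D alone, plus a share of E
rp_anti  = 3/9 * 1/0.36 + 1/9 * 1/0.45   # Z,Y,X alone, plus a share of T's win
rp_freq  = 1/9 * 1/0.64 + 1/9 * 1/0.45   # shares of E and of T's win

# Weighted by bloc sizes, total power should come to 1.0.
total = 0.55 * rp_alpha + 0.36 * rp_anti + 0.09 * rp_freq
```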

I think that, given these votes, this method, and this out­come, this as­sign­ment of voter power is not quite ideal; I’d pre­fer an al­gorithm which gives the fre­quen­tists more credit for get­ting their sec­ond-choice T than it gives to the anti-alpha­bet­i­cals for en­sur­ing F loses. I can even imag­ine how to tweak the EVE defi­ni­tion to im­prove that by ex­plic­itly re­duc­ing over­lap in as­signed voter power when­ever pos­si­ble. But I think this is close enough; the ex­tra com­plex­ity of such tweaks (both in terms of defi­ni­tion, and, more cru­cially, in terms of com­putabil­ity) would not be worth it.

So this is 96.4% effective votes (1/(.55*.98^2 + .36*1.17^2 + .09*0.42^2)).

Including the EC, we get 1/(.55*.98^2 + .36*1.17^2 + .09*0.42^2) * (.55*.98^2*(1-.55/2) + .36*1.17^2*(1-.36/2) + .09*0.42^2*(1-.09/2))/(.55*.98^2 + .36*1.17^2 + .09*0.42^2) = 74.6% AVE overall. (The formula I gave there took some shortcuts, neglecting to separately count the EC for the small slice of voting power that the anti-alphabeticals had to elect T, but this will not materially change the result.) This is already substantially better than the 44% we saw for single-seat districts. And note that, for simplicity, we used a scenario with large blocks of party-line voters; if, instead, the voters had a more-realistic variety in their first-choice preferences while still grouping into the same three overall blocs, average voter effectiveness might be as high as 80.5%.

Closed list pro­por­tional (a la Is­rael)

In this vot­ing method, vot­ers just choose a party. Each party which gets more than some min­i­mum thresh­old of votes is as­signed seats pro­por­tional to their votes. To de­cide who gets those seats, each party has pre-reg­istered an or­dered list, so if they get four seats, those go to the first four peo­ple on their list.

Let’s use the April 2019 Is­raeli leg­is­la­tive elec­tion as an ex­am­ple. I chose this one be­cause it shows the effect of the thresh­old more than the two more-re­cent “re­match” elec­tions; so the es­ti­mate of vote wastage that I get will be

Phrag­men’s method:

Ebert/​Phrag­men method:

War­ren Smith’s “phi vot­ing”:



Real-world pro­por­tional vot­ing re­form (empty)

Ger­ry­man­der­ing and “Fair Maps” (empty)

Prob­lems with pro­por­tion­al­ity: Is­rael, Italy, etc. (empty)

Is­rael: Too Many Par­ties (empty)

Italy: Un­sta­ble rules (empty)

Wales: strate­gic vot­ing /​ clone par­ties (empty)

Poland: high (party) thresh­olds (empty)

God­win vi­o­la­tion: not ac­tu­ally rele­vant (empty)

Novel meth­ods (empty)

Mod­ified Bavar­ian MMP (empty)

PLACE (empty)

Broad Gove (empty)

Prospects for change in US (empty)

Med­ford, MA (empty)

How you can help (empty)

Canada, UK, In­dia… (empty)