Comment by ClayShentrup on Voting reform seems overrated · 2021-04-14T06:21:13.754Z · EA · GW

My view is the complete opposite: voting reform is the biggest bang-for-the-buck human welfare increaser, by a huge margin. But don't equate electoral reform with proportional representation.

Two PhDs have run computer simulations to estimate the benefit of alternative voting methods:

  1. A Princeton math PhD named Warren Smith, who advocates score voting (formerly known as range voting), produced these Bayesian regret calculations: https://www.rangevoting.org/BayRegsFig
  2. A Harvard stats PhD named Jameson Quinn used some slightly different modeling assumptions and gave the results in an inverted normalized form called voter satisfaction efficiency or VSE, where 100% VSE is the same as a Bayesian regret of zero.
    https://electionscience.github.io/vse-sim/vse.html
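
A rough sketch of how the two measures relate, assuming VSE is normalized so that picking a winner at random scores 0% and always picking the utility-maximizing candidate scores 100%. The symbols here are introduced only for illustration: U_best, U_winner and U_random are the total utilities of the ideal winner, the method's winner, and a randomly chosen candidate.

$$
\mathrm{BR} = \mathbb{E}[U_{\text{best}}] - \mathbb{E}[U_{\text{winner}}],
\qquad
\mathrm{VSE} = \frac{\mathbb{E}[U_{\text{winner}}] - \mathbb{E}[U_{\text{random}}]}{\mathbb{E}[U_{\text{best}}] - \mathbb{E}[U_{\text{random}}]}
= 1 - \frac{\mathrm{BR}}{\mathrm{BR}_{\text{random}}},
\qquad
\mathrm{BR}_{\text{random}} = \mathbb{E}[U_{\text{best}}] - \mathbb{E}[U_{\text{random}}]
$$

Under that assumption, a Bayesian regret of zero gives a VSE of exactly 100%, and a method no better than random selection gives a VSE of 0%.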

Warren Smith found that an increase from plurality voting to score voting increased human welfare about as much as democracy vs. non-democratic random selection. He summarizes thusly:

> there are very few causes out there with this much "bang for the buck." Examine the numbers yourself. I do not believe religious causes can compete. Disaster relief cannot compete (in the long term; for large disasters in the short term, it can). Curing diseases also cannot compete except for the biggest killers. E.g, ending malaria or halving illiteracy each would cause an amount of good comparable to range voting, but would probably be more difficult to accomplish.

There is a great deal of empirical evidence that bolsters such claims. In a 2014 exit poll for the Maine gubernatorial race, switching from plurality voting to approval voting led to a complete reversal in finish order. A bombastic climate change denier won the real election due to vote splitting between the Democrat and a Democrat-ish independent.

Warren Smith is actually fairly skeptical of proportional representation, mind you.

My view is that if the impact of alternative voting methods were properly understood, it would comprise the vast majority of the EA community's efforts. You just can't get this kind of impact (per dollar) from cash transfers, malaria nets, deworming, etc.

Comment by ClayShentrup on The Center for Election Science Appeal for 2020 · 2020-12-23T18:52:37.008Z · EA · GW

Fixing the process by which we decide who runs the government and crafts the policy that ultimately addresses every other issue, from zoning to emissions to pandemic response, sure seems like an effective focus to me. With the latest win in St Louis, I think the Center for Election Science is poised to take off like a rocket.

Comment by ClayShentrup on Thoughts on electoral reform · 2020-02-26T06:28:50.698Z · EA · GW

I discussed this in my post:

> We just start with a random utility distribution, then turn that into preferences by mangling it with an "ignorance factor"

The ignorance factor represents a disparity between the actual utility impact a candidate will have on a voter, and the assumed utility impact which forms the basis for her vote. Even with lots of ignorance, there's still a significant difference in performance from one voting method to another.

In addition, I believe a lot of our ignorance comes from "tribal" thinking. We have two parties (tribes), and each party must pick one side of any issue (abortion, guns, health care, etc.). Thus voters will tend to retroactively justify their beliefs about a given issue based on how it comports with their stated party affiliation. Note that this forced binary thinking is so powerful that we even have a party divide over the objective reality of climate change!

With a system like approval voting, candidates can easily run outside of the party system and still be viable. Thus they can take any position on any issue, giving voters the freedom to move along each issue axis independently. A new offshoot of the GOP could form that is generally socially conservative and pro gun rights, but totally committed to addressing climate change. With 3-5 viable parties able to constantly adjust to changing realities, I expect voter ignorance to fall considerably, because voters could reconsider positions they once took as given, as part and parcel of their party affiliation.

Comment by ClayShentrup on Thoughts on electoral reform · 2020-02-25T20:35:33.255Z · EA · GW

@Matt_Lerner,

> A democratic system is not the same as a utility-maximizing one.

In my utilitarian view, these are one and the same. An election is effectively just "a decision made by more than one person", thus the practical measure of democratic-ness is "expected utility of a voting procedure". I would argue it could be perfectly democratic to replace elections with a random sample of voter opinions over a statistically significant subset of the eligible voting population. This would probably be more democratic than the current system, which is distorted by demographic disparities in turnout. The issue would be making the process provably random, so as to ensure legitimacy.

> The various criteria used to evaluate voting systems in social choice theory are, generally speaking, formal representations of widely-shared intuitions about how individuals' preferences should be aggregated or, more loosely, how democratic governments should function.

Yes, this is why the utilitarian camp within the electoral reform community eschews voting method criteria in favor of utility efficiency calculations, traditionally expressed as Bayesian regret, or more recently inverted into voter satisfaction efficiency. The procedure is pretty straightforward. We just start with a random utility distribution, then turn that into preferences by mangling it with an "ignorance factor", then turn that into a cast ballot by normalizing it and adding strategy. Then we compute the winner and measure the utility lost by not electing the social utility maximizer.
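
The procedure above can be sketched in a few lines of Python. This is only a toy illustration, not Smith's or Quinn's actual simulation code: the function names, the Gaussian utility model, the form of the ignorance noise, and the simple honest ballot rules are all assumptions made here, and the strategy step is left out entirely.

```python
import random

def plurality(seen_u):
    """Each voter votes for the candidate with the highest perceived utility."""
    tallies = [0] * len(seen_u[0])
    for row in seen_u:
        tallies[row.index(max(row))] += 1
    return tallies.index(max(tallies))

def approval(seen_u):
    """Assumed ballot rule: approve every candidate above the voter's own mean."""
    n = len(seen_u[0])
    tallies = [0] * n
    for row in seen_u:
        mean = sum(row) / n
        for c, u in enumerate(row):
            if u > mean:
                tallies[c] += 1
    return tallies.index(max(tallies))

def score(seen_u):
    """Normalized score ballot: each voter's worst maps to 0, best to 1."""
    tallies = [0.0] * len(seen_u[0])
    for row in seen_u:
        lo, hi = min(row), max(row)
        span = (hi - lo) or 1.0  # guard against a voter who is indifferent
        for c, u in enumerate(row):
            tallies[c] += (u - lo) / span
    return tallies.index(max(tallies))

def simulate_regret(method, n_voters=100, n_cands=4, ignorance=0.5,
                    trials=200, seed=0):
    """Average utility lost by not electing the social utility maximizer."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # 1. Random true utilities, one per (voter, candidate) pair.
        true_u = [[rng.gauss(0, 1) for _ in range(n_cands)]
                  for _ in range(n_voters)]
        # 2. Perceived utilities: true utilities blurred by the ignorance factor.
        seen_u = [[u + ignorance * rng.gauss(0, 1) for u in row]
                  for row in true_u]
        # 3. Cast ballots from perceived utilities and compute the winner.
        winner = method(seen_u)
        # 4. Regret is measured against TRUE utilities, not perceived ones.
        social = [sum(row[c] for row in true_u) for c in range(n_cands)]
        total += max(social) - social[winner]
    return total / trials

if __name__ == "__main__":
    for m in (plurality, approval, score):
        print(m.__name__, round(simulate_regret(m), 3))
```

Regret is never negative by construction, since the winner is compared against the best available candidate. The real simulations also vary the number of candidates, the utility model, and the share of strategic voters.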

This allows us to properly assess the combined effect of all criteria at once, even ones we never thought to consider, each weighted by its utility cost times its frequency. There are of course externalities, like complexity and the cost of voting machine upgrades, but luckily the better-performing methods like approval voting also tend to be simpler than ranked voting methods.

> So an individual gains utility from a voting system if and only if the utility gained by its superior representation of their preferences exceeds the utility lost in other areas lost by switching.

I don't see how there is any appreciable utility lost by adopting approval voting. There might be a tiny amount lost from the physical cost of things like new voting machines if we upgrade to a more complex ranked system, but even then I believe the utility gain exceeds that by an order of magnitude.

> In the simplest terms possible: we know that some voting systems are better than others when it comes to meeting our intuitive conception of democratic government. But we're concerned about people's welfare beyond just having people's electoral preferences represented, and we don't know what the relationship between these things is.

I have argued above that we do know. We have voter satisfaction efficiency and Bayesian regret. That is indeed the utilitarian lens through which many of the foundational members of the approval voting community see the world, and the basis of much of their support.

> It is totally possible that voting systems that violate the Condorcet criterion also dominate systems that meet the criterion with respect to social welfare. We simply don't know.

This is in fact true! Score voting violates the Condorcet criterion, and also outperforms Condorcet methods in utility efficiency calculations.

Comment by ClayShentrup on Thoughts on electoral reform · 2020-02-21T08:23:03.656Z · EA · GW

A variety of utility distribution models were tried, and it turned out not to matter very much.

The simulations by N. Tideman had methodological flaws and didn't measure the right thing, thus being approximately as useful as a random coin flip in this mathematician's view.

Comment by ClayShentrup on Thoughts on electoral reform · 2020-02-21T07:35:41.230Z · EA · GW

> All things considered, I think electoral reform, while probably not a “top tier” intervention, should be part of the longtermist EA portfolio.

My view is that electoral reform, specifically alternative voting methods, is by far the biggest “bang for the buck” utility-increasing reform for humanity. I began studying alternative voting methods in 2006, largely focused on the findings of a Princeton math PhD named Warren Smith, who created RangeVoting.org to publicize his research. What was very surprising and counterintuitive were the Bayesian regret calculations he performed via computer simulation, showing that an upgrade from plurality voting to score voting (aka range voting) nearly doubled the human-welfare increasing impact of democracy. Approval voting is just score voting on a 0-1 scale and it performs almost as well as score voting with a range like 0-5; and of course it has the advantage of using ordinary ballots and not requiring voting machine upgrades. As Smith puts it:

> there are very few causes out there with this much "bang for the buck." Examine the numbers yourself. I do not believe religious causes can compete. Disaster relief cannot compete (in the long term; for large disasters in the short term, it can). Curing diseases also cannot compete except for the biggest killers. E.g, ending malaria or halving illiteracy each would cause an amount of good comparable to range voting, but would probably be more difficult to accomplish.

More recently, a Harvard stats PhD named Jameson Quinn performed his own computer simulations using slightly different modeling assumptions, and presented the results inverted and normalized as voter satisfaction efficiency (VSE) rather than Bayesian regret. However, he found results similar to Smith's.

That last point about the cost (the second component in "bang for the buck") is crucial. Aside from the one-time political campaigning cost, voting reform is essentially free, unlike, say, mosquito nets or medicines, which have to be manufactured and distributed.

Full disclosure: Smith, Quinn, and I are all former board members of the Center for Election Science, but I am no longer associated with it and am speaking only for myself.

> Approval voting is vulnerable to tactical voting.

All deterministic voting methods are vulnerable to tactical voting. But computer simulation and game-theoretic analysis show that approval voting is especially resistant to it. For instance, as Warren Smith and Jameson Quinn showed in their computer simulations, approval voting worked very well across wide variations in assumptions about voter behavior and the prevalence of tactical voting. The difference was so large that in some of Smith's models, approval voting performed better with 100% strategic voters than instant runoff voting (aka "IRV", the most prominent ranked method) did with 100% honest voters. There is even a mathematical theorem that, given plausible models of voter strategy, approval voting always elects a Condorcet winner (a candidate who beats every rival by a head-to-head majority) whenever one exists. This is a very mild, some would say beneficial, reaction to strategy.
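
To make the term "Condorcet winner" concrete, here is a minimal sketch that checks a set of ranked ballots for one. The function name and the ballot encoding are illustrative assumptions, the example numbers are hypothetical, and every ballot is assumed to rank every candidate.

```python
from itertools import combinations

def condorcet_winner(profile):
    """profile maps a ranking string such as 'LCR' (best first) to a vote count.

    Returns the candidate who beats every rival head-to-head, or None.
    """
    candidates = sorted(set("".join(profile)))
    total = sum(profile.values())
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        # Count voters who rank a above b; the rest rank b above a.
        a_over_b = sum(w for r, w in profile.items() if r.index(a) < r.index(b))
        b_over_a = total - a_over_b
        if a_over_b > b_over_a:
            wins[a] += 1
        elif b_over_a > a_over_b:
            wins[b] += 1
    for c in candidates:
        if wins[c] == len(candidates) - 1:
            return c
    return None

# Hypothetical electorate: a centrist C squeezed between L and R.
print(condorcet_winner({"LCR": 40, "RCL": 35, "CLR": 25}))  # C
```

In this made-up profile C beats L 60-40 and beats R 65-35, so C is the Condorcet winner despite having the fewest first-place votes.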

It's also notable that approval voting was portrayed as generally preferable to ranked methods in the William Poundstone book Gaming the Vote, and is specifically advocated by Steve Brams, an NYU professor of political science and game theory, who wrote such page turners as Mathematics and Democracy.

> It fails the later-no-harm criterion: approving a second candidate can hurt your favourite.

At the outset, I want to suggest eschewing properties as a means for evaluating voting methods, and instead focus on voter satisfaction efficiency as described above. Focusing on properties is kind of analogous to evaluating race cars based on characteristics such as horsepower and aerodynamics. You might intuitively think a car with 10% more horsepower will be better, but once you perform a statistically significant number of timed trials, you may find instead that other factors such as aerodynamics or tire quality (or even properties you never thought to consider) have a countervailing effect that causes the more powerful car to surprisingly lose. Voter satisfaction efficiency (or Bayesian regret) simultaneously measures the combined effects of all those properties, even ones you forgot to consider. And indeed, those figures I cited already account for later-no-harm (LNH); and yet of the five alternative voting methods compared, IRV did the worst, even though it was the only one of those five methods to satisfy LNH.

Having said that, let's consider this failure of the later-no-harm criterion (LNH) more specifically. I actually contend that failing LNH is a benefit, not a flaw. Here's a thorough analysis by Warren Smith (again, full warning, Smith's writing is interspersed with venom for his enemies, but the content itself is high quality).

To summarize, later-no-harm means that once you've shown support for your favorite, X, it cannot hurt X to indicate support for a less-liked candidate, Y. E.g. suppose you rank the Green in first place. Now it cannot hurt the Green to rank the Democrat (or Labour) in second place. IRV is the only commonly discussed method that satisfies LNH. By contrast, if you cast a vote for the Green with approval voting, casting a second vote for the Democrat (or Labour) could cause the Green (your actual favorite) to lose. IRV proponents often argue that this creates an incentive for "bullet voting" only for one's favorite candidate. I admit this does seem problematic on its surface, but the problem goes away once you inspect a bit deeper.

See, all that assumed that you started by ranking your favorite candidate honestly in first place. E.g. you ranked the Green #1, and now you're considering whether to rank your honest second favorite as #2. But your best strategy with IRV is, similar to FPTP, to rank your favorite frontrunner in first place, not necessarily your favorite overall. We see this often in USA primary elections, where people are afraid to vote for their favorite candidate because they fear she'll lose the general election. For instance, right now my aunt in Iowa favors Democrat Elizabeth Warren over Joe Biden, but voted for Biden because she feels he's more likely to win in the next round against Trump. To convert this to its IRV analog, with all three of those candidates running on a single ranked ballot together, she would insincerely/strategically rank Biden in first place, to help ensure Warren (her actual favorite) is eliminated, thus giving the presumably more competitive Biden the chance to square off against Donald Trump in the next round. IRV fails the favorite betrayal criterion.

And contrary to those concerned about bullet voting with approval voting, the best strategy is actually to approve everyone you prefer to the expected utility of the winner. Bullet voting is absolutely not the best general strategy. Case in point, imagine a Green Party supporter who normally casts a strategic vote for the Democrat. With approval voting, she still casts that strategic vote for the Democrat, but then she also casts a sincere vote for her true favorite, the Green, plus anyone else she prefers to the Democrat. This is the underlying game theory that makes approval voting so robust against strategy. There's even empirical data showing that approval voting often has less bullet voting than comparable elections with IRV.
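
The threshold strategy described above can be sketched as follows. This is a minimal illustration, not a model of real voter behavior: the function name is invented here, and the utilities and win probabilities are hypothetical numbers chosen only to mirror the Green/Democrat example.

```python
def approval_ballot(utilities, win_probs):
    """Approve every candidate the voter prefers to the expected winner.

    utilities: candidate -> this voter's utility for that candidate
    win_probs: candidate -> the voter's estimate of that candidate's win chance
    """
    expected = sum(utilities[c] * win_probs[c] for c in utilities)
    return {c for c, u in utilities.items() if u > expected}

# Hypothetical Green supporter in a race with two frontrunners.
utilities = {"Green": 1.0, "Democrat": 0.6, "Republican": 0.0}
win_probs = {"Green": 0.02, "Democrat": 0.49, "Republican": 0.49}
print(sorted(approval_ballot(utilities, win_probs)))  # ['Democrat', 'Green']
```

With these assumed numbers the expected utility of the winner is 0.314, so the voter approves both the Democrat (the strategic vote) and the Green (the sincere favorite), rather than bullet voting.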

Moving beyond strategy, to the fundamental social choice theory behind LNH, the problem with LNH is that it forces a voting method to ignore important preference data. E.g.

Imagine preferences like these:

35% LRC
34% RCL
16% CLR
15% CRL

C is eliminated first with 31%, and L wins with 51%.

But suppose just 2%, from the CLR faction, swap their LR preference, creating:

35% LRC
34% RCL
14% CLR
17% CRL

C is still eliminated, but now R wins with 51%.

A tiny change in preferences changes the winner from L to R.

But now suppose a comparatively massive change in preferences occurs. The LRC voters find out something terrible about R, causing them to lower R to 3rd place. And the RCL voters find out something super positive about L, causing them to elevate L to 2nd place.

35% LCR
34% RLC
14% CLR
17% CRL

An enormous shift in public opinion just took place, positive for L and harmful for R. But this huge change didn't change the winner from R to L, even though a tiny change of preferences moved the winner from L to R in the first place. LNH causes this fundamental distortion in sensitivity to preference information.
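
The three scenarios above can be checked mechanically. Below is a minimal sketch of IRV tabulation in Python; the function name and the dictionary encoding of the ballots are illustrative choices, and ties are not handled.

```python
def irv_winner(profile):
    """profile maps a ranking string such as 'LRC' (best first) to a percentage.

    Returns (winner, winner's share in the deciding round).
    """
    remaining = set("".join(profile))
    while True:
        tallies = {c: 0 for c in remaining}
        for ranking, weight in profile.items():
            # Each ballot counts for its highest-ranked surviving candidate.
            for c in ranking:
                if c in remaining:
                    tallies[c] += weight
                    break
        leader = max(tallies, key=tallies.get)
        if 2 * tallies[leader] > sum(tallies.values()):
            return leader, tallies[leader]
        # No majority yet: eliminate the candidate with the fewest votes.
        remaining.discard(min(tallies, key=tallies.get))

original  = {"LRC": 35, "RCL": 34, "CLR": 16, "CRL": 15}
small     = {"LRC": 35, "RCL": 34, "CLR": 14, "CRL": 17}
big_shift = {"LCR": 35, "RLC": 34, "CLR": 14, "CRL": 17}

print(irv_winner(original))   # ('L', 51)
print(irv_winner(small))      # ('R', 51)
print(irv_winner(big_shift))  # ('R', 51)
```

In all three cases C is eliminated first with 31%, so the rearranged lower preferences on the L-first and R-first ballots in the third scenario are never counted at all.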

> The winner, then, may not be the candidate with the most support, but the one that’s best at manipulating the system.

Absolutely true. But also consider that:

A. This happens with IRV too, and in some ways more so. Since you still have to worry about electability with IRV, a candidate like Biden can convince Warren's supporters to drop her for him just by running ads that make voters fear her to be unelectable. That cannot possibly happen with approval voting. The best they could do is convince Warren's supporters to also vote for Biden.

B. Approval voting behaves so much better in the general case (again, see the voter satisfaction efficiency results above) that it gives substantial margin for error, such that even if this effect you describe happens, approval voting plausibly still comes out ahead.

Admittedly, we have to see approval voting in the real world to get a better handle on such campaign dynamics that are beyond the capacity for a computer to simulate.

> Approval voting radically re-interprets the common-sense notion of "having a majority"...For instance, approval voting sometimes selects a candidate even though a majority of voters would, in a head-to-head contest, prefer any other candidate. (This is the Condorcet loser criterion.)

Again, approval voting tends to elect Condorcet winners whenever they exist, in practice. And most ranked methods, including most Condorcet methods, are extremely vulnerable to tactics, meaning they may in practice be worse at electing Condorcet winners. Indeed, IRV can elect candidate X even though candidate Y is preferred to X by a huge majority and also has twice as many first place votes as X.

And to go more esoteric and technical, it is mathematically proven that a group may prefer candidate X even though a majority of voters in that group prefer candidate Y. See here and here. So it is not inherently wrong to avoid electing the Condorcet winner, or even the majority winner for that matter. Here are even some examples where it seems bad to elect the Condorcet winner. I contend the goal of social choice is to maximize expected utility, i.e. maximize the net utility of the group, not to elect "majority winners" per se. We can talk about the biological origins of that grey decision-making machine between our ears, and how that makes us effectively utility maximization machines, but I suspect the EA community is already fairly on board with this.

> Indicating support or opposition for each candidate is more expressive than just having a single vote, but it is still binary and does not allow voters to express more nuanced preferences between different candidates.

Yes, this is the expressiveness issue. Ranked ballots do provide more expressiveness, but this is counteracted by two other factors: 1. their increased vulnerability to strategy, and 2. their generally decreased tabulation efficiency (particularly in the case of IRV). This is why approval voting generally outperforms the ranked methods in computer models. Even in the circumstances where some ranked methods perform better, it's only a marginal amount that I do not think justifies the complexity. Especially given ranked voting methods have been repealed in something like 60 U.S. cities, apparently largely on account of their complexity.

> There is almost no track record of approval voting being successfully used in competitive elections. Where it was used, approval voting was often repealed later on - e.g. in Dartmouth alumni elections and in internal IEEE (Institute of Electrical and Electronics Engineers) elections (search for IEEE here).

But all indications are that it worked well in those situations, and it was repealed for "political" reasons, plausibly because it worked. Granted, that's little consolation if it manages to be repealed in American political contests. But there are good reasons to believe that's less likely to happen in government elections. E.g. it's harder to imagine a city like Fargo, where approval voting was adopted by a 64% majority, voting to repeal it by a majority via another ballot initiative. And given ranked voting has already been repealed so many times in the U.S., it's not clear that approval voting is riskier in this regard. We'll just have to find out empirically by seeing how it plays out in Fargo this June, and hopefully in other cities in the coming years.

My view is that the urgency of voting reform demands small experiments limited in their scope. If it turns out approval voting actually is better, and its simplicity allows it to scale faster, then I feel it will have been worth it. I think the urgency of issues like climate change and authoritarianism demands a massive scaling out of "electoral technology" that can increasingly democratize the USA in something like a decade. If approval voting fizzles out and/or IRV or other methods turn out to scale faster, it wasn't that expensive of an experiment.

> the tendency to favour moderate candidates could also be considered a bias and is not universally viewed as a positive feature of a voting system.

Again, the key here is to look at it through the lens of utility efficiency, which already measures "bias" and any other distortionary factors in terms of human welfare, which is the ultimate metric we want to look at.

> It seems to me that effective altruism has not examined approval voting (or alternatives) in sufficient detail.

Speaking as someone who has spent countless hours studying this subject since 2006, whereupon I soon did my first exit poll, I think it has been studied voluminously. Organizational elections, exit polling, computer simulation, and all manner of years-long debates between math PhDs and various engineers have taken place. An NYU professor of game theory and political science has worked on it for four decades and written multiple books on the subject. I've personally visited Kenneth Arrow, who won the Nobel Prize in economics for his study of voting theory going back to the 1950s. We'll still find out new things from what we see in municipal elections like those in Fargo, but I do believe the general arguments you raise have been analyzed inside and out.

> In general, my impression is that discussions of voting reform suffer from the problem that people tend to pick their favourite method and then cherry-pick one-sided arguments in favour of it.

Perhaps, but what's relevant is the veracity of the arguments, not the personality traits of people who favor one system or another.

> The Center for Election Science often talks about no favourite betrayal (which approval voting satisfies) and not much about later-no-harm (which it fails). FairVote doesn't talk much about no favourite betrayal and talks a lot about later-no-harm - because their favoured method (instant runoff voting) satisfies later-no-harm but fails no favourite betrayal.

I would put this the other way around. Favorite betrayal matters for important game theoretical reasons that I described in detail above, whereas LNH is effectively an "anti-criterion". And from those observations, I form a relative assessment of approval voting and IRV. I do not start with the system and then try to make the facts fit it. Indeed, Warren Smith initially ran his computer simulations with no particular foreknowledge of how it would turn out. He thought the winning system would vary depending on the assumptions (the "knob settings" of the program), but it turned out that score voting won in all 720 different permutations of those settings. And that led him to become a supporter of score voting. When I came across his work, I was initially skeptical and I sent him an email berating him. Then over the course of some email correspondence, he refuted my arguments and made me a convert.

A few years back, an associate of mine added a top two runoff to score voting and "STAR voting" was born. At first I was skeptical that it was a needless additional complexity. But then I saw that Jameson Quinn's voter satisfaction efficiency calculations were quite favorable to it, so I've shifted my thinking substantially.

> Since there is (some degree of) consensus that plurality voting is bad, but no consensus on which alternative is best, we should focus on the reform proposals that are most viable. That’s arguably instant runoff voting (IRV, called ranked choice voting / RCV in the US), which is championed by FairVote.

I don't know if there is "consensus", but I do feel the facts objectively show approval voting to be superior to IRV (if not every ranked method) according to essentially every metric we have, from voter satisfaction to ballot spoilage rates to voting machine complexity. I also dispute that IRV is more viable. Approval voting was adopted by a 64% majority in Fargo, and is polling at 72% support in St Louis. I suspect we are about to see that approval voting is more politically viable, due to its simplicity and transparency.

> Unlike approval voting, IRV has a track record in competitive elections and is much more in line with conventional notions of “majority”.

But again, it is mathematically proven that majoritarianism is not the right metric. Maybe you're talking more about public perception, but in that case I would again cite the 2-1 margin by which approval voting managed to pass in Fargo, and the 72% support it's getting in St Louis. If this majority failure is a risk in terms of political viability, we're not seeing it.

> My personal favourite voting system would be a Condorcet method such as Ranked Pairs, but there are no large organisations advocating this, and it’s unlikely that Condorcet methods will be adopted.

I know I'm repeating myself, but approval voting may in practice be a better Condorcet method than real Condorcet methods. And it performs better under models of significant voter strategy. And of course it is much simpler and thus politically and logistically easier to implement and scale.

> IRV seems clearly superior to plurality voting and has stood the test of time

It was used in nearly two dozen U.S. cities and repealed in all but one of them, then adopted in several cities again, and then repealed in four of them. So it's not clear what its long term staying power is yet.

> For parliamentary, I think it’s best to use a form of proportional representation rather than (or in addition to) first-past-the-post in single-seat constituencies.

Whether PR systems can outperform the best single-winner systems, such as STAR voting and approval voting, is very much an open question and highly speculative. But it's also very difficult to adopt in the USA without first getting a system like score voting or approval voting which is capable of ending two-party domination. That's because federal law makes multi-winner districts illegal, and the two-party system will make that impossible to change.

> Proportional representation tends to lead to multi-party systems that require cross-party collaboration and reduce the team sport mentality that drives US polarisation.

One could expect that a congress full of moderate/centrist candidates would be even less polarized than one that includes every faction from socialists to fascists.

> A steelman of plurality voting is that it grants power to the largest coherent political coalition (coherent in the sense of being able to coordinate on a single candidate).

Computer simulation shows that it is extremely bad. And as a 41-year-old American, I've seen first-hand how much its two-party lock-in fosters binary tribal thinking that makes it impossible to view issues like impeachment or climate change through an objective non-partisan lens. And combined with Gerrymandering, it virtually nullifies democracy. As FairVote says, elections become so predictable that, "Under our current system, we can predict 379 of the 435 House seats — or more than 87 percent of the total — with high confidence."

Comment by ClayShentrup on Thoughts on electoral reform · 2020-02-19T06:40:21.652Z · EA · GW

> (From Dartmouth math professor Robert Z. Norman) In 2007 there was a per voter average of voting for 1.81 [of the four] candidates. Hence the proportion of bullet votes had to be fairly small (or else nearly everyone voted for one or all three candidates, but not two, which would seem crazy).

> Specifically, if all ballots approved either 1 or 2 candidates, there must have been 19% approve-1 and 81% approve-2 ballots. Norman in a later email hypothesized that actually there may have been a strategy of "either voting for the petition candidate or voting for all [3 opposing] nominated candidates." If that was the only thing going on then 59.5% of the votes would have been approve-1 and the remaining 40.5% approve-3s, but in this case approval voting was clearly showing its immense value by preventing an enormous "vote-split" among the 3. In any case the fraction of "approve≥2" ballots presumably had to be somewhere between 40.5% and 81%.

https://www.rangevoting.org/DartmouthBack
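
The percentages quoted above follow from the 1.81 average with simple algebra: if a fraction x of ballots approve a candidates and the rest approve b, then x·a + (1 − x)·b = 1.81. A minimal sketch (the function name is an illustrative choice):

```python
def share_of_smaller(mean_approvals, small, large):
    """Fraction of ballots approving `small` candidates, assuming every ballot
    approves either `small` or `large` candidates and the observed average
    number of approvals per ballot is `mean_approvals`."""
    return (large - mean_approvals) / (large - small)

# Every ballot approves 1 or 2 candidates.
print(round(share_of_smaller(1.81, 1, 2), 3))  # 0.19
# Every ballot approves 1 or 3 candidates.
print(round(share_of_smaller(1.81, 1, 3), 3))  # 0.595
```

So the approve-1 share is 19% in the first case and roughly 60% in the second, leaving 81% and 40.5% respectively for ballots approving two or more candidates.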