The case for taking AI seriously as a threat to humanity

post by anonymous_ea · 2018-12-23T01:00:08.314Z · score: 18 (9 votes) · EA · GW · 19 comments

This is a link post for https://www.vox.com/future-perfect/2018/12/21/18126576/ai-artificial-intelligence-machine-learning-safety-alignment

This seems likely to be a post that's referred to for some time.


comment by Larks · 2018-12-24T01:40:43.332Z · score: 38 (14 votes) · EA(p) · GW(p)

Overall I think this is a great article. It seems like it could be one of the best pieces for introducing new people to the subject.

People sometimes try to gauge the overall views of an author by the relative amounts of page-space they dedicate to different topics, which is bad if you generally agree with something, but want to make a detailed objection to a minor point. I think Kelsey's article is good, and don't want the below to detract from this.

To try to counteract this effect, I have deliberately used the top three paragraphs to explain that this article is very good before coming to the main point of the comment.

However, I do object to this section:

When you train a computer system to predict which convicted felons will reoffend, you’re using inputs from a criminal justice system biased against black people and low-income people — and so its outputs will likely be biased against black and low-income people too.

The text links to another Vox article, which ultimately linked to this ProPublica article, which argues that a specific reoffending-prediction system was bad because:

The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants.

Separately it notes

When a full range of crimes were taken into account — including misdemeanors such as driving with an expired license — the algorithm was somewhat more accurate than a coin flip. Of those deemed likely to re-offend, 61 percent were arrested for any subsequent crimes within two years.

At this point, alarm bells should be ringing in your head. "More accurate than a coin flip" is not the correct way to analyze the accuracy of a binary test unless the actual distribution of outcomes is also 50:50! If fewer than 50% of people re-offend, then among those a coin flip classifies as high risk, fewer than 50% will actually re-offend. Using the coin flip analogy is a rhetorical sleight of hand that pushes readers into the wrong analytical framework, and makes the test look significantly worse than it actually is.
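To make the arithmetic concrete, here is a small sketch. The base rate of 45% is assumed purely for illustration (it is not the actual figure from the data ProPublica analyzed); the point is that a coin flip's "precision" converges to the base rate, so the reported 61% should be compared against that rate, not against 50%.

```python
import random

random.seed(0)

def coin_flip_precision(base_rate, n=100_000):
    """Among people a fair coin flags as 'high risk', what fraction
    actually reoffend? Random flagging converges to the base rate,
    not to 50%."""
    flagged = flagged_reoffend = 0
    for _ in range(n):
        reoffends = random.random() < base_rate
        if random.random() < 0.5:  # the coin says "high risk"
            flagged += 1
            flagged_reoffend += reoffends
    return flagged_reoffend / flagged

# Assumed base rate of 45% -- hypothetical, for illustration only.
print(f"coin-flip 'precision':     {coin_flip_precision(0.45):.2f}")  # ~0.45
print("reported algorithm figure: 0.61")
```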

Now that we've seen the ProPublica authors perhaps cannot be entirely trusted to represent the data accurately, let's go back to the headline statement: that the false positive rate is higher for blacks than for whites.

This is true, but in a trivial sense.

Blacks commit more crime than whites. This is true regardless of whether you look at arrest data, conviction data, or victimization surveys. (The result still holds even if you only ask Black victims who committed the crimes against them, and it also holds looking just at recidivism.) As a result of this difference in base rates, any unbiased algorithm will have more false positives for blacks, even if it is equally accurate for both races at any given level of risk.

Here are some simple numbers, lifted from Chris's excellent presentation on the subject, to illustrate this point:

Simplified numbers: high risk = 60% chance of recidivism, low risk = 20%.
Black people: 60% labelled high risk * 40% chance of no recidivism = 24% chance of “labelled high risk, didn’t recidivate”.
White people: 30% labelled high risk * 40% chance of no recidivism = 12% chance of “labelled high risk, didn’t recidivate”.
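The arithmetic above can be checked mechanically. A minimal sketch using only the simplified numbers from the comment:

```python
# The comment's simplified numbers: the test is equally accurate at each
# risk level for both groups, yet the group with more members labelled
# high risk mechanically accumulates more "flagged but didn't recidivate"
# cases.

P_NO_RECIDIVISM_GIVEN_HIGH_RISK = 1 - 0.60  # high risk = 60% chance of reoffending

def flagged_but_clean(share_labelled_high_risk):
    """P(labelled high risk AND did not recidivate) for a group."""
    return share_labelled_high_risk * P_NO_RECIDIVISM_GIVEN_HIGH_RISK

black = flagged_but_clean(0.60)  # 60% labelled high risk
white = flagged_but_clean(0.30)  # 30% labelled high risk

print(f"Black: {black:.0%} labelled high risk, didn't recidivate")  # 24%
print(f"White: {white:.0%} labelled high risk, didn't recidivate")  # 12%
```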

It is a trivial statistical fact that any decent statistical test will have a higher false positive rate for subgroups with higher incidence. To avoid this, you'd have to adopt a test which included a specific "if white, increase risk" factor, and you would end up releasing more people who would reoffend, and keeping in jail people who would not. None of these seem like acceptable consequences.

Strangely however, neither the Vox article that this one linked to, nor the original ProPublica piece, mentioned this fact - I suspect due to the same political bias [EA · GW] kbog discussed recently. There are good reasons to be concerned about the application of algorithms in areas like these. But damning the algorithms as racist for statistically misleading reasons, without explaining to readers the underlying reasons for these statistics, suggests that the authors have either failed to understand the data, or are actively trying to mislead their readers. I would recommend against linking to either article in future as evidence for the claims.

EDIT: Washington Post had a very good article explaining this also.

comment by Dallas_Card · 2018-12-29T19:28:32.174Z · score: 5 (5 votes) · EA(p) · GW(p)

Thanks for raising this issue! I agree with some of your points, but I also think things are somewhat more complicated than you suggest.

The comments here suggest that most people have the intuition that a system should not treat individuals differently based only on the colour of their skin. That is a completely reasonable intuition, especially as doing so would be illegal in most contexts. Unfortunately, such a policy can still lead to undesired outcomes. In particular, a system which does not use race can nevertheless end up having differential impacts at the group level, which may also be illegal. (see here; "Disparate impact refers to policies or practices that are facially neutral but have a disproportionately adverse impact on protected classes.")

The first important thing to note here is that, except for one specific purpose (see below), it typically makes little difference whether a protected attribute such as race is explicitly included in a model such as this or not. (In the ProPublica example, it is not.) There are enough other things that correlate with race that you could likely predict it from the rest of the data (probably better than you can predict recidivism in this case).

As a trivial example, one could imagine choosing to release people based only on postal code (zipcode). Such a model does not use race directly, but the input is obviously strongly correlated with race, and could end up having a similar effect as a model which makes predictions using only race. This example is silly because it is so obviously problematic. Unfortunately, with more complex inputs and models, it is not always obvious what combination of inputs might end up being a proxy for something else.

Part of the problem here is that whether or not someone will be re-arrested is extremely difficult to predict (as we would expect, since it may depend on many other things that will happen in the future). The OP wrote that "any unbiased algorithm will have more false positives for blacks, even if it is equally accurate for both races at any given level of risk", but strictly speaking that is not correct. First, I would say that the use of the term "unbiased" here is somewhat tendentious. Bias is a term that is somewhat overloaded in this space, and I expect you mean something more precise. More to the point, however, if we could predict who will be re-arrested with 100% accuracy, then there would be no difference between groups in rates of false positives or accuracy. That being said, it is true that if we cannot achieve 100% accuracy, and there is a difference in base rates, there is an inherent trade-off between having equal levels of accuracy and equal rates of false negatives / false positives (which is sometimes referred to as equalized odds).

For example, in the extreme, a system which does not use race, and which supposedly "treats all individuals the same", could end up leading to a decision to always release members of one group, and never release members of another group. Although such a system might be equally accurate for both groups, the mistakes for one group would only be false positives, and mistakes for the other group would only be false negatives, which seems both morally problematic and possibly illegal.

The above example is a caricature, but the example reported on by ProPublica is a less extreme version of it. We can debate the merits of that particular example, but the larger problem is that we can't guarantee we'll obtain an acceptable outcome with a rule like "never let the model use race directly."

Unfortunately, the only way to explicitly balance false negatives and false positives across groups is to include the protected attribute (which is the exception hinted at above). If this information is explicitly included, it is possible to construct a model that will have equal rates of false negatives and false positives across groups. The effect of this, however (ignoring the 100% accuracy case), will be to use different thresholds for different groups, which can be seen as a form of affirmative action. (As an aside, many people in the US seem strongly opposed to affirmative action on principle, but it seems in general like it might plausibly be given consideration as a possibility, if it can be shown to lead to better outcomes.)
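A small numerical sketch of this trade-off, with all numbers invented purely for illustration: an equally calibrated score with one shared cutoff yields unequal false positive rates when base rates differ, and narrowing that gap requires group-specific cutoffs, i.e. explicit use of the group attribute.

```python
# Scores s = 1..10 with P(reoffend | s) = s/10, identical for both groups
# (equally calibrated). The groups differ only in their score mix.
group_a = {s: 0.10 for s in range(1, 11)}                      # uniform scores
group_b = {s: 0.19 if s <= 5 else 0.01 for s in range(1, 11)}  # mostly low scores

def false_positive_rate(score_dist, cutoff):
    """P(flagged | no reoffense): flagged non-reoffenders over all non-reoffenders."""
    clean = sum(p * (1 - s / 10) for s, p in score_dist.items())
    flagged_clean = sum(p * (1 - s / 10) for s, p in score_dist.items() if s >= cutoff)
    return flagged_clean / clean

# One shared cutoff: the higher-base-rate group gets a much larger FPR.
print(f"{false_positive_rate(group_a, 6):.3f}")  # ~0.222
print(f"{false_positive_rate(group_b, 6):.3f}")  # ~0.015

# Group-specific cutoffs narrow the gap -- but the decision rule now
# explicitly depends on group membership.
print(f"{false_positive_rate(group_a, 7):.3f}")  # ~0.133
print(f"{false_positive_rate(group_b, 5):.3f}")  # ~0.156
```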

Finally, the details of this example are interesting, but I would say the broader point is also important, namely, that models do in fact pick up on existing biases, and can even exacerbate them. Although there clearly are differences in base rates as highlighted by the OP, there are also clear examples of widespread discrimination in how policing has historically been conducted. Models trained on this data will capture and reproduce these biases. Moreover, given the lack of public understanding in this space, it seems likely that such systems might be deployed and perceived as "objective" or unproblematic, without a full understanding of the trade-offs involved.

While ProPublica's headline definitely suppresses some of the complexity involved, the article was highly effective in initiating this debate, and is one of the best known examples of how these problems manifest themselves in practice. I would say that people should know about it, but also be provided with pointers to the subsequent discussion.

There are a number of other issues here, including the difficulty of learning from censored data (wherein we don't know what the outcome would have been for those who were not released), as well as the potential for a single model to prevent us from learning as much as we might from behaviour of many separate judges, but that goes well beyond the scope of the original post.

For more on this, there is a great interactive visualization of the problem here and some more discussion of the underlying technical issues here.

comment by Patrick · 2019-12-22T21:55:34.391Z · score: 1 (1 votes) · EA(p) · GW(p)

If you haven't read the article (as I hadn't, since I came by a direct link to this comment), you should know that there's exactly one sentence about algorithmic racial discrimination in the entire article. I was surprised that a single sentence (and one rather tangential to the article) generated this much discussion.

Whatever you think about the claim, it doesn't seem like a sufficient reason not to recommend the article as an introduction to the subject.

comment by Michael_S · 2018-12-24T14:44:03.095Z · score: 0 (10 votes) · EA(p) · GW(p)

In general, I'm glad that it was included because it adds legitimacy to the overall argument with Vox's center-left audience.

comment by Habryka · 2018-12-24T15:00:29.990Z · score: 19 (9 votes) · EA(p) · GW(p)

I strongly prefer building legitimacy with true arguments. (I also expect that trying to be rigorous and only saying true things will build better long-term legitimacy, though I think I would advocate for being truthful even without that.)

comment by KelseyPiper · 2018-12-24T19:49:40.742Z · score: 12 (9 votes) · EA(p) · GW(p)

I agree with Habryka here - it seems potentially very damaging to EA for arguments to be advanced with obvious holes in them, especially if the motivation for that seems to be political. In that spirit I want to find a better source to cite for the point I'm trying to make here. I think EA is really hard. I think we'll consistently get things wrong if we relax our standards for accuracy at all.

I do think criminal justice predictive algorithms are a decent example of ML interpretability concerns and 'what we said isn't what we meant' concerns. I think most people do not actually want a system which treats two identical people differently because one is black and one is white; human values include 'reduce recidivism' but also 'do not evaluate people on the basis of skin color'. But because of the statistical problem, it's actually really hard to prevent a system from either using race or guessing race from proxies and using its best guess of race. That's illegal under current U.S. antidiscrimination law, and I do think it's not really what we want - that is, I think we're willing to sacrifice some predictive power in order to not use race to decide whether people remain in prison or not, just like we're willing to sacrifice predictive power to get people lawyers and willing to sacrifice predictive power to require cops to have a warrant and willing to sacrifice predictive power to protect the right not to incriminate yourself. But none of that nebulous stuff makes it into the classifier, and so the classifier is genuinely exhibiting unintended behavior - and unintended behavior we struggle to make it stop exhibiting, since it'll keep trying to find proxies for race and using them for prediction.

I'm curious if Larks/others think that this summary is decent and would avoid misleading someone who didn't know the stats background; if so, I'll try to write it up somewhere in more depth (or find it written up in more depth) so I can link that instead of the existing links.

comment by Habryka · 2018-12-25T15:52:49.744Z · score: 7 (4 votes) · EA(p) · GW(p)

I think I might disagree with the overall point? I don't have super strong moral intuitions here (maybe because I am from Germany, which generally has a lot less culture around skin-color based discrimination because it's a more ethnically uniform country?).

None of the examples you listed as being analogous to skin-color discrimination strike me as fundamentally moral. Let's walk through them one by one:

just like we're willing to sacrifice predictive power to get people lawyers

I would guess that we give people lawyers to ensure the long-term functionality of the legal system, which will increase long-run accuracy. Lawyers help determine the truth by ensuring that the state actually needs to do its job in order to convict someone. They strike me as essential instruments that increase accuracy, not decrease accuracy.

and willing to sacrifice predictive power to require cops to have a warrant

Again, we require cops to have a warrant in order to limit the direct costs of house searches. Without the need for warrants, there would be a lot more searches, which would come at a pretty significant cost to the people being searched. The warrants just ensure that there is significant cause for a search, in order to limit the amount of collateral damage the police cause. I don't see how this is analogous to the skin-color situation.

and willing to sacrifice predictive power to protect the right not to incriminate yourself

I think the reason why we have this rule is mostly because the alternative isn't really functional. Forcing people to incriminate themselves will probably lead to much less cooperation with the legal system, and incur large costs on anyone involved with it. I don't personally see any exceptional benefits that come from having this rule, outside of its instrumental effects on the direct costs and long-run accuracy of the legal system.

This doesn't mean that I don't think there are good arguments for potentially limiting what information we want to use about a person, but I don't think the examples you used are illustrative of the situation with the skin-color discrimination.

I am currently 80% on "there is no big problem with using skin color as a discriminator in machine learning in criminal justice, in the same way we would obviously use height or intelligence or any other mostly inherent attribute". Given that, I think it would be a lot better to replace that whole section with something that actually has solid moral foundations for it (of which I think there are many).

comment by KelseyPiper · 2018-12-25T18:21:03.216Z · score: 9 (6 votes) · EA(p) · GW(p)

Huh, yeah, I disagree. It seems to me pretty fundamental to a justice system's credibility that it not imprison one person and free another when the only difference between them is the color of their skin (or, yes, their height), and it makes a lot of sense to me that U.S. law mandates sacrificing predictive power in order to maintain this feature of the system.

Similarly, I don't think all of the restrictions the legal system imposes on what kinds of evidence to use are, in fact, motivated by long-term harm-reduction considerations. I think they're motivated by wanting the system to embody the ideal of justice. EAs are mostly consequentialists (I am) and mostly only interested in harm, not in fairness, but I think it's important to realize that the overwhelming majority of people care a lot about whether a justice system is fair in addition to whether it is harm-reducing, and that this is the actual motivation for the laws I discuss above, even if you can technically propose a defense of them in harm-reduction terms.

comment by Habryka · 2018-12-25T18:41:02.361Z · score: 7 (8 votes) · EA(p) · GW(p)

but I think it's important to realize that the overwhelming majority of people care a lot about whether a justice system is fair in addition to whether it is harm-reducing, and that this is the actual motivation for the laws I discuss above, even if you can technically propose a defense of them in harm-reduction terms.

I take the same stance towards moral arguments as I do towards epistemic ones. I would be very sad if EAs make moral arguments that are based on bad moral reasoning just because they appeal to the preconceived notions of some parts of society. I think most arguments in favor of naive conceptions of fairness fall into this category, and I would strongly prefer to advocate for moral stances that we feel confident in, have checked their consistency and feel comfortable defending on our own grounds.

comment by KelseyPiper · 2018-12-26T20:38:01.004Z · score: 12 (10 votes) · EA(p) · GW(p)

Hmm. I think I'm thinking of concern for justice-system outcomes as a values difference rather than a reasoning error, and so treating it as legitimate feels appropriate in the same way it feels appropriate to say 'an AI with poorly specified goals could wirehead everyone, which is an example of optimizing for one thing we wanted at the expense of other things we wanted' even though I don't actually feel that confident that my preferences against wireheading everyone are principled and consistent.

I agree that most people's conceptions of fairness are inconsistent, but that's only because most people's values are inconsistent in general; I don't think it means they'd necessarily have my values if they thought about it more. I also think that 'the U.S. government should impose the same prison sentence for the same crime regardless of the race of the defendant' is probably correct under my value system, which probably influences me towards thinking that other people who value it would still value it if they were less confused.

Some instrumental merits of imposing the same prison sentence for the same crime regardless of the race of the defendant:

I want to gesture at something in the direction of pluralism: we agree to treat all religions the same, not because they are of equal social value or because we think they are equally correct, but because this is social technology to prevent constantly warring over whose religion is correct/of the most social value. I bet some religious beliefs predict less recidivism, but I prefer not using religion to determine sentencing because I think there are a lot of practical benefits to the pluralistic compromise the U.S. uses here. This generalizes to race.

There are ways you can greatly exacerbate an initially fairly small difference by updating on it in ways that are all technically correct. I think the classic example is a career path with lots of promotions, where one thing people are optimizing for at each level is the odds of being promoted at the next level; this will result in a very small difference in average ability producing a huge difference in odds of reaching the highest level. I think it is good for systems like the U.S. justice system to try to adopt procedures that avoid this, where this is sane and the tradeoffs relatively small.
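A toy version of that promotion example, with made-up numbers: even if each round's decision is individually well-calibrated, a small per-round gap compounds into a large gap at the top.

```python
def share_reaching_top(promotion_prob, rounds):
    """Fraction of a group that survives `rounds` successive promotion
    filters, assuming independent rounds with the same per-round
    promotion probability."""
    return promotion_prob ** rounds

group_x = share_reaching_top(0.50, 6)  # 50% promoted at each of 6 levels
group_y = share_reaching_top(0.45, 6)  # a modest 5-point gap per level

print(f"group X at the top: {group_x:.4f}")             # 0.0156
print(f"group Y at the top: {group_y:.4f}")             # 0.0083
print(f"ratio at the top:   {group_x / group_y:.2f}x")  # ~1.88x
```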

(least important): Justice systems run on social trust. If they use processes which undermine social trust, even if they do this because the public is objectively unreasonable, they will work less well; people will be less likely to report crimes, cooperate with police, testify, serve on juries, make truthful decisions on juries, etc. I know that when crimes are committed against me, I weigh whether I expect the justice system to behave according to my values when deciding whether to report the crimes. If this is common, there's reason for justice systems to use processes that people consider aligned. If we want to change what people value, we should use instruments for this other than the justice system.

comment by KelseyPiper · 2018-12-25T18:26:25.902Z · score: 5 (3 votes) · EA(p) · GW(p)

Expanding on this: I don't think 'fairness' is a fundamental part of morality. It's better for good things to happen than bad ones, regardless of how they're distributed, and it's bad to sacrifice utility for fairness.

However, I think there are some aspects of policy where fairness is instrumentally really useful, and I think the justice system is the single place where it's most useful, and the will and preferences of the American populace are demonstrably for a justice system that embodies fairness, and so it seems to me that we're missing a really important point if we decide that it's not a problem for a justice system to badly fail to embody the values it was intended to embody just because we don't non-instrumentally value fairness.

comment by Habryka · 2018-12-25T18:37:39.090Z · score: 15 (6 votes) · EA(p) · GW(p)

My perspective here is that many forms of fairness are inconsistent, and fall apart on significant moral introspection as you try to make your moral preferences consistent. I think the skin-color thing is one of them, which is really hard to maintain as something that you shouldn't pay attention to, as you realize that it can't be causally disentangled from other factors that you feel like you definitely should pay attention to (such as the person's physical strength, or their height, or the speed at which they can run).

Not paying attention to skin color has to mean that you don't pay attention to physical strength, since those are causally entangled in a way that makes it impossible to pay attention to one without the other. You won't ever have full information on someone's physical strength, so hearing about their ethnic background will always give you additional evidence. Skin color is not an isolated epiphenomenal node in the causal structure of the world, and you can't just decide to "not discriminate on it" without also ignoring every single phenomenon that is correlated with it and that you can't fully screen off, which is a really large range of things that you would definitely want to know in a criminal investigation.

comment by Kaj_Sotala · 2018-12-27T13:15:38.044Z · score: 8 (7 votes) · EA(p) · GW(p)

My perspective here is that many forms of fairness are inconsistent, and fall apart on significant moral introspection as you try to make your moral preferences consistent. I think the skin-color thing is one of them, which is really hard to maintain as something that you shouldn't pay attention to, as you realize that it can't be causally disentangled from other factors that you feel like you definitely should pay attention to (such as the person's physical strength, or their height, or the speed at which they can run).

I think that a sensible interpretation of "is the justice system (or society in general) fair" is "does the justice system (or society) reward behaviors that are good overall, and punish behaviors that are bad overall"; in other words, can you count on society to cooperate with you rather than defect on you if you cooperate with it. If you get jailed based (in part) on your skin color, then if you have the wrong skin color (which you can't affect), there's an increased probability of society defecting on you regardless of whether you cooperate or defect. This means that you have an extra incentive to defect since you might get defected on anyway. This feels like a sensible thing to try to avoid.

comment by KelseyPiper · 2018-12-26T19:52:09.694Z · score: 3 (3 votes) · EA(p) · GW(p)

This is not for criminal investigation. This is for, when a person has been convicted of a crime, estimating when to release them (by estimating how likely they are to commit another crime).

comment by Habryka · 2018-12-27T03:18:19.240Z · score: 3 (3 votes) · EA(p) · GW(p)

Will write a longer reply later, since I am about to board a plane.

I was indeed thinking of a criminal investigation context, but I think the question of how likely someone is to commit further crimes is likely to be directly related to their ability to commit further crimes, which will depend on many of the variables I mentioned above, and so the same argument holds.

I expect those variables to still be highly relevant when you want to assess the likelihood of another crime, and there are many more that are more obviously relevant and also correlated with race (such as their impulsivity, their likelihood to get addicted to drugs, etc.). Do you think we should not take into account someone's impulsivity when predicting whether they will commit more crimes?

comment by Habryka · 2018-12-25T18:34:52.737Z · score: 3 (2 votes) · EA(p) · GW(p)

Huh, yeah, I disagree. It seems to me pretty fundamental to a justice system's credibility that it not imprison one person and free another when the only difference between them is the color of their skin (or, yes, their height), and it makes a lot of sense to me that U.S. law mandates sacrificing predictive power in order to maintain this feature of the system.

If the crime was performed by someone who had to be at least 2m tall, and one of the suspects is 2.10m tall and the other is 1.60m, then it seems really obvious to me that you should use their height as evidence? I would be deeply surprised if you think otherwise.

comment by carl · 2018-12-26T06:29:27.811Z · score: -2 (6 votes) · EA(p) · GW(p)
If the crime was performed by someone who had to be at least 2m tall, and one of the suspects is 2.10m and other one is 1.60m tall, then it seems really obvious to me that you should use their height as evidence?

That's not what these articles describe: the algorithm in question wasn't being used to determine whether a suspect had committed a crime; it was being used for risk assessment, i.e. determining the probability that a person convicted of one crime will go on to commit another.

comment by Liam_Donovan · 2018-12-28T20:39:40.680Z · score: -2 (5 votes) · EA(p) · GW(p)

1. A system that will imprison a black person but not an otherwise-identical white person can be accurately described as "a racist system"

2. One example of such a system is employing a ML algorithm that uses race as a predictive factor to determine bond amounts and sentencing

3. White people will tend to be biased towards more positive evaluations of a racist system because they have not experienced racism, so their evaluations should be given lower weight

4. Non-white people tend to evaluate racist systems very negatively, even when they improve predictive accuracy

To me, the rational conclusion is to not support racist systems, such as the use of this predictive algorithm.

It seems like many EAs disagree, which is why I've tried to break down my thinking to identify specific points of disagreement. Maybe people believe that #4 is false? I'm not sure where to find hard data to prove it (custom Google survey maybe?). I'm ~90% sure it's true, and would be willing to bet money on it, but if others' credences are lower that might explain the disagreement.

Edit: Maybe an implicit difference is epistemic modesty regarding moral theories -- you could frame my argument in terms of "white people misestimating the negative utility of racial discrimination", but I think it's also possible for demographic characteristics to bias one's beliefs about morality. There's no a priori reason to expect your demographic group to have more moral insight than others; one obvious example is the correlation between gender and support for utilitarianism. I don't see any reason why men would have more moral insight, so as a man I might want to reduce my credence in utilitarianism to correct for this bias.

Similarly, I expect the disagreement between a white EA who likes race-based sentencing and a random black person who doesn't to be a combination of disagreement about facts (e.g. the level of harm caused by racism) and moral beliefs (e.g. importance of fairness). However, *both* disagreements could stem from bias on the EA's part, and so I think the EA ought not discount the random guy's point of view by assigning 0 probability to the chance that fairness is morally important.

comment by Michael_S · 2018-12-24T19:58:51.310Z · score: 9 (7 votes) · EA(p) · GW(p)

I'm not arguing for making false arguments; I'm just saying that if you have a point you can make about racial bias, you should make it, even if it's not an important point for EAs, because it is an important one for the audience.