Reducing long-term risks from malevolent actors 2020-04-29T08:55:38.809Z · score: 236 (100 votes)
Descriptive Population Ethics and Its Relevance for Cause Prioritization 2018-04-03T13:31:32.112Z · score: 43 (28 votes)


Comment by david_althaus on EA considerations regarding increasing political polarization · 2020-06-21T11:00:13.182Z · score: 25 (11 votes) · EA · GW

Great post, thanks for writing this!

Aside from the interventions you and Tobias list, promoting (participation in) forecasting tournaments might be another way to reduce excessive polarization.

Mellers, Tetlock, and Arkes (2018) found that "[...] participants who actively engaged in predicting US domestic events were less polarized in their policy preferences than were non-forecasters. Self-reported political attitudes were more moderate among those who forecasted than those who did not. We also found evidence that forecasters attributed more moderate political attitudes to the opposing side."

However, people who are willing to participate in such a tournament for many months are presumably quite unrepresentative of the general population. Generally, my hunch is that it would be very difficult to convince many people to participate in such tournaments, especially if this requires active participation for a considerable amount of time. Still, promoting the institution of forecasting tournaments would have several other benefits.

(To be clear, I'm not sure that this intervention is particularly promising, I'm mostly brainstorming.)

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-11T11:56:18.281Z · score: 1 (3 votes) · EA · GW

Thanks for the pointer!

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-08T16:30:04.368Z · score: 5 (4 votes) · EA · GW

Thank you, great points.

It seems you think that one of the essential things is developing and using manipulation-proof measures of malevolence. If you were very confident we couldn't do this, how much of an issue would that be?

I wouldn't say it's "essential"—influencing genetic enhancement would still be feasible—though it would certainly be a big problem.

Regarding objective measures, there will be 'Minority Report' style objections to actually using them in advance, even if they have high predictive power.

Yes, that would be an issue.

The area where I see this sort of stuff working best is in large organisations, such as civil services, where the organisations have control over who gets promoted. I'm less optimistic this could work for the most important cases, political elections, where there is not a system that can enforce the use of such measures.

That seems true though it seems at least conceivable that voters will demand such measures in the future. (As an aside, you mention large organisations but it seems such measures could also be valuable when used in smaller (non-profit) organizations?)

But it's not clear to me how much of an innovation malevolence tests are over the normal feedback processes used in large organisations.

Yeah, true. I guess it's also a matter of how much (negative) weight you put on malevolent traits, how much of an effort you make to detect them, and how attentive you are to potential signs of malevolence—most people seem to overestimate their ability to detect (strategic) malevolence (at least I did so before reality taught me a lesson).

It might be worth adding that the reason the Myers-Briggs-style personality tests are, so I hear, more popular in large organisations than the (more predictive) "Big 5" personality test is that Myers-Briggs has no ostensibly negative dimensions.

Interesting, that seems plausible! I've always been somewhat bewildered by its popularity.

If this is the case, which seems likely, I find it hard to imagine that e.g. Google will insist that staff take a test they know will assess them on their malevolence!

True. I guess measures of malevolence would work best as part of the hiring process (i.e., before one has formed close relationships).

As a test for the plausibility of introducing and using malevolence tests, notice that we could already test for psychopathy but we don't. That suggests there are strong barriers to overcome.

I agree that there are probably substantial barriers to be overcome. On the other hand, it seems that many companies are using "integrity tests" which go in a similar direction. According to Sackett and Harris (1984), at least 5,000 companies used "honesty tests" in 1984. Companies were also often using polygraph examinations—in 1985, for example, about 1.7 million such tests were administered to (prospective) employees (Dalton & Metzger, 1993, p. 149)—until they became illegal in 1988. And this despite the fact that polygraph tests and integrity tests (as well as psychopathy tests) can be gamed rather easily.

I could thus imagine that at least some companies and organizations would start using manipulation-proof measures of malevolence (which is somewhat similar to the inverse of integrity) if it was common knowledge that such tests actually had high predictive validity and could not be gamed.

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-08T11:33:27.513Z · score: 4 (4 votes) · EA · GW

Thanks, that's a good example.

my impression is that corporations have rewarded malevolence less over time.

Yeah, I think that's probably true.

Just to push back a little bit, the pessimistic take would be that corporate executives simply have become better at signalling and public relations. Maybe also partly because the downsides of having bad PR are worse today compared to, say, the 1920s—back then, people were poorer and consumers didn't have the luxury to boycott companies whose bosses said something egregious; workers often didn't have the option to look for another job if they hated their boss, et cetera. Generally, it seems plausible to me that "humans seem to have evolved to emphasize signaling more in good times than in bad." (Hanson, 2009).

I wonder if one could find more credible signals of things like "caring for your employees", ideally in statistical form. Money invested in worker safety might be one such metric. Salary discrepancies between employees and corporate executives might be another one (which seem to have gotten much worse since at least the 1970s) though there are obviously many confounders here.

The decline in child labor might be another example of how corporations have rewarded malevolence less over time. In the 19th century, when child labor was common, some amount of malevolence (or at least indifference) was arguably beneficial if you wanted to run a profitable company. Companies run by people who refused to employ children for ethical reasons presumably went bankrupt more often given that they could not compete with companies that used such cheap labor. (On the other hand, it's not super clear what an altruistic company owner should have done. Many children also needed jobs in order to be able to buy various necessities—I don't know.)

Maybe this is simply an example of a more general pattern: Periods of history marked by poverty, scarcity, instability, conflict, and inadequate norms & laws will tend to reward or even require more malicious behavior, and the least ruthless will tend to be outcompeted (compare again Hanson's "This is the Dream Time", especially point 4).

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-06T14:07:25.500Z · score: 7 (3 votes) · EA · GW

Thanks, these are valid concerns.

  1. It seems that any research on manipulation-proof measures for detecting malevolence would help the development of tools that would be useful for a totalitarian state.

My guess is that not all research on manipulation-proof measures of malevolence would pose such dangers but it’s certainly a risk to be aware of, I agree.

  1. I'm sceptical of further research on malevolence being helpful in stopping these people from reaching positions of power. At first glance, I don't think a really well-developed literature on malevolence would have changed which leaders came to power in the 20th century.

In itself, a better scientific understanding of malevolence would not have helped, agreed. However, more reliable and objective ways to detect malevolence might have helped iff there also had existed relevant norms to use such measures and place at least some weight on them.

I think it really matters more who is in charge. I doubt bio-ethicists saying dark triad traits are bad will have much of an effect.

Bioethicists sometimes influence policy though I generally agree with your sentiment. This is also why we have emphasized the value of acquiring career capital in fields like bioinformatics.

In terms of further GWAS studies, I suspect by the time this becomes feasible more GWAS on desirable personality traits will have been undertaken.

I agree that this is plausible—though also far from certain. I’d also like to note that (very rudimentary) forms of embryo selection are already feasible, so the issue might be a bit time-sensitive (especially if you take into account that it might take decades to acquire the necessary expertise and career capital to influence the relevant decision makers).

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-06T13:32:43.907Z · score: 6 (5 votes) · EA · GW

Thank you! I agree that the distinction between affective and cognitive empathy is relevant, and that low affective empathy (especially combined with high cognitive empathy) seems particularly concerning. I should have mentioned this, at least in the footnote you quote.

And I remember being told during my psych undergrad that "psychopaths" have low levels of affective empathy but roughly average levels of cognitive empathy, while people on the autism spectrum have low levels of cognitive empathy but roughly average levels of affective empathy. (I haven't fact-checked this, though, or at least not for years.)

That sounds right. According to my cursory reading of the literature, psychopathy and all other Dark Tetrad traits are characterized by low affective empathy. While all Dark Tetrad traits except for narcissism also seem to correlate with low cognitive empathy, the correlation with diminished affective empathy seems substantially more pronounced (Pajevic et al., 2017, Table 1; Wai & Tiliopoulos, 2012, Table 1).[1] As you write, people on the autism spectrum basically show the opposite pattern: normal affective empathy, lower cognitive empathy (Rogers et al., 2006; Rueda et al., 2015).

We focused on the Dark Tetrad traits because they overall seem to better capture the personality characteristics we find most worrisome. Low affective empathy seems a bit too broad of a category as there are several other psychiatric disorders which don’t seem to pose any substantial dangers to others but which apparently involve lower affective empathy: schizophrenia (Bonfils et al., 2016), schizotypal personality disorder (Henry et al., 2007, Table 2), and ADHD (Groen et al., 2018, Table 2).[2]

Of course, the simplicity of a unidimensional construct has its advantages. My tentative conclusion is that the D-factor (Moshagen et al., 2018) captures the most dangerous personalities a bit better than low affective empathy—though this probably depends on the precise operationalizations of these constructs. In any case, more research on (diminished) affective empathy seems definitely valuable as well.

  1. Though Jonason and Krause (2013) found that narcissism actually correlates with lower cognitive empathy and showed no correlation with affective empathy. ↩︎

  2. This list is not necessarily exhaustive. ↩︎

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-06T10:37:40.814Z · score: 3 (2 votes) · EA · GW

Thanks! Added your suggestion.

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-05T12:50:43.772Z · score: 4 (3 votes) · EA · GW

A lot of Nazis were interested in the occult, and Mao wrote poetry.

Good point, my comment was worded too strongly. I’d still guess that malevolent individuals are, on average, less interested in things like Buddhism, meditation, or psychedelics.

Do you know where this worry [that psychedelics sometimes seem to decrease people’s epistemic and instrumental rationality] comes from?

Gabay et al. 2019 found that MDMA boosted people's cooperation with trustworthy players in an iterated prisoner's dilemma, but not with untrustworthy players. I take that as some evidence that MDMA doesn't acutely harm one's rationality.

Interesting paper! Though I didn’t have MDMA in mind; with “psychedelics” I meant substances like LSD, DMT, and psilocybin. I also had long-term effects in mind, not immediate effects. Sorry about the misunderstanding.

One reason for my worry is that people who take psychedelics seem more likely to believe in paranormal phenomena (Luke, 2008, p. 79-82). Of course, correlation is not causation. However, it seems plausible that at least some of this correlation is due to the fact that consuming psychedelics occasionally induces paranormal experiences (Luke, 2008, p. 82 ff.) which presumably makes one more likely to believe in the paranormal. This would also be in line with my personal experience.

Coming back to MDMA: I agree that the immediate, short-term effects of MDMA are usually extremely positive—potentially enormous increases in compassion, empathy, and self-reflection. However, MDMA's long-term effects on those variables seem much weaker, though potentially still positive (see Carlyle et al., 2019, p. 15).

Overall, my sense is that MDMA and psychedelics might have a chance to substantially decrease malevolent traits if these substances are taken with the right intentions and in a good setting—ideally in a therapeutic setting with an experienced guide. The biggest problem I see is that most malevolent people likely won’t be interested in taking MDMA and psychedelics in this way.

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-04T17:35:38.174Z · score: 16 (9 votes) · EA · GW

Thank you, excellent points!

I will probably add some of your intervention ideas to the article (I'll let you know in that case).

I felt that this article could have said more about possible policy interventions and that it dismisses policy and political interventions as crowded too quickly.

Sorry about that. It certainly wasn't our intention to dismiss political interventions out of hand. The main reason for not writing more was our lack of knowledge in this space, which is why our discussion ends with "We nevertheless encourage interested readers to further explore these topics". In fact, a comment like yours—containing novel intervention ideas written by someone with experience in policy—is pretty much what we were hoping to see when writing that sentence.

Better mechanisms for judging individuals. Eg ensuring 360 feedback mechanisms are used routinely to guide hiring and promotion decisions as people climb political ladders. (I may do work on this in the not too distant future)

Very cool! This is partly what we had in mind when discussing manipulation-proof measures to prevent malevolent humans from rising to power (where we also briefly mention 360 degree assessments).

For what it's worth, Babiak et al. (2010) seemed to have some success with using 360 degree assessments to measure psychopathic traits in a corporate setting. See also Mathieu et al. (2013).

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-05-01T11:49:30.053Z · score: 10 (8 votes) · EA · GW

It seems plausible that institutional mechanisms that prevent malevolent use of power may work well today in democracies.

I agree that they probably work well but there still seems to be room for improvement. For example, Trump doesn't seem like a beacon of kindness and humility, to put it mildly. Nevertheless, he got elected President. On top of that, he wasn't even required to release his tax returns—one of the more basic ways to detect malevolence.

Of course, I agree that stable and well-functioning democracies with good cultural norms would benefit substantially less from many of our suggested interventions.

Also, the major alternative to reducing the influence of malevolent actors may be in the institutional decision making itself, or some structural interventions. AI Governance as a field seems to mostly go in that route, for example.

Just to be clear, I'm very much in favor of such "structural interventions". In fact, they overall seem more promising to me. However, it might not be everyone's comparative advantage to contribute to them which is why I thought it valuable to explore potentially more neglected alternatives where lower-hanging fruits are still to be picked.

That said, I think that efforts going into your suggested interventions are largely orthogonal to these alternatives (and might actually be supportive of one another).

Yes, my sense is that they should be mutually supportive—I don't see why they shouldn't. I'm glad you share this impression (at least to some extent)!

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-04-30T17:31:13.754Z · score: 10 (8 votes) · EA · GW

(Also, it might not be obvious from my nitpicking, but I really like the post, thanks for it :-).)

Thank you. :) No worries, I didn't think you were nitpicking. I agree with many of your points.

[...] if top-secret security clearance was a requirement for holding important posts, a lot of grief would be avoided (at least where I am from). Yet we generally do not use this tool. Why is this? I suspect that whatever the answer is, it will apply to malevolence-detection techniques as well.

One worry with security clearances is that they tend to mostly screen for impulsive behaviors such as crime and drug use (at least, according to my limited understanding of how these security clearances work) and would thus often fail to detect more strategic malevolent individuals.

Also, your claim that “we generally do not use this tool [i.e., security clearances]” feels too strong. For example, 5.1 million Americans seem to have a security clearance. Sounds like a lot to me. (Maybe you had a different country in mind.)

I conjecture that the impact of this agenda will be bottlenecked on figuring out how to leave the malevolent people a line of retreat; making sure that if you score high on this, the implications aren't that bad.

Good point. I guess we weren’t sufficiently clear in the post about how we envision the usage of manipulation-proof measures of malevolence. My view is that their results should, as a general rule, not be made public and that individuals who are diagnosed as malevolent should not be publicly branded as such. (Similarly, my understanding is that if someone doesn’t get a top level security clearance because they, for instance, have a serious psychiatric disorder, they only don't get the job requiring the security clearance—it's not like the government makes their mental health problems public knowledge.)

My sense is that malevolent individuals should only be prevented from reaching highly influential positions like becoming senators, members of Congress, mayors, CEOs of leading AGI companies, et cetera. In other words, the great majority of jobs would still be open to them.

Of course, my views on this issue are by no means set in stone and still evolving. I’m happy to elaborate on my reasons for preferring this more modest usage if you are interested.

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-04-30T16:15:40.093Z · score: 7 (6 votes) · EA · GW

Thank you!

Agree, these posts are excellent. For what it's worth, I share Gwern's pessimistic conclusion about the treatment of psychopathy. Other Dark Tetrad traits—especially if they are less pronounced—might be more amenable to treatment though I'm not especially optimistic.

However, even if effective treatment options existed, the problem remains that the most dangerous individuals are unlikely to ever be motivated to seek treatment (or be forced to do so).

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-04-30T12:19:04.820Z · score: 18 (8 votes) · EA · GW

Thank you, good points.

I agree that it’s not clear whether sadism is really worse than the Dark Triad traits.

e.g., Hitler was kind of cautious concerning the risk of nuclear weapons igniting the atmosphere,

I’m not sure whether Hitler and Stalin were more sadistic than they were Machiavellian or narcissistic. If I had to decide, I’d say that their Machiavellianism and narcissism were more pronounced than their sadism.

Regarding Hitler being “kind of cautious”: It seems plausible to me that Hitler was less cautious than a less malevolent counterfactual German leader.

What is more, it seems not too unlikely that Hitler would have used nuclear weapons in 1945 if he had had access to them. For instance, see this quote from Glad (2002, p. 32):

As the Allies closed in on Berlin, [Hitler] wanted churches, schools, hospitals, livestock, marriage records, and almost anything else that occurred to him to be destroyed. In April 1945 he wanted the entire leadership of the Luftwaffe to be summarily hanged. He considered bringing about the destruction of German cities by announcing the execution of all Royal Air Force war prisoners, so that there would be massive bombing in reprisal. He may also have given orders for all wounded German soldiers to be killed. His aim became the destruction of Germany in the greatest Gotterdaemmerung of history. As Albert Speer said about Hitler's obsession with architecture: "Long before the end I knew that Hitler was not destroying to build, he was building to destroy."

You write:

and Stalin was partially useful in WW II and avoided WW III.

I'm not sure about that. Stalin was kind of Hitler's ally until Hitler invaded Russia. In other words, if it weren't for Hitler (and his overconfidence), Stalin might have even helped Hitler to win WW II (though Hitler might have lost WW II even with Stalin's support, I don't know). I'm no historian, but Stalin's response to Hitler's invasion also seems to have been rather disastrous (see also Glad, 2002, p. 8) and might have resulted in Russia losing WW II.

Resources like Khlevniuk (2015) also suggest that Stalin, if anything, was detrimental to Russia's military success, at least during the first years of the war.

It might be hard to spot dark tetrad individuals, but it’s not so hard to realize an individual is narcissistic or manipulative.

I generally agree, but there also seem to be exceptions. The most strategic individuals know that narcissism and manipulativeness can be very off-putting, so they will actively try to hide these traits. Stalin, for example, often pretended to be humble and modest, in public as well as in his private life, with at least some success. My impression from reading the various books listed in the references is that Stalin and Mao managed to rise to the top partly because they successfully deceived others into believing that they were more modest and less capable, ruthless, and strategic than they actually were.

Also, it’s often not clear where justified confidence ends and narcissism begins—certainly not to the admirers of the leader in question. Similarly, what some would see as benign and even necessary instances of being realistic and strategic, others describe as unprincipled and malicious forms of manipulativeness. (The Sanders supporters who don't want to vote for Biden come to mind here even though this example is only tangentially related and confounded by other factors.)

So why do such guys acquire power? Why do people support it? We often dislike acquaintances that exhibit one of the dark traits; then why do people tolerate it when it comes from the alpha male boss?

Good questions. Indeed, some people do not only seem to tolerate these traits in leaders, they seem to be actively drawn to such "strong men" who display certain Dark Tetrad characteristics. I have some theories for why this is but I don't fully understand the phenomenon.

If we could just acknowledge that dark triad traits individuals are very dangerous, even if they’re sometimes useful (like a necessary evil), then perhaps we could avoid (or at least be particularly cautious with) malevolent leaders.

I’m definitely sympathetic to this perspective.

One important worry here is that different movements/political parties seem to be in a social dilemma. Let’s assume—which seems relatively plausible—that leaders with a healthy dose of narcissism and Machiavellianism are, on average, really better at things like motivating members of their movement to work harder, inspiring onlookers to join their movement, creating and taking advantage of high-value opportunities, and so on. If one movement decides to exclude members who show even small signs of Dark Triad traits, it seems plausible that this movement will be at a disadvantage compared to other movements that are more tolerant regarding such traits. Conversely, movements that tolerate (or even value) Dark Triad traits might be even more likely to rise to power. It seems very important to avoid such outcomes.

I’m also somewhat worried about "witch hunts" fueled by simplistic conceptions of malevolence. I know of several individuals who exhibit at least some level of Machiavellian and narcissistic traits—many even say so themselves—but whom I’d still love to give more influence because I believe that they are “good people” who are aware of these traits and who channel these traits towards the greater good. (Admittedly, a supporter of, say, Mao might have said the same thing back in the days.)

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-04-30T11:14:51.550Z · score: 35 (16 votes) · EA · GW

There hasn't been a proper RCT yet though.

Pokorny et al. (2017) seems like a relevant RCT. They found that psilocybin significantly increased empathy.

However, even such results don’t make me very optimistic about the use of psychedelics for reducing malevolence.

The kind of individuals that seem most dangerous (people with highly elevated Dark Tetrad traits who are also ambitious, productive and strategic) seem less likely to be interested in taking psychedelics—such folks don't seem interested in increasing their empathy, becoming less judgmental or having spiritual experiences. In contrast, the participants of the Pokorny et al. study—like most participants in current psychedelics studies (I think)—wanted to take psychedelics which is why they signed up for the study.

Moreover, my sense is that psychedelics are most likely to increase openness and compassion in those who already started out with some modicum of these traits and who would like to increase them further. I’m somewhat pessimistic that giving psychedelics to highly malevolent individuals would make them substantially more compassionate. That being said, I'm certainly not confident in that assessment.

My intuition is partly based on personal experience and anecdotes but also more objective evidence like the somewhat disappointing results of the Concord Prison Experiment. However, due to various methodological flaws, I'd be hesitant to draw strong conclusions from this experiment.

Overall, I’d nevertheless welcome further research into psychedelics and MDMA. It would still be valuable if these pharmaceutical agents “only" increase empathy in individuals who are already somewhat empathic.

Anecdotally, a lot of Western contemplative teachers got started on that path because of psychedelic experiences (Zen, Tibetan Vajrayana, Vipassana, Advaita Vedanta, Kashmiri Shaivism). These traditions are extremely prosocial & anti-malevolent.

My guess is that most Western contemplative teachers who, as a result of taking psychedelics, got interested in Buddhism and meditation (broadly defined) were, on average, already considerably more compassionate, idealistic, and interested in spiritual questions than the type of practically oriented, ambitious, malevolent people I worry about.

As an aside, I'm much more optimistic about the use of psychedelics, empathogens, and entactogens for treating other issues such as depression or PTSD. For example, the early results on using MDMA for treatment-resistant PTSD seem extremely encouraging (and Doblin's work in general seems promising).

Aside from the obvious dangers relating to bad trips, psychosis, neurotoxicity (which seems only relevant for MDMA), et cetera[1], my main worry is that psychedelics sometimes seem to decrease people’s epistemic and instrumental rationality. I also observed that they sometimes seem to have shifted people’s interests towards more esoteric matters and led to “spiritual navel-gazing”—of course, this can be beneficial for people whose life goals are comparatively uninformed.

  1. Though my impression is that these risks can be reduced to tolerable levels by taking psychedelics only in appropriate settings and with the right safety mechanisms in place. ↩︎

Comment by david_althaus on Reducing long-term risks from malevolent actors · 2020-04-30T11:12:52.924Z · score: 4 (4 votes) · EA · GW

Thank you.

The section “What about traits other than malevolence?” in Appendix B briefly discusses this.

Comment by david_althaus on Genetic Enhancement as a Cause Area · 2020-01-20T17:02:46.986Z · score: 3 (3 votes) · EA · GW

I don't know why Tielbeek says that, unless he's confusing SNP heritability with PGS: a SNP heritability estimate is unconnected to sample size. Increasing n will reduce the standard error but assuming you don't have a pathological case like GCTA computations diverging to a boundary of 0, it should not on average either increase or decrease the estimate... Better imputation and/or sequencing more will definitely yield a new, different, larger SNP heritability, but I am really doubtful that it will reach the family-based estimates: using pedigrees in GREML-KIN doesn't reach the family-based Neuroticism estimate, for example, even though it gets IQ close to the IQ lower bound.

Thanks, all of that makes sense, agree. I also wondered why SNP heritability estimates should increase with sample size.

To summarize, my sense is the following: Polygenic scores for personality traits will likely improve over the medium term, but are very unlikely to ever predict more than, say, ~25% of variance (and for agreeableness maybe never more than ~15% of variance). Still, there is a non-trivial probability (>15%) that we will be able to predict at least 10% of variance in agreeableness from DNA alone within 20 years, and a >50% probability that we can predict at least 5% of variance in agreeableness from DNA alone within 20 years.

Or do you think these predictions are still too optimistic?

[The paper you link] shows that the family-specific rare variants (which are still additive, just rare) are almost twice as large as the common variants.

Interesting, thanks.

But couldn’t one still make use of rare variants, especially in genome synthesis? Maybe also in other settings?

The value of the possible selection for the foreseeable future will be very small, and is already exceeded by selection on many other traits, which will continue to progress more rapidly, increasing the delta, and making selection on personality traits an ever harder sell to parents since it will largely come at the expense of larger gains on other traits.

I agree that selecting for IQ will be much easier and more valuable than selecting for personality traits. It could easily be the case that most parents will never select for any personality traits.

However, especially if we consider IES or genome synthesis, even small reductions in dark personality traits—such as extreme sadism—could be very valuable from a long-termist perspective.

For example, assume it’s 2050, IES is feasible and we can predict 5% of the variance in dark traits like psychopathy and sadism based on DNA alone. There are two IES projects: IES project A only selects for IQ (and other obvious traits relating to e.g. health), IES project B selects for IQ and against dark traits, otherwise the two projects are identical. Both projects use 1-in-10 selection, for 10 in vitro generations.

According to my understanding, the resulting average psychopathy and sadism scores of the humans created by project B could be about one SD* lower than in project A. Granted, the IQ scores would also be lower, but probably by no more than 2 standard deviations (though I don’t know how to calculate this, and it could also be more).

It depends on various normative and empirical views whether this is worth it, but it very well might be: 180+IQ humans with extreme psychopathy or sadism scores might substantially increase all sorts of existential risks, and project A would create almost 17 times** as many such humans compared to project B, all else being equal.

The case for trying to reduce dark traits in humans created via genome synthesis seems even stronger.

One could draw an analogy with AI alignment efforts: Project A has a 2% chance of creating an unaligned AI (2% being the prevalence of humans with psychopathy scores 2 SDs above the norm). Project B has only a 0.1% chance of creating an unaligned AI. Project B is often preferable even if it's more expensive and/or its AI is less powerful.

*See the calculation in my comment above: a PGS explaining 4% of the variance in a trait can reduce that trait by ~0.2 standard deviations in one generation. This might enable ~1 SD over 10 in vitro generations, though perhaps one would run out of additive variance long before that.

**pnorm(12, mean=10, sd=1, lower.tail=FALSE) / pnorm(12, mean=9, sd=1, lower.tail=FALSE) = 16.85. This defines extreme psychopathy and/or sadism as being 2 SDs or more above the norm, assumes that these traits are normally distributed, and assumes that project B’s average scores are indeed 1 SD lower than project A’s. (It also assumes that the IQ means of the two projects are identical, which is not realistic.)
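The footnote’s R one-liner can be checked with Python’s standard library (a sketch; the 2-SD cutoff and the assumed 1 SD difference in means are the assumptions stated above):

```python
from statistics import NormalDist

def tail_fraction(mean: float, threshold: float = 12.0, sd: float = 1.0) -> float:
    """Fraction of a normal population scoring above `threshold`
    (equivalent to R's pnorm(threshold, mean, sd, lower.tail=FALSE))."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(threshold)

# Project A (mean 10) vs. project B (mean 9); "extreme" = 12+, i.e. 2 SDs above A's mean
ratio = tail_fraction(10.0) / tail_fraction(9.0)
print(round(ratio, 2))  # ~16.85
```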

Comment by david_althaus on Genetic Enhancement as a Cause Area · 2019-12-28T16:28:06.920Z · score: 17 (7 votes) · EA · GW

(Epistemic disclaimer: My understanding of genetics is very limited.)

If additive heritability for all the relevant personality traits were zero, many interventions in this area would indeed be pointless, yes.

I might have underestimated this problem but one reason why I haven’t given up on the idea of selecting against “malevolent” traits is that I’ve come across various findings indicating SNP heritabilities of around 10% for relevant personality traits. (See the last section of this comment for a summary of various studies).

SNP heritabilities of ~10% (or even more) for the relevant personality traits also seem plausible on theoretical grounds. If I understand Penke et al. (2007, see Table 1 in particular) correctly, balancing-selection models of personality predict that personality traits should show less additive heritability than, say, cognitive ability, but not (necessarily) zero additive heritability.

Granted, 10% is pretty low, but is it hopelessly low? According to Karavani et al. (2019), a polygenic score for IQ which explains 4% of the variance would enable an average increase of 3 IQ points (assuming 10 available embryos). I infer from this that a polygenic score which can explain only ~4% of the variance in, say, psychopathy would still enable a reduction of ~1/5 of a standard deviation in average psychopathy scores, assuming 10 embryos. Polygenic scores explaining ~10% of the variance might thus enable considerably larger average reductions of ⅓–½ of a standard deviation or so (numbers pulled out of my posterior).
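The inference above can be sketched with a standard order-statistic approximation (this is not Karavani et al.’s exact simulation; it assumes that polygenic-score variance among siblings is half the population value, a common simplification):

```python
from statistics import NormalDist

std = NormalDist()

def expected_max_of_n(n: int, lo: float = -8.0, hi: float = 8.0, steps: int = 16000) -> float:
    """E[max of n iid standard normals], by numerically integrating
    x * n * Phi(x)**(n-1) * phi(x) with the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += x * n * std.cdf(x) ** (n - 1) * std.pdf(x) * dx
    return total

def expected_gain_sd(r2: float, n_embryos: int = 10) -> float:
    """Expected phenotype shift (in population SDs) from picking the
    best-scoring of n embryos, with a PGS explaining r2 of the variance
    and sibling PGS variance assumed to be half the population value."""
    return expected_max_of_n(n_embryos) * (r2 / 2) ** 0.5

print(round(expected_gain_sd(0.04) * 15, 1))  # ~3.3 IQ points, close to Karavani et al.'s 3
print(round(expected_gain_sd(0.04), 2))       # ~0.22 SD, i.e. roughly 1/5 of a SD
```

So a PGS explaining 4% of the variance yields roughly the 1/5-of-a-SD shift claimed above, and plugging in r2 = 0.10 gives ~0.34 SD, in the ⅓–½ range.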

Again, ⅓ of a SD might seem underwhelming but, as you emphasize in your essay on embryo selection, small changes in the mean of a normal distribution can have large effects out on the tails, so this could still lead to surprisingly large reductions in the frequency of extreme psychopathy or sadism (~psychopathy scores 2-3 SDs above the norm), even in “normal” IVF embryo selection. When applied in iterated embryo selection (IES), this could result in much stronger effects still.

Again, I could easily be wrong about any of the above.

Will SNP heritability estimates increase with larger sample sizes?

This is at least what Tielbeek et al. (2017) suggest: “Recent GWASs on other complex traits, such as height, body mass index, and schizophrenia, demonstrated that with greater sample sizes, the SNP h2 increases. [...] we suspect that with greater sample sizes and better imputation and coverage of the common and rare allele spectrum, over time, SNP heritability in ASB [antisocial behavior] could approach the family based estimates.”

Higher additive heritability for personality disorders?

Another point that makes me somewhat hopeful is that specific personality disorders seem to show larger additive heritabilities than personality traits. For example, the meta-analysis by Polderman et al. (2015, Table 2) suggests that 93% of all studies on specific personality disorders “are consistent with a model where trait resemblance is solely due to additive genetic variation”. (Of note, for “social values” this fraction is still 63%).

And a lot of the benefits in this area might come from selecting against, say, antisocial or narcissistic personality disorder (sadly, sadistic personality disorder is no longer an official diagnosis, though it was included in the appendix of DSM-III-R).

But it’s been a while since I read the Polderman paper and I’m also a bit confused by how there can be high additive heritability for, say, narcissistic personality disorder but very low additive heritability for narcissism as a trait, so the above might be wrong.

Some interventions in this area don’t require additive heritability

There are also interventions that work even if additive heritability is zero, though they assume (I think) that the non-additive genetic variance is at least partly due to dominance and not solely due to epistasis. For example, ensuring that the parents of the first generation of IES embryos score low on dark tetrad traits, or influencing the first genome synthesis projects to make their first genomes as similar as possible to those of people scoring low on dark tetrad traits (alongside edits to achieve substantial IQ increases, of course).

Lastly, there are interventions that have nothing to do with genetic enhancement but would benefit from more research on, and advocacy for, screening against malevolent traits, and are thus somewhat related to the above. For example, it seems valuable to develop better measures of malevolent traits, ideally ones that are impossible to game, such as neuroimaging techniques. Such measures could then be used in various high-impact settings. For example, they would enable decision makers to better screen for highly elevated dark tetrad traits in government officials, in humans whose brains will be used to create the first ems, and in human overseers in AI projects. (Currently, all measures of malevolence seem to be self-report questionnaires or interviews, which smart psychopaths could easily game.)

Is non-additive genetic variance really useless?

(No need to reply to the questions in this section.)

Assume that all of the genetic variance in trait A is due to dominance. Wouldn’t it still be possible to achieve non-zero increases/decreases in trait A via (iterated) embryo selection?

And what about epistasis? Is it just that there are quadrillions of possible combinations of interactions and so you would need astronomical sample sizes to achieve sufficient statistical power after correcting for multiple comparisons?

Some evidence for non-zero SNP heritabilities of relevant personality traits

Table 4 of Sanchez‐Roige et al. (2018) provides a good summary. Below, I focus on studies examining traits that likely correlate with dark tetrad traits, such as agreeableness and conscientiousness.

The UK Biobank (N ≈ 290k) estimates SNP heritabilities of around 10% for various personality traits, some of which probably even correlate with psychopathy, such as the items “Do you often feel guilty?” and “Do you worry too long after an embarrassing experience?”. (Don’t get me wrong, I’m not saying that we should select against traits that only correlate with psychopathy while being completely fine in themselves, like not often feeling guilty. I’m just listing these items to support the hypothesis that as long as we can find SNP heritability for them, we can expect to find SNP heritabilities for related traits as well.) Unfortunately, the UK Biobank doesn’t seem to have measured any personality trait apart from neuroticism (estimated SNP heritability: 11%). Also, they usually didn’t even use Likert scales, only dichotomous yes/no responses, which might attenuate heritability estimates (though I’m unsure how large this effect is).

A GWAS (N = 46,861) by Warrier et al. (2018) found that the tested SNPs together explain 11% of the variance in the Empathy Quotient. (The Empathy Quotient contains items like “I get upset if I see people suffering on news programmes” and “I really enjoy caring for other people” and thus probably correlates negatively with dark tetrad traits.)

Verweij et al. (2012) give a SNP heritability of 6.6% for harm avoidance which likely correlates with dark tetrad traits.

Lo et al. (2017) estimate SNP-based heritabilities of 18% for extraversion, 8.5% for agreeableness and 9.6% for conscientiousness (see supplementary table 2). (N = 59,176).

Granted, Power and Pluess (2015) estimate SNP heritability of agreeableness and conscientiousness as 0%. However, their sample size of 5,011 is much smaller than the sample sizes above and they write: “It is worth noting that the large standard errors around the negative findings suggest increased sample size may identify a low but significant level of heritability.”

Comment by david_althaus on Genetic Enhancement as a Cause Area · 2019-12-26T16:01:15.546Z · score: 1 (1 votes) · EA · GW

Great! Sent you a PM.

Comment by david_althaus on Genetic Enhancement as a Cause Area · 2019-12-25T17:31:01.386Z · score: 15 (9 votes) · EA · GW

Great post!

I've been thinking along similar lines though I put the emphasis on selecting against dark tetrad traits.

I've written a longer Google Doc (which is not quite ready for publication yet). Would you be interested in taking a look?

Comment by david_althaus on Descriptive Population Ethics and Its Relevance for Cause Prioritization · 2018-04-16T11:54:47.767Z · score: 0 (0 votes) · EA · GW

Interesting, yeah, thx for the pointer!

Comment by david_althaus on Descriptive Population Ethics and Its Relevance for Cause Prioritization · 2018-04-05T09:04:29.252Z · score: 3 (3 votes) · EA · GW

That's not descriptive ethics though, that's regular moral philosophy.

Fair enough. I was trying to express the following point: One of the advantages of descriptive ethics, especially if done via a well-designed questionnaire/survey, is that participants will engage in some moral reflection/philosophy, potentially illuminating their ethical views and their implications for cause prioritization.

For the 2nd point, moral compromise on a movement level makes sense but not in any unique way for population ethics. It's no more or less true than it is for other moral issues relevant to cause prioritization.

I agree that there are other issues, including moral ones, besides views on population ethics (one’s N-ratios and E-ratios, specifically) that are relevant for cause prioritization. It seems to me, however, that the latter are comparatively important and worth reflecting on, at least for people who have so far spent only a very limited amount of time doing so.

Comment by david_althaus on What do DALYs capture? · 2017-09-28T18:46:13.578Z · score: 3 (3 votes) · EA · GW

As long as people make sensible judgements about the health states that actually occur, it doesn't matter what they say in impossible ones.

Good point. But I wonder whether they reinterpret the meanings of some of the dimensions of the EQ-5D in order to make sense of some of the health states they are asked to rate.

Unless you know where neutral is you can't specify the minimum point on the scale, because it doesn't make sense.


What would -1 mean here? DALYs and QALYs aren't well-being scales and can't straightforwardly be interpreted as such.

This depends on the study. I'm afraid it will take me a couple of paragraphs to explain the methodology, but I hope you'll bear with me :)

The literature review by Tilling et al. (2010) concluded that only 8% of all TTO studies even allow for subjects to rate health states as worse than death (i.e. as below 0), so for the vast majority of studies, the minimum point on the scale is indeed 0. I think this is problematic since e.g. health states like 33333 (if they are permanent) are probably worse than death for many, maybe even most people.

Of the few TTO studies that allow for negative values, almost all use the protocols by Torrance et al. (1982) or Dolan (1997). Below is a quote by Tilling et al. (2010) describing these two methods:

“The method developed by Torrance et al. (1982) gives respondents a choice between a scenario of living in full health for ti years followed by the state to be valued for tj years (ti + tj = T), followed by death, and an alternative scenario, which is to die immediately. The value T is fixed (e.g., 10 y). The value of ti (and therefore also the value of tj) is varied until a point of indifference is found between the 2 scenarios. The utility value for that health state is then given by –ti/tj. [... Dolan (1997)] used a method similar to this, but the 1st scenario is to live in the health state to be valued for tj years followed by full health for ti years (i.e., the ordering of the 2 states is reversed).”

These two TTO protocols would, in theory, allow for extremely negative (even infinitely negative) values. Tilling et al. (2010) explain:

“[...] negative values can be extremely negative. A participant who would not accept any amount of time, however short, in a poor state of health is implying that such a state is infinitely bad.”

How do researchers respond? Again, I’ll quote Tilling et al. (2010, emphasis mine):

“Given the mathematical intractability of dealing with negative infinity (a single value of negative infinity in a sample of respondents would give a mean value of negative infinity), researchers usually censor such responses. Under such censoring, the lower bound is determined by the (relatively arbitrary) choice of the smallest unit of time the TTO procedure will iterate toward.”

In the two most commonly used TTO protocols, the smallest unit of time the TTO procedure iterates toward for SWD is 1 year. Consequently, the lower bound is -9. (Sometimes, the smallest unit of time is 3 months, so the lowest possible value is -39.)

To give a concrete example: The subject is indifferent between A) living for 8 years in full health and for 2 years in health state 33333 and B) dying immediately. Thus, the value for health state 33333, for this subject, is -8/2 = -4.

Almost all researchers then transform these values such that the lowest possible value is -1. In my view, this is somewhat arbitrary.
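To make the arithmetic concrete, here is a minimal sketch of the Torrance-style raw valuation and one simple censoring rule (a plain floor at -1 is only an illustration; the transformations actually used in the literature vary, see Lamers, 2007):

```python
def tto_swd_value(t_full: float, t_state: float) -> float:
    """Raw utility for a state worse than dead under a Torrance-style TTO:
    at indifference, t_full * 1 + t_state * h = 0, so h = -t_full / t_state."""
    return -t_full / t_state

def censor(h: float, lower_bound: float = -1.0) -> float:
    """Illustrative ex-post censoring to a fixed lower bound (real studies
    apply several different transformations)."""
    return max(h, lower_bound)

# T = 10 years, smallest time unit 1 year: the worst elicitable raw value is -9/1
raw = tto_swd_value(t_full=9, t_state=1)
print(raw, censor(raw))  # -9.0 -1.0
```

Note how much information the censoring step discards: raw values of -1.5 and -9 are mapped to the same -1.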

Below some quotes by Devlin et al. (2011) on the matter:

“Because the elicitation procedure produces such extreme negative values, researchers have responded by doing ex post transformations to bound negative valuations to - 1 in various ways (Lamers, 2007). Crucially, once transformed, the negative numbers for SWD can no longer be interpreted as ‘utility’ scores, measured on the same scale as those for SBD (Patrick et al., 1994). Yet standard practice in calculating QALYs is to treat all values reported in value sets as commensurable. For example, an improvement from - 0.2 (an SWD) to 0, experienced over one year is interpreted as, producing a gain of 0.2 QALYs, and this is treated [...] as identical to an improvement from 0 to 0.2 experienced for one year, whereas the underlying ‘untransformed value’ for the SWD might suggest these two improvements in health are valued quite differently.”


“A related issue is whether or not values of negative states should be bounded to –1. It is not obvious why there should be no states worse than –1. For example, the phrase ‘it would have been better if he had never been born’ could truly be applied to people who have undergone torture and other types of brief but extreme suffering. There is no theoretical basis for imposing a limit on the level of disutility associated with these extreme sufferings.”

And here another quote by Tilling et al (2010):

“[...] it is not obvious why there should be no states worse than –1. Although it makes data analysis easier to transform values in this fashion, arguably 1 y of extreme pain and discomfort might provide as much disutility as 2 y of full health provides in utility.”

I hope this explains my previous comment.


Devlin, N. J., Tsuchiya, A., Buckingham, K., & Tilling, C. (2011). A uniform time trade off method for states better and worse than dead: Feasibility study of the ‘lead time’ approach. Health Economics, 20(3), 348-361.

Dolan, P. (1997). Modeling Valuations for EuroQol Health States. Medical Care, 35(11), 1095-1108.

Tilling, C., Devlin, N., Tsuchiya, A., & Buckingham, K. (2010). Protocols for Time Tradeoff Valuations of Health States Worse than Dead: A Literature Review. Medical Decision Making, 30(5), 610-619.

Torrance, G. W., Boyle, M. H., & Horwood, S. P. (1982). Application of Multi-Attribute Utility Theory to Measure Social Preferences for Health States. Operations Research, 30(6), 1043-1069.

Comment by david_althaus on What do DALYs capture? · 2017-09-24T10:50:12.069Z · score: 4 (4 votes) · EA · GW

Great post!


For instance, the worst possible health state would be represented by “11111”.

I think "11111" usually refers to full health. (cf. the "EQ-5D Value Sets: Inventory, Comparative Review and User Guide" by Szende, Oppe & Devlin, 2007).

As part of a bigger project on descriptive (population) ethics, I've been working on a literature review of health economics. It also contains a section on the EQ-5D and its weaknesses. Here some excerpts:

Problem II: Impossible health states

Another problem is that many health states, such as 22123, are psychologically impossible or at least very implausible. E.g. if you have “no problems with performing your usual activities (work, study, housework, family or leisure activities, etc.)”, you can’t simultaneously suffer from “extreme depression”. This is immediately obvious to anyone who has ever suffered from severe depression.

I’d guess that almost as much as 20% of all EQ-5D health states are psychologically impossible. This indicates that the whole system is suboptimal.

Problem III: Using “immediate death”

Another problem is that subjects are often asked to choose between “immediate death” and the alternative scenario. However, this means that the subject is unable to say goodbye to their loved ones or get their affairs in order. Arguably, dying immediately versus dying in, say, 3 months can make an enormous difference.

(Incorporating the TTO lead-time approach can easily overcome this problem.)

Anyway, you write:

First, DALYs are biased towards physical health. The instruments used for eliciting them and affective forecasting errors cause mental health to be underrepresented.

I couldn't agree more.

IMHO, another big problem is the evaluation of states worse than death (SWD) (and states of severe mental illness such as depression arguably belong in this category). For example, most studies don’t even allow for SWD assessments. Furthermore, most researchers transform negative evaluations, limiting them to a lower bound of -1. Assuming that people with a history of mental illness more often evaluate health states indicating severe mental illness as highly negative (i.e. give utilities lower than -1), this ex-post transformation causes their judgments to have less influence than the judgments of uninformed people who underestimate the severity of mental illness.

I discuss this problem, as well as other problems, in much greater detail in my doc.

I plan on publishing the doc within the next months, but if you're interested I'm happy to send you a link to the current version.

Comment by david_althaus on What the EA community can learn from the rise of the neoliberals · 2017-01-20T17:53:16.749Z · score: 1 (1 votes) · EA · GW

Otoh, a few decades later handwashing did become mainstream. So I'd think that correct and clearly useful models have a great advantage in becoming adopted eventually. Good strategy/movement building is more relevant for hastening the rate of adoption.

To take another example: Communism profited from extremely good strategy/movement building at the beginning (Engels being one of the first EtGlers ever). But it ultimately failed to become widely accepted because it brought about bad consequences. Admittedly, it's still pretty popular, probably because it appeals to human intuitions (such as anti-market bias, etc.)