Open Phil EA/LT Survey 2020: Introduction & Summary of Takeaways

post by reallyeli · 2021-08-19T01:01:19.503Z · EA · GW · 4 comments


  Methodological Notes and Limitations
  Summary of Takeaways


Between September and November 2020, Open Philanthropy ran a survey of ~200 people, most of whom were involved in or interested in longtermist work. Our aim was to get a better understanding of a) these people and b) the extent to which various community-building interventions have or haven’t been important to getting them involved and helping them have more impact. Ultimately, the goal was to make better-informed grantmaking decisions to support these people and increase their number. This project was done as part of our work in our EA community-building program area.

In contrast to the 2020 EA Survey run by Rethink Priorities, our survey was invitation-only. Invites were sent mostly to people we, or our advisors in various longtermist focus areas, knew and thought were likely to have careers with particularly high expected altruistic value (though our judgments here are very rough and we expect them to have missed a lot of such people; see more about this [EA · GW]); as a result, it was taken by about 10x fewer people than the 2020 EA Survey. It was also more in-depth: we asked a lot of questions to each respondent, including questions about their life stories and about specific organizations. And, since our goal was to get fairly up-to-date information about what interventions are affecting people in this space, we focused on people who we thought were likely to have been influenced fairly recently (if they were influenced at all).

This sequence of posts gives an overview of our survey and presents our findings. “I” throughout refers to me, Eli Rose. I did this work in my capacity as a Program Associate at Open Phil. “We” refers to me and Claire Zabel. Claire leads Open Phil’s grantmaking in this area; she project-managed this work.

We’re sharing this report only with the goal of informing others who are interested in our findings; it doesn’t include any updates on Open Phil grantmaking policy.

There are seven posts in this sequence:

You can read this whole sequence as a single Google Doc here, if you prefer having everything in one place. Feel free to comment either on the doc or on the Forum. Comments on the Forum are likely better for substantive discussion, while comments on the doc may be better for quick, localized questions.

This public version of the report doesn’t include all the data we gathered or all the analysis we did. It’s intended primarily to convey some of what we think are the most important takeaways (and how we arrived at them), rather than to transparently report our complete findings in a way that would allow readers to do their own analysis on top of them. For example, we show only a portion of our full numerical results about which organizations, pieces of content, etc. seemed to have had the most positive impact on our respondents (see here [EA(p) · GW(p)]), since for various reasons we didn’t want to publicly rank these items. We also decided to omit some results because they didn’t contribute to our main takeaways, because it seemed too difficult to share them while maintaining our respondents’ privacy, or because we thought there were other costs involved in sharing them. These results are included in a non-public version of the report. If you think having access to this non-public version might help you in your work, you can get in contact with us and we can discuss what else we might be able to share from it (though we can’t guarantee anything).

Methodological Notes and Limitations

Our survey has a time bias, such that it’s better at catching impacts which happened in some years than in others. We tried to bias towards surveying people who entered longtermist priority work fairly recently (because we care most about the current set of processes that cause this to happen), so we tend to miss impacts that happened to people who have been around for a longer time, and hence miss a disproportionate share of the impacts that happened a long time ago. On the other end, because we tried to survey people that our advisors knew pretty well, we tended to survey fewer people who got involved in longtermist priority work very recently, because these people had had less time to become well-known by our advisors. See Who We Surveyed [EA · GW] for more details on this process. Overall, though we don’t have clean data on this, I think most of the impacts recorded in this survey happened between 2015 and 2018. So keep that in mind when interpreting our results, especially our results concerning organizations or pieces of content which have had most of their impact outside that window.

We don’t do significance testing or much other statistical analysis in this report; instead, we mostly look at means and medians, make graphs, and present qualitative data. Our sense was that this approach was the best one for our dataset, where the n on any question was at maximum 217 (the number of respondents) and often lower, though we’re open to suggestions about ways to apply statistical analysis that might help us learn more. (As non-experts in statistics, we’re more enthusiastic about simple and transparent methods.)
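To make the "means and medians" approach concrete, here is a minimal sketch of a per-question summary in which n varies by question because some respondents skip it. The question names and ratings are invented for illustration; this is not the survey's data or the actual analysis code.

```python
from statistics import mean, median

# Hypothetical responses: question id -> one rating per respondent.
# None marks a respondent who skipped the question, so n differs by question.
responses = {
    "q1": [5, 3, None, 4, 7],
    "q2": [None, None, 2, 6, 6],
}


def summarize(ratings):
    """Return n, mean and median over the answered ratings only."""
    answered = [r for r in ratings if r is not None]
    if not answered:
        return {"n": 0, "mean": None, "median": None}
    return {"n": len(answered), "mean": mean(answered), "median": median(answered)}


summary = {question: summarize(ratings) for question, ratings in responses.items()}

for question, stats in summary.items():
    print(question, stats)
```

Reporting n next to each mean and median keeps the sample size visible for every individual result.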

Relatedly, this report often presents results that are based on fairly small sample sizes. The results would be more robust if sample size were larger. But the type of person we’re trying to study (quite-promising, “recent”-ish people influenced by EA/EA-adjacent ideas and doing longtermist priority work; see here [EA · GW] for more detail) is, as far as we know, rare enough in the world that there just isn’t that much data to get, and rare enough that what we got represents a substantial fraction of the total available. Also, since the average survey respondent seems to us to represent a lot of expected altruistic value, something that e.g. just 5 survey respondents say meaningfully affected them seems worth knowing about — even apart from the extent to which it suggests a larger trend.
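One way to see why sampling a substantial fraction of a small population helps is the standard finite population correction to the standard error of a mean. The sketch below is an illustration only: it assumes simple random sampling (which an invitation-only survey is not), and the population size of 400 and standard deviation of 2.0 are made-up numbers, not estimates from this survey.

```python
import math


def standard_error(sample_sd, n, population_size=None):
    """Standard error of a sample mean.

    If population_size is given, apply the finite population correction
    sqrt((N - n) / (N - 1)), which shrinks the error as the sample covers
    more of the population. Assumes simple random sampling.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = sample_sd / math.sqrt(n)
    if population_size is None:
        return se
    if n > population_size:
        raise ValueError("sample cannot be larger than the population")
    if n == population_size:
        return 0.0  # a full census has no sampling error
    return se * math.sqrt((population_size - n) / (population_size - 1))


# Hypothetical numbers: 200 respondents out of an assumed population of 400.
se_uncorrected = standard_error(sample_sd=2.0, n=200)
se_corrected = standard_error(sample_sd=2.0, n=200, population_size=400)
print(round(se_uncorrected, 3), round(se_corrected, 3))
```

Under these assumed numbers the correction cuts the standard error by roughly 30%. The correction says nothing about selection effects, which the limitations discussed here are mostly about.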

Another big limitation of our survey is that it’s primarily data about success stories. It captures very little about people who started down the path to doing longtermist priority work but then veered off of it, or did not start down the path but could have under different conditions. Consequently, we expect it to reveal less (though not zero) information about the extent to which the influences we studied are repelling people from EA/EA-adjacent ideas and communities — since people who were wholly repelled wouldn’t have ended up in our respondent pool (though people who were partially repelled might have).

Sometimes, we’ll compare our results to results from the 2020 EA Survey. This survey wasn’t yet published at the time I was writing up this report, but Rethink Priorities shared some unpublished data with us for use in comparisons. I’ll link to the public results when possible.

Summary of Takeaways

In this section, I’ve attempted to summarize what I think are the most important insights from this project. If you have limited time, this is a good section to read first.

Reasonable people who are all looking at our raw data might disagree about some of our takeaways. This section is focused on expressing what we think we’ve learned, while the rest of the report focuses more on explaining how we came to think we’ve learned it. Each takeaway links to a section in a later post which provides more elaboration about what the claim means and how we arrived at it.

Some of the takeaways mention “impact on our respondents.” When I say that something “has a lot of impact on our respondents,” I’m talking about answers our respondents gave to survey questions that asked those respondents what they think most increased their expected positive impact on the world. At a high level, X “having a lot of impact on our respondents” means that, according to these answers, the total increase in our respondents’ expected positive impact on the world due to X was high. There are some more subtleties (we asked these questions in a few different ways, which gave rise to a few different “impact metrics” or ways of determining total impact on our respondents) — see this post [EA · GW] for details.

Note that we see our results from this survey as inputs into our models about what is having an impact — not as a wholesale replacement for them. They are part of the many types of evidence we take into account when doing grantmaking.


Comments sorted by top scores.

comment by Jonas Vollmer · 2021-08-19T15:50:41.940Z · EA(p) · GW(p)

I think it's really cool that you're making this available publicly, thanks a lot for doing this!

Replies from: MarkusAnderljung
comment by MarkusAnderljung · 2021-08-19T21:53:32.700Z · EA(p) · GW(p)

Came here to say the same thing :)

Replies from: reallyeli
comment by reallyeli · 2021-08-20T03:41:09.462Z · EA(p) · GW(p)

Ah, glad this seems valuable! : )

comment by David_Moss · 2021-08-19T10:41:33.522Z · EA(p) · GW(p)

We don’t do significance testing or much other statistical analysis in this analysis... Our sense was that this approach was the best one for our dataset, where the n on any question was at maximum 217 (the number of respondents) and often lower, though we’re open to suggestions about ways to apply statistical analysis that might help us learn more. 


Because you have so much within-subjects data (i.e. multiple datapoints from the same respondent), you will actually be much better powered than you might expect with ~200 respondents. For example, if you asked each respondent to rate 10 things (and for some questions you actually asked them to rate many more) you'd have 2000 datapoints and be better able to account for individual differences.
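As a rough sketch of what "accounting for individual differences" can look like, the snippet below centers each respondent's ratings on that respondent's own mean before comparing items. The respondents, items and ratings are invented. A fuller analysis would more typically use something like a mixed-effects model, so this only shows the intuition.

```python
from statistics import mean

# Hypothetical within-subjects data: respondent -> {item: rating}.
# r1 rates everything high and r2 rates everything low, but both order
# A, B and C the same way.
ratings = {
    "r1": {"A": 7, "B": 5, "C": 6},
    "r2": {"A": 3, "B": 1, "C": 2},
    "r3": {"A": 5, "B": 4, "C": 3},
}


def center_within_respondent(table):
    """Subtract each respondent's own mean rating from their ratings."""
    centered = {}
    for respondent, items in table.items():
        baseline = mean(list(items.values()))
        centered[respondent] = {item: score - baseline for item, score in items.items()}
    return centered


def item_means(table):
    """Average each item's score across the respondents who rated it."""
    items = sorted({item for row in table.values() for item in row})
    return {
        item: mean([row[item] for row in table.values() if item in row])
        for item in items
    }


centered = center_within_respondent(ratings)
n_datapoints = sum(len(items) for items in ratings.values())

print(n_datapoints)          # respondents x items rated
print(item_means(ratings))   # raw means, mixed with each rater's baseline
print(item_means(centered))  # means relative to each rater's own baseline
```

In the raw data, item A's ratings range from 3 to 7 across respondents. After centering, every respondent puts A exactly one point above their own average, so the between-respondent noise no longer hides the agreement.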

You might, separately, be concerned about the small sample size meaning that your sample is not representative of the population you are interested in. But as you observe here [EA · GW], it looks like you actually managed to sample a very large proportion of the population you were interested in (it was just a small population).