I am the Principal Research Manager at Rethink Priorities working on, among other things, the EA Survey, Local Groups Survey, and a number of studies on moral psychology, focusing on animal welfare, population ethics, and moral weights.
In my academic work, I'm a Research Fellow working on a project on 'epistemic insight' (mixing philosophy, empirical study and policy work) and moral psychology studies, mostly concerned either with effective altruism or metaethics.
I've previously worked for Charity Science in a number of roles and was formerly a trustee of EA London.
Many thanks for checking in and sharing the survey! I can confirm that we're leaving the survey open until the end of the year, like last year. Please also see this post about the extra questions about FTX that have now been added, and the separate survey with questions about FTX which people can take if they have already completed the main survey.
All of the questions you had substantive comments about the wording of were external requests which we included verbatim.
Re. the order of the dates: when testing, we found some people thought this was more intuitive left to right and some top to bottom (for context, it's not specifically designed as 'grid', it just happens that the columns are similar lengths to the rows). It could be put in a single column, though not without requiring people to scroll to see the full questions, or changing the style so that it doesn't match other questions. Exactly how you see the questions will vary depending on whether you're viewing on PC, phone or tablet though.
35% of people said personal contact with EAs was important for them getting involved
38% said personal contacts had the largest influence on their personal ability to have a positive impact
I just wanted to check whether you accidentally cited the OP Survey rather than the EA Survey? These results and question wordings are identical to the 2020 EA Survey: 35.4% said "personal contact with EAs" was "important for [them] getting involved in EA", and 38.7% said "personal contact with EAs" had the largest influence on their "personal ability to have a positive impact".
It’s true that the OP survey and EAS asked many of the same questions, and it’s true that the OP survey and EAS tend to get exceptionally similar results (when you filter for highly engaged EAs), but that seems like quite the coincidence, and I don't think OP actually asked both these questions.
Fwiw, I think the OP Survey and EAS are both complementary and it's generally good to cite both. Much more could be written about the circumstances in which it makes sense to use different results of each of these surveys, since I think it is not straightforward. I'd like the survey team to do this sometime, but we lack capacity at present.
Thanks for the suggestion! We can certainly add something about this to the landing page. [And have now done so]
I would also note that this text is usually also already included where the survey is distributed. i.e. when the survey is distributed through the EA Newsletter or CEA social media, it will go out with a message like "If you think of yourself, however loosely, as an “effective altruist,” please consider taking the survey — even if you’re very new to EA! Every response helps us get a clearer picture" before people see the survey. That kind of message didn't seem so necessary on the EA Forum announcement, since this is already a relatively highly engaged audience.
We'd like to include more questions in the extra credit section, and I agree it would be useful to ask more about the topics you suggest.
Unfortunately, we don't find that adding more questions to the extra credit section is completely 'free'. Even though it's explicitly optional, we still find people sometimes complain about the length including the optional extra credit section. And there's still a tradeoff in terms of how many people complete all or part of the extra credit section. We'll continue to keep track of how many people complete the survey (and different sections of it) over time to try to optimise the number of extra questions we can include. For example, last year about 50% of respondents started the extra credit section and about 25% finished it.
Notably we do have an opt-in extra survey, sent out some time after the main EA Survey. Previously we've used this to include questions requested by EA (usually academic) researchers, whose questions we couldn't prioritise including in the main survey (even in extra credit). Because this is completely opt-in and separate from the EA Survey, we're more liberal about including more questions, though there are still length constraints. Last year about 60% (~900 people) opted in to receive this, though a smaller number actually completed the survey when it was sent out.
We've previously included questions on some of the topics which you mention, though of course not all of them are exact matches:
Moral views: We previously asked about normative moral philosophy, metaethics, and population ethics
Identification with EA label: up until 2018, we had distinct questions asking whether people could be described as "an effective altruist" and whether they "subscribe to the basic ideas behind effective altruism". Now we just have the self-report engagement scale. I agree that more about self-identification with the EA label could be interesting.
Best and worst interactions with EA: we've definitely asked about negative interactions or experiences in a number of different questions over the years. We've not asked about best interactions, but we have asked people to name which individuals (if any) have been most helpful to them on their EA journey.
Community building preferences: we've asked a few different open-ended questions about ways in which people would like to see the community improve or suggestions for how it could be improved. I agree there's more that would be interesting to do about this.
Is the paragraph below saying that surveying the general population would not provide useful information, or is it saying something like 'this would help, but would not totally address the issue'?
It's just describing limitations. In principle, you could definitely update based on representative samples of the general population, but there would still be challenges.
Notably, we have already run a large representative survey (within the US), looking at how many people have heard of EA (for unrelated reasons). It illustrates one of the simple practical limitations of using this approach to estimate the composition of the EA community, rather than just to estimate how many people in the public have heard of EA.
Even with a sample of n=6000, we still only found around 150 people who plausibly even knew what effective altruism was (and we think this might still have been an over-estimate). Of those, I'd say no more than 1-3 seemed like they might have any real engagement with EA at all. (Incidentally, this is roughly the ratio that seems plausible to me for how many people who hear of EA actually then engage with EA at all, i.e. somewhere around 50-150:1, or lower.) Note that we weren't trying to see whether people were members of the EA community in this survey, so the above estimate is just based on those who happened to mention enough specifics (like knowing about 80,000 Hours) that it seemed like they might have been at all engaged with EA. So, given that, we'd need truly enormous survey samples to sample a decent number of 'EAs' via this method, and the results would still be limited by the difficulties mentioned above.
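To make the scale concrete, here is a rough back-of-envelope sketch in Python. The 6,000 / 150 / 1-3 figures come from the survey described above; the target of 500 engaged respondents is purely illustrative, not a figure from the thread:

```python
# Back-of-envelope: how large a general-population sample would be needed
# to capture a meaningful number of engaged EAs. The first three figures
# come from the survey described above; target_eas is an arbitrary example.
n_sample = 6000        # total representative US sample
n_aware = 150          # respondents who plausibly knew what EA was
n_engaged = 2          # midpoint of the 1-3 who seemed at all engaged
p_engaged = n_engaged / n_sample

target_eas = 500       # suppose we wanted ~500 engaged respondents
required_n = target_eas / p_engaged
print(f"Engaged share: {p_engaged:.4%}; required sample: ~{required_n:,.0f}")
```

On these (assumed) numbers, one would need a representative sample of around 1.5 million people, which illustrates why this approach is impractical for estimating the composition of the community rather than simple awareness.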
Thanks for asking! We would definitely encourage community builders to share it with their groups. Indeed, in previous years, CEA has contacted group organizers directly about this. We would also encourage EAs to share it with other EAs (e.g. on their Facebook page) using the sharing link. I would not be concerned about you 'skewing the results' by sharing and encouraging people to take the survey in this way, in general, so long as you don't go to unusual lengths to encourage people (e.g. multiple reminders, offering additional incentives to complete it, etc.).
Thanks for asking! This is a question requested by another org (in 2019), so I can't give you the definitive authorial intent. But we would interpret this as including virtual personal contact too (not just in-person contact).
WAY too many of the questions only allow checking a single box, or a limited number of boxes. I'm not sure why you've done this? From my perspective it almost never seems like the right thing, and it's going to significantly reduce the accuracy of the measurements you get, at least from me.
Thanks for your comment. A lot of the questions are verbatim requests from other orgs, so I can't speak to exactly what the reasons for different designs are. Another commenter is also correct to mention the rationale of keeping the questions the same across years (some of these date back to 2014), even if the phrasing isn't what we would use now. There are also some other practical considerations, like wanting to compare results to surveys that other orgs have already used themselves.
That said, I'm happy to defend the claim that allowing respondents to select only a single option is often better than allowing people to select any number of boxes. People (i.e. research users) are often primarily interested in the _most_ important or _primary_ factors for respondents, for a given question, rather than in all factors. With a 'select all' format, one loses the information about which are the most important. Of course, ideally, one could use a format which captures information about the relative importance of each selected factor, as well as which factors are selected. For example, in previous surveys we've asked respondents to rate the degree of importance of each factor, as well as which factors they did not have a significant interaction with. But the costs here are very high, as answering one of these questions takes longer and is more cognitively demanding than answering multiple simpler questions. So, given significant practical constraints (to keep the survey short, while including as many requests as possible), we often have to use simpler, quicker question formats.
Regarding politics specifically, I would note that asking about politics on a single scale is exceptionally common (I'd also add that a single-select format for religion is very standard, e.g. in the CES). I don't see this as implying a belief that individuals believe in a single "simple political identity or political theory." The one wrinkle in our measure is that 'libertarian' is also included as a distinct category (which dates back to requests in 2014-2015 and, as mentioned above, the considerations in favour of keeping questions consistent across years are quite strong). Ideally we could definitely split this out so we have (at least) one scale, plus a distinct question which captures libertarian alignment or more fine-grained positions. But there are innumerable other questions which we'd prioritise over getting more nuanced political alignment data.
Thanks! We think about this a lot. We have previously discussed this and conducted some sensitivity testing in this dynamic document.
I wonder whether it could be useful to survey a sample from an actually random frame to just get some approximate idea of the size and demographics of the EA community.
The difficulty here is that it doesn't seem to be possible to actually randomly sample from the EA population. At best, we could randomly sample from some narrower frame (e.g. people on main newsletter mailing list, EA Forum users), but these groups are not likely to be representative of the broader community. In the earliest surveys, we actually did also report results from a random sample drawn from the main EA Facebook group. However, these days the population of the EA Facebook group seems quite clearly not representative of the broader community, so the value of replicating this seems lower.
The more general challenge is that no-one knows what a representative sample of the EA community should look like (i.e. what is the true composition of the EA population). This is in contrast to general population samples where we can weight results relative to the composition found in the US census. I think the EA Survey itself represents the closest we have to such a source of information.
That said, I don't think we are simply completely in the dark when it comes to assessing representativeness. We can test some particular concerns about potential sources of unrepresentativeness in the sample (and have done this since the first EA Survey). For example, if one is concerned that the survey samples a disproportionate number of respondents from particular sources (e.g. LessWrong), then we can assess how the samples drawn from those sources differ and how the results for the respondents drawn from those sources differ. Last year, for example, we examined how the reported importance of 80,000 Hours differed if we excluded all respondents referred to the survey from 80,000 Hours, and still found it to be very high. We can do more complex/sophisticated robustness/sensitivity checks on request. I'd encourage people to reach out if they have an interest in particular results, to see what we can do on a case by case basis.
I'd say it's pretty uncertain when we'll start publishing the main series of posts. For example, we might work on completing a large part of the series, before we start releasing individual posts, and we may use a format this year where we put more results on a general dashboard, and then include a smaller set of analyses in the main series of posts. But best guess is early in the new year.
That said, we'll be able to provide results/analyses for specific questions you might want to ask about essentially immediately after the survey closes.
Thanks for the post! I agree that this would be interesting and potentially valuable.
Notably we have included some more philosophical questions in the EA Survey in previous years. We included some questions about population ethics, which were requested by an org, in 2019, and some additional measures on some broad normative questions (e.g. "The impact of our actions on the very long-term future is the most important consideration when it comes to doing good"), which were requested by a different org, in 2020. We haven't publicised the results of those requested questions in a post, though we could do so. In this year's EA Survey, we included an even greater number of broadly philosophical questions of that kind, though, as before, they are in the especially optional 'Extra Credit' section of the survey.
I agree that surveying EAs' views about more (and more explicitly philosophical) questions would still be valuable. We haven't included more questions along these lines due to space constraints. Rethink Priorities would potentially be interested in running such a survey (we have a number of philosophers and social scientists on staff), though we think the risk of 'survey fatigue' means we should be careful before launching more surveys of the community. A survey like this could also potentially be combined with other projects that seem valuable to the community (e.g. getting a more fine-grained sense of the beliefs and cause prioritizations of highly informed EAs).
Thanks for asking. We're thinking of having a public dashboard showing the results for each month. At present, we're not thinking of posting each month's results on the Forum, but rather posting key results and intermittent updates. We think separate Forum posts every month might be unnecessary, since many of the results of the monthly tracker element of EA Pulse will make most sense in the context of multiple months having been run.
If we look at the median age at which people first got involved in EA over the last few years (split by how many years they've been involved in EA to account for differential attrition), we can see that the median age of people first getting involved in EA in 2018, 2019 and 2020 declined (from 27 to 25 and then to 24).
I think the age at which people get involved in EA seems most relevant to your question, since average age across survey years is influenced by other factors (e.g. the mean age in EAS 2020 was lower, but this is largely due to EAS 2020 having more people in the newest cohorts than in previous survey years). But let me know if there's something else in particular you want to see.
For ease of comparison, we re-plotted the EA Survey data using the same format as your plot below.
Version 1 shows the actual distribution using the same categories you used. As you can see, it shows the EA community isn't actually very heavily skewed towards 16-25 year olds. In fact, there are more 26-35 year olds.
That said, these broad categories can be misleading. If we look at the cuts in version 2, we can see that there are more 19-25 year olds than 26-30 year olds, and more of a general pyramid shape with progressively fewer people in each older bracket. So one can see from this why one might view EA as being very skewed towards the young.
On the other hand, you can also see from version 2 that there are very few people who are 18 or younger. Moreover, you can see from version 3 that there are also very few people who are 20 or younger. Indeed, there are fewer people who are 20 or younger than there are 36-45 year olds, even though the latter group is itself small (only about 16% of the community is over 35).
In short, EA is correctly thought of as a community dominated by ~21-34 year olds (over 2/3rds of the community are in this bracket), but not as very skewed towards the youngest brackets (16-25).
Age of recruitment to EA
I also think it's worth noting that the median age at which people first get involved in EA is not super young (around 24 across years, close to the cusp of your two main categories). In our survey of the general public, people aged 18-24 were also less likely to report having heard of EA than 25-44 year olds. This is compatible with student-recruitment being important for the community (even if the average age of recruitment is somewhat older than student age), and slightly pushing the age of the community in a younger direction, but the effect is small.
This is not to say that the community wouldn't benefit from more older (>35) professionals (which seems very plausible). But the problem seems broader than a student focus. Indeed, the vast majority of people in EA do not get involved due to student outreach (and plausibly, not due to any kind of direct EA outreach); yet I think we still tend to attract and retain predominantly people in their mid-20s to early 30s.
Thanks Alexander! I appreciate the offer to meet to talk about your experiences, that sounds very useful!
Who are the users for this survey, how will they be involved with the design, and how will findings be communicated with them?
We envisage the main users of the survey being EA orgs and decision-makers. We’ve already been in touch with some of the main groups and will reach out to some key ones to co-ordinate again now that we’ve formally announced. That said, we’re also keen to receive suggestions and requests from a broader set of stakeholders in the community (hence this announcement).
The exact composition of the survey, in terms of serving different users, will depend on how many priority requests we get from different groups, so we’ll be working that out over the course of the next month as different groups make requests.
Will data, materials, code and documentation from the survey be made available for replication, international adaptation, and secondary analysis?
Related to the above, we don’t know exactly how much we’ll be making public, because we don’t know how much of the survey will be part of the core public tracker vs bespoke requests from particular decision makers (which may or may not be private/confidential). That said, I’m optimistic we’ll be able to make a large amount public (or shared with relevant researchers) regarding the core tracker (e.g. for things we are reporting publicly).
Was there a particular reason to choose a monthly cycle for the survey? Do you have an end date in mind or are you hoping to continue indefinitely?
We’re essentially trialing this for 12 months, to see how useful it is and how much demand there seems to be for it, after which, if all goes well, we would be looking to continue and/or expand.
The monthly cadence is influenced by multiple considerations. One is that, ideally, we would be able to detect changes over relatively short time-scales (e.g. in response to media coverage), and part of this trial will be to identify what is feasible and useful. Another consideration is that running more surveys within the time span will allow us to include more ad hoc time sensitive requests by orgs (i.e. things they want to know within a given month, rather than things we are tracking across time). I think it’s definitely quite plausible we might switch to a different cadence later, perhaps due to resource constraints (including availability of respondents).
I would agree that more general or fundamental attitudes are unlikely to change on a monthly cadence. I think it’s more plausible to see changes on a short time-frame for some of the more specific things we’re looking at (e.g awareness of or attitude towards particular (currently) low salience issues or ideas).
The short answer is simply that the vast majority of projects requested of us are highly time sensitive (i.e. orgs want them completed on a very fast timeline), so we need to have the staff already in place if we're to take them on; it's not possible to hire staff in time to complete them, even if orgs are offering more than enough funding (e.g. 6 or 7 figures) to make it happen.
This is particularly unfortunate, since we want to grow our team to take on more of these projects, and have repeatedly turned down many highly skilled applicants who could do valuable work, exclusively due to lack of funding.
Still, I would definitely encourage people to reach out to us to see whether we have capacity for projects.
This seems very plausible to me. Personal connections repeatedly appear to be among the most important factors for promoting people's continued involvement in and increased engagement with EA (e.g. 2019, 2020).
That said, very few EAs appear to have any significant number of EAs who they would "feel comfortable reaching out to to ask for a favor" (an imperfect proxy for "friend", of course).
Anecdotally, EAs I speak to are usually surprised by how low these numbers are. (These are usually highly engaged EAs with lots of connections, who therefore likely have a very unusual experience of the EA community).
And yet these numbers are themselves almost certainly a large over-estimate for the community as a whole, since respondents are more likely to be highly engaged and to have more connections. So fewer people from groups with fewer connections are in the survey, and those who are in the survey are plausibly disproportionately likely to have personal connections.
Among our respondents, >60% of the most highly engaged EAs (e.g. EA org staff and local group leaders) have >10 connections, and >70% have 5 or more. Conversely, a majority of the least engaged half of respondents (levels 1-3) have <2 connections, with 0 being the modal response.
Of course, these responses are from 2019, so it is quite possible that the situation has changed since then.
I think (though, again, I only read it quickly) the paper includes estimates for both carbon sequestered and biomass growth for the plants.
I believe the plant which they used as the reference for that rough figure above increased in biomass by 132.5g but sequestered 56.4g carbon over several weeks, and the 0.8g carbon fixed per day comes from that latter figure.
one study found several common houseplants reduced CO2 concentration in a room by 15-20%. It seems reasonable to assume that placing several houseplants in a room would significantly increase this effect, though likely with diminishing returns.
Unfortunately, I am quite a bit less optimistic about this. (Caveat: I only looked into this very briefly)
From quickly looking at the conference paper you cite, it seemed that the plants were in 1 cubic meter chambers and reduced CO2 by ~50-100ppm for the most part (~5-12% reduction) based on table 1.
But bedrooms are generally a lot larger and contain at least one human breathing out carbon dioxide. This paper suggests that since humans breathe out about 300g per day, ~400 plants, in good conditions, would be needed to offset this. Naturally only a portion of that 300g is breathed out into one's bedroom at night, but it still seems like one might need ~100 plants for one person in a room, and one would need to ensure that these are plants which take up CO2 at night, rather than releasing it.
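Taking the cited figures at face value, the arithmetic behind those plant counts is roughly as follows (a sketch only; note that the 0.8g/day figure mentioned earlier is grams of carbon fixed, while the 300g/day is exhaled CO2, so a strict unit conversion would change these numbers somewhat):

```python
# Illustrative arithmetic only, using the rough figures cited in the thread.
co2_exhaled_g_per_day = 300     # grams a person breathes out per day (per the paper)
fixed_g_per_plant_day = 0.8     # grams fixed per plant per day (rough figure above)

# Plants needed to offset a full day of breathing (~the paper's ~400)
plants_full_day = co2_exhaled_g_per_day / fixed_g_per_plant_day

# Only roughly a third of the day (~8 hours) is spent breathing in the bedroom
plants_overnight = plants_full_day * (8 / 24)

print(round(plants_full_day), round(plants_overnight))
```

This recovers the ~400-plant and ~100-plant ballparks quoted above, and shows how sensitive the conclusion is to the per-plant fixation rate.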
we found a relatively weak correlation between what we call "expansive altruism" (willingness to give resources to others, including distant others) and "effectiveness-focus" (willingness to choose the most effective ways of helping others)
I don’t think we can infer too much from this result about this question.
The first thing to note, as observed here, is that taken at face value, a correlation of around 0.243 is decently large, both relative to other effect sizes in personality psychology and in absolute terms.
However, more broadly, measures that have been constructed in this way probably shouldn’t be used to make claims about the relationships between psychological constructs (either which constructs are associated with EA or how constructs are related to each other).
This is because the ‘expansive altruism’ and ‘effectiveness-focus’ measures were constructed, in part, by selecting items which most strongly predict your EA outcome measures (interest in EA etc.). Items selected to optimise prediction are unlikely to provide unbiased measurement (for a demonstration, see Smits et al (2018)). The items can predict well both because they are highly valid and because they introduce endogeneity, and there is no way to tell the difference just by observing predictive power.
This limits the extent to which we can conclude that psychological constructs (expansive altruism and effectiveness-focus) are associated with attitudes towards effective altruism, rather than just that the measures (“expansive altruism” and “effectiveness-focus”) are associated with effective altruism, because the items are selected to predict those measures.
So, in this case, it’s hard to tell whether the correlation between ‘expansive altruism’ and ‘effectiveness focus’ is inflated (e.g. because both measures share a correlation with effective altruism or some other construct) or attenuated (e.g. because the measures less reliably measure the constructs of interest).
Interestingly, Lucius' measure of 'impartial beneficence' from the OUS (which seems conceptually very similar to 'expansive altruism') is even more strongly correlated with 'effectiveness-focus' (at 0.39 [0.244-0.537], in a CFA model in which the two OUS factors, expansive altruism, and effectiveness-focus are allowed to correlate at the latent level). This is compatible with there being a stronger association between the relevant kind of expansive/impartial altruism and effectiveness (although the same limitations described above apply to the 'effectiveness-focus' measure).
Is this survey going to be run again? It seems there wasn't a 2021 survey? (Or at least the results aren't published yet.)
Yes, as I noted in my reply to your earlier comment, there wasn't an EA Survey run in 2021, but we are planning to run one this year. (That is assuming you're referring to the EA Survey, not the Groups Survey.)
Thanks for your suggested questions as well. Unfortunately, space is very limited in the EA Survey, and there are a lot of requests from other orgs, so it may not be possible to add any new questions.
We have previously asked some questions, at least somewhat related to the ones you suggest, i.e. we have previously asked about donation targets (all years, though limited responses), political ID (all years pre-2020), volunteering (pre-2017), and occupation/experience with software engineering/machine learning (pre-2018 and 2019 respectively).
Thanks for your comment! (Just to clarify, this is a post about our separate EA Groups survey, but I assume you're asking about the EA Survey).
The EA Survey is distributed through a variety of different channels or 'referrers' (including e-mails and social media from the main EA orgs, the EA Forum, e-mailing past survey takers, and local groups). The vast majority of responses come from a relatively small number of those referrers though (80,000 Hours, EA Forum, Local Groups, e-mail to past respondents and the EA newsletter being the main ones). You can see more detail on the composition here.
We discuss representativeness issues in more detail here (the major issue, of course, is that no-one knows the true underlying composition of the EA community, so it's hard to assess the representativeness of the sample against the true population), and provide some sensitivity checks based on the different referrers here.
Our post estimating the size of the EA community as a whole is here. We can estimate the size of the population of people who are highly engaged with EA reasonably well by comparing various benchmarks for which we do have data, and we estimate that in 2019 we sampled around 40% of that population (a little less in 2020). We estimate that we sampled much lower shares of people who are less engaged with EA, though it's also much more difficult to estimate the true size of these populations (e.g. it's much harder to know how many people in total have read a few articles about EA or listened to a few podcasts). We also have some more analyses forthcoming on estimating these different stages in the funnel, following on from our work on how many people have heard of EA here (which is even harder to estimate).
There wasn't a survey run in 2021 because the 2020 survey was run very late in the year (right at the end in fact). On average, there's historically been around 15 months between EA Surveys, rather than 12 months, which means we also skipped 2016, and it's also better not to run the EA Survey during the very end of the year (due to holidays and so on), so we thought it better to run it during the middle of the year this year.
In contrast, in 2019 we were asked to include a broad forced-choice question between longtermism and other cause areas. 40.8% selected "Long Term Future."
Of course, this doesn't neatly capture a longtermism vs neartermism distinction since only a small number of cause areas outside "Long Term Future" are mentioned, and some of the people selecting these may nevertheless count as longtermists. For example, in our analysis of the fine-grained cause areas, we find that Meta is strongly associated with Longtermism, so some of these respondents would doubtless count as longtermists. As such, overall, I'm not a big fan of the broad forced choice question, especially since it didn't line up particularly well with responses to the more fine-grained categories.
We didn't dwell on the minimum plausible number (as noted above, the main thrust of the post is that estimates should be lower than previous estimates, and I think a variety of values below around 2% are plausible).
That said, 0.1% strikes me as too low, since it implies a very low ratio between the number of people who've heard of EA and the number of moderately engaged EAs. i.e. this seems to suggest that for every ~50 people who've heard of EA (and basically understand the definition) there's 1 person who's moderately engaged with EA (in the US). That would be slightly higher with your estimate of 0.3% who've heard of the movement at all. My guess would be that the ratio is much higher, i.e. many more people who hear of EA (even among those who could give a basic definition) don't engage with EA at all, and even fewer of those really engage with EA.
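To make the implied ratio explicit, here is the rough calculation. Only the 0.1% figure comes from the discussion above; the US adult population and the count of moderately engaged US EAs are assumptions chosen purely for illustration:

```python
# Hypothetical illustration of the ratio implied by a 0.1% awareness estimate.
us_adults = 258_000_000        # rough US adult population (assumption)
share_heard = 0.001            # the 0.1% estimate under discussion
n_heard = us_adults * share_heard

n_moderately_engaged = 5_000   # assumed number of moderately engaged US EAs
ratio = n_heard / n_moderately_engaged
print(f"~{round(ratio)} people who've heard of EA per moderately engaged EA")
```

If one expects many more than ~50 people to have heard of EA for every moderately engaged EA, that pushes the plausible awareness estimate above 0.1%.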
We'll probably be going into more detail about this in a followup post.
I find the proportion of people who have heard of EA even after adjusting for controls to be extremely high. I imagine some combination of response bias and just looking up the term is causing overestimation of EA knowledge.
Just so I can better understand where and the extent to which we might disagree, what kind of numbers do you think are more realistic? We make the case ourselves in the write-up that, due to over-claiming, we would generally expect these estimates to err on the side of over-estimating those who have heard of and have a rough familiarity with EA, that one might put more weight on the more 'stringent' coding, and that one might want to revise even these numbers down due to the evidence we mention that even that category of responses seems to be associated with over-claiming, which could take the numbers down to around 2%. I think there are definitely reasonable grounds to believe the true number is lower (or higher) than 2% (and note the initial estimate itself ranged from around 2-3% if we look at the 95% HDI), but around 2% doesn't strike me as "extremely high."
For context, I think it's worth noting, as we discuss in the conclusion, that these numbers are lower than any of the previous estimates, and I think our methods of classification were generally more conservative. So I think some EAs have been operating with more optimistic numbers and would endorse more permissive classification of whether people seem likely to have heard of EA (these numbers might suggest a downward update in that context).
given that I expect EA knowledge to be extremely low in the general population, I’m not sure what the point of doing these surveys is. It seems to me you’re always fighting against various forms of survey bias that are going to dwarf any real data. Doing surveys of specific populations seems a more productive way of measuring knowledge.
I think there are a variety of different reasons, some of which we discuss in the post.
Firstly, these surveys could confirm whether awareness of EA is generally low, which as I note above isn't completely uncontroversial, and as we discuss in the post, seems to be generally suggested by these numbers whatever the picture in terms of overclaiming (i.e. the estimates at least suggest that the proportion who have heard of EA according to the stringent standard is <3%).
I think just doing surveys on "specific populations" (I assume implicitly you have in mind populations where we expect the percentage to be higher) has some limitations, although it's certainly still valuable. Notably our data drawn from the general population (but providing estimates for specific populations) seems broadly in accord with the data from specific populations (notwithstanding the fact that our estimates seem somewhat lower and more conservative). So we should do both, with both sources of data providing checks on the other.
I think this is particularly valuable given that it is very difficult to get representative samples from "specific populations." I think it's sometimes possible to approximate this and one could apply weighting for basic demographics in a college setting, for example, but this is generally more difficult / what you can do is more limited. And for other "specific populations" (where we don't have population data) this would be impossible.
I also think this applies in cases like estimating how many US students have heard of EA, where taking a large representatively weighted sample, as we did here, and getting estimates for different kinds of students, seems likely to give better estimates than just specifically sampling US students without representative weighting, as in the earlier CEA-RP brand survey we linked.
I think that getting estimates in the general population (where our priors might be that the percentages are very low), also provides valuable calibration for populations where our priors may allow that the percentages are much higher (but are likely much more uncertain). If we just look at estimates in these specific populations, where we think percentages could be much higher, it is very hard to calibrate those estimates against anything to see if they are realistic. If we think the true percentage in some specific population could be as high as 30% or could be much lower, it is hard to test whether our measures which suggest the true figure is ~20% are well-calibrated or not. However, if we've employed these or similar measures in the general population, then we can get a better sense of how the measures are performing and whether they are under-estimating or over-estimating (i.e. whether our classification is too stringent or too permissive).
I think we get this kind of calibration/confirmation when we compare our estimates to those in Caviola et al's recent survey, as we discuss in the conclusion. Since we employed quite similar measures, and found broadly similar estimates for that specific population, if you have strong views that the measures are generally over-estimating in one case, then you could update your views about the results using similar measures accordingly (and, likewise, you can generally get independent confirmation by comparing the two results and seeing they are broadly similar). Of course, that would just be informal calibration/confirmation; more work could be done to assess measurement invariance and the like.
I would also add that even if you are very sceptical about the absolute level of awareness of terms directly implied by the estimates due to general over-claiming, you may still be able to draw inferences about the awareness of effective altruism relative to other terms (and if you have a sense of the absolute prevalence of those terms, this may also inform you about the overall level of awareness of EA). For example, comparing the numbers (unscreened) claiming to have heard of different terms, we can see that effective altruism is substantially less commonly cited than 'evidence-based medicine', 'cell-based meat', 'molecular gastronomy', but more commonly cited than various other terms, which may give a sense of upper and lower bounds of the level of awareness, relative to these other terms. One could also compare estimates for some of these terms to their prevalence estimated in other studies (though these tend not to be representative) e.g. Brysbaert et al (2019), to get another reference point.
Likewise, data from the broader population seems to be necessary to assess many differences across groups (and so, more generally, what influences exposure to and interest in EA). As noted, previous surveys on specific populations found suggestive interesting associations between various variables and whether people had heard of or were interested in EA. But since these were focused on specific populations, we would expect these associations to be attenuated (or otherwise influenced) by range restriction or other limitations. If you only look at specific populations that are highly educated, high SAT, high SES, low age etc., then it's going to be very difficult to assess the influence of any of these variables. So, insofar as we are interested in these results, it seems necessary to conduct studies on broader populations, otherwise we can't get informative estimates of the influence of these different factors (which are probably, implicitly, driving choices about which specific populations we would otherwise choose to focus on).
Peter Singer seems to be higher profile than the other EAs on your list. How much of this do you think is from popular media, like The Good Place, versus from just being around for longer?
Interesting question. It does seem clear that Peter Singer is known more broadly (including among those who haven’t heard of EA, and for some reasons unrelated to EA). It also seems clear that he was a widely known public figure well before ‘The Good Place’ (it looks like he was described as “almost certainly the best-known and most widely read of all contemporary philosophers” back in 2002, as one example).
So, if the question is whether he’s more well known due to popular media (narrowly construed) like The Good Place, it seems likely the answer is ‘no.’ If the question is whether he’s more well known due to his broader, public intellectual work, in contrast to his narrowly academic work, then that seems harder to assign credit for, since much of his academic work has been extremely popular and arguably a prerequisite of the broader public intellectual work.
If the question is more whether he’s more well known than some of the other figures listed primarily because of being around longer, that seems tough to answer, since it implies speculation about how prominent some of those other figures might become with time.
I wonder if people who indicated they only heard about Peter Singer (as opposed to only hearing about MacAskill, Ord, Alexander, etc.) scored lower on ratings of understanding EA?
This is another interesting question. However, a complication here seems to be that, I think we’d generally expect people who have heard of more niche figures associated with X to be more informed about X, than people who have only heard of a very popular figure associated with X for indirect reasons (unrelated to the quality of information transmitted from those figures).
Also kinda sad EA is being absolutely crushed by taffeta.
Agreed. I had many similar experiences while designing this survey, where I conducted various searches to try to identify terms that were less well known than ‘effective altruism’ and kept finding that they were much more well known. (I remember one dispiriting example was finding that the 'Cutty Sark' seemed to be much more widely known than effective altruism).
For considering "recruitment, retention, and diversity goals" I think it may also be of interest to look at cause preferences across length of time in EA, across years. Unlike in the case of engagement, we have length of time in EA data across every year of the EA Survey, rather than just two years.
Although EAS 2017 messes up what is otherwise a beautifully clear pattern*, we can still quite clearly see that:
On average people start out (0 years) in EA favouring neartermist causes and gradually cohorts become more longtermist. (Note that this is entirely compatible with non-longtermists dropping out, rather than describing individual change: though we know many individuals do change cause prioritization, predominantly in a longtermist direction.)
Each year (going up the graph vertically) has gradually become more longtermist even among people who have only been EA 0 years. Of course, this could partly be explained by non-longtermists dropping out within their first year of hearing about EA, but it could also reflect EA recruiting progressively more longtermist people.
We can also descriptively see that the jump between 2015 and 2018-2020 is quite dramatic. In 2015 all cohorts of EA (however long they'd been in EA) were strongly neartermist leaning. By 2018-2020, even people who'd just joined EA were dramatically more favourable to longtermism. And by 2020, even people who had been in EA a couple of years were on average roughly equally longtermist/neartermist leaning and beginning to be on average longtermist leaning.
* EAS 2017 still broadly shows the same pattern until the oldest cohorts (those who have been in EA the longest, which have a very low sample size). In addition, as Appendix 1 shows, while EAS 2018-2020 have very similar questions, EAS 2015-2017 included quite different options in their questions.
I've included a plot excluding EAS 2017 below, just so people can get a clearer look at the most recent years, which are more comparable to each other.
Fwiw, my intuition is that EA hasn't been selecting against, e.g. good epistemic traits historically, since I think that the current community has quite good epistemics by the standards of the world at large (including the demographics EA draws on).
I think it could be the case that EA itself selects strongly for good epistemics (people who are going to be interested in effective altruism have much higher epistemic standards than the world at large, even matched for demographics), and that this explains most of the gap you observe, but also that some actions/policies by EAs still select against good epistemic traits (albeit in a smaller way).
I think these latter selection effects, to the extent they occur at all, may happen despite (or, in some cases, because of) EA's strong interest in good epistemics. e.g. EAs care about good epistemics, but the criteria they use to select for good epistemics are, in practice, whether the person expresses positions/arguments they believe are good ones, and this functionally selects more for deference than for good epistemics.
Do you have data on the trends over time? I’m interested to know if the three attributes are getting closer together or further apart at both ends of the engagement spectrum.
We only have a little data on the interaction between engagement and cause preference over time, because we only had those engagement measures in the EA Survey in 2019 and 2020. We were also asked to change some of the cause categories in 2020 (see Appendix 1), so comparisons across the years are not exact.
Still, just looking at differences between those two years, we see the patterns are broadly similar. True to your prediction, longtermism is slightly higher among the less engaged in 2020 than 2019, although the overall interaction between engagement, cause prioritisation and year is not significant (p=0.098). (Of course, differences between 2019 and 2020 need not be explained by particular EAs changing their views (they could be explained by non-longtermists dropping out, or EAs with different cause preferences being more/less likely to become more engaged between 2019 and 2020)).
EA is less focused on longtermism than people might think based on elite messaging. IIRC this is affirmed by past community surveys
This is somewhat less true when one looks at the results across engagement levels. Among the less engaged ~50% of EAs (levels 1-3), neartermist causes are much more popular than longtermism. For level 4/5 engagement EAs, the average ratings of neartermist, longtermist and meta causes are roughly similar, though with neartermism a bit lower. And among the most highly engaged EAs, longtermist and meta causes are dramatically more popular than neartermist causes.
Descriptively, this adds something to the picture described here (based on analyses we provided), which is that although the most engaged level 5 EAs are strongly longtermist on average, the still highly engaged level 4s are more mixed. (Level 5 corresponds to roughly EA org staff and group leaders, while level 4 is people who've "engaged extensively with effective altruism content (e.g. attending an EA Global conference, applying for career coaching, or organizing an EA meetup)".)
One thing that does bear emphasising is that even among the most highly engaged EAs, neartermist causes do not become outright unpopular in absolute terms. On average they are rated around the midpoint as "deserv[ing] significant resources." I agree this may not (seem to be) reflected in elite recommendations about what people should support on the margins though.
Back when LEAN was a thing we had a model of the value of local groups based on the estimated # of counterfactual actively engaged EAs, GWWC pledges and career changes, taking their value from 80,000 Hour $ valuations of career changes of different levels.
The numbers would all be very out of date now though, and the EA Groups Surveys post 2017 didn't gather the data that would allow this to be estimated.
I think we would have had the capacity to do difference-in-difference analyses (or even simpler analyses of pre-post differences in groups with or without community building grants, full-time organisers etc.) if the outcome measures tracked in the EA Groups Survey were not changed across iterations and, especially, if we had run the EA Groups Survey more frequently (data has only been collected 3 times since 2017 and was not collected before we ran the first such survey in that year).
One other thing I'd flag is that, although I think it's very plausible that there is a cross-over interaction effect (such that people who are predisposed to be positively inclined to EA prefer the "Effective Altruism" name and people who are not so predisposed prefer the "Positive Impact" name), the data you mention doesn't necessarily suggest that.
i.e. (although I may be mistaken) it broadly sounds like you asked people beforehand (many of whom liked PISE) and you later asked a different set of people who already had at least some exposure to effective altruism (who preferred EAE). But I would expect people who've been exposed to effective altruism (even a bit) to become more inclined to prefer the name with "effective altruism" in it. So what we'd want to do is expose a set of people (with no exposure to EA) to the names and observe differences in those who are more or less positively pre-disposed to EA (or even track them to see whether they, in fact, go on to engage with EA long term).
Taking the question literally, searching the term ‘social justice’ in EA forum reveals only 12 mentions, six within blog posts, and six comments...
I worry EA is another exclusive, powerful, elite community, which has somehow neglected diversity.
I think it's worth distinguishing discussions of "social justice" from discussions of "diversity." Diversity in EA has been much discussed, and there is also a whole facebook group dedicated to it. There has been less discussion of "social justice" in those terms, partly, I suspect, because it's not natural for utilitarians to describe things in terms of "justice", and partly because, as mentioned, the phrase "social justice" has acquired specific connotations. However, there has been extensive discussion of broader social justice related critiques of EA, largely under the banner of "systemic change."
Note: we also track demographic diversity in the ~ annual EA Survey.
It seems like this would be relatively easy to test with an online experiment using a student-only sample.
This would have the advantage that we could test the effect of the different names without experimenting with an actual EA group by changing its name. On the other hand, this might miss any factors particular to that specific group of students (if there are any such factors), though it would be possible with the larger sample size that this would allow to examine the effects of different characteristics of the students or the university they attend. This would also allow us to test multiple additional names at the same time.
There are two broad reasons why I would prefer the ACSI items (considered individually) over the NPS (style) item:
The ACSI items are (mostly) more face valid
The ACSI items generally performed better than the NPS when we ran both of these in the EAS 2020
This depends on what you are trying to measure, so I’ll start with the context in the EAS, where (as I understand it) we are trying to measure general satisfaction with or evaluation of the EA community.
Here, I think the ACSI items we used (“How well does the EA community compare to your ideal? [(1) Not very close to the ideal - (10) Very close to the ideal]” and “What is your overall satisfaction with the EA community? [(1) Very dissatisfied - (10) Very satisfied]”) more closely and cleanly reflect the construct of interest.
In contrast, I think the NPS style item (“If you had a friend who you thought would agree with the core principles of EA, how excited would you be to introduce them to the EA community?”) does not very clearly or cleanly reflect general satisfaction. Rather, we should expect it to be confounded with:
Attitudes about introducing people to the EA community (different people have different views about how positive growing the EA community more broadly is)
Perceived/projected personal “excitement” (related to one’s (perceived) emotionality, excitability etc.)
Sociability/extraversion/interest in introducing friends to things in general, as well as one’s own level of social engagement with EA (if one is socially embedded in EA, introducing friends might make more sense than if you are very pro EA, but your interaction with it is entirely non-social)
I think some of these issues are due to the general inferiority of the NPS as a measure of what it’s supposed to be measuring:
And some of them are due to the peculiarities of the context where we’re using NPS (generally used to measure satisfaction with a consumer product) to measure attitudes towards a social movement one is a part of (hence the need to add the caveat about “a friend who you thought would agree with the core principles of EA”).
Some of the other contexts where you’re using NPS might differ. Likelihood to recommend may make more sense when you’re trying to measure evaluations of an event someone attended. But note that the ‘NPS’ question may simply be measuring qualitatively different things when used in these different contexts, despite the same instrument being presented. i.e. asking about recommending the EA community as a whole elicits judgments about whether it’s good to recommend EA to people (does spreading EA seem impactful or harmful etc?), whereas asking about recommending an event someone attended mostly just reflects positive evaluation of the event. Still, I slightly prefer a simple ACSI satisfaction measure over NPS style items, since I think it will be clearer, as well as more consistent across contexts.
Performance of measures
Since we included both the NPS item and two ACSI items in EAS 2020 we can say a little about how they performed, although with only 1-2 items and not much to compare them to, there’s not a huge amount we can do to evaluate them.
Still, the general impression I got from the performance of the items last year confirms my view that the two ACSI measures cohere as a clean measure of satisfaction, while NPS and the other items are more of a mess. As noted, we see that the two ACSI measures are closely correlated with each other (presumably measuring satisfaction), while the NPS measure is moderately correlated with the ‘bespoke’ measures (e.g. “I feel that I am part of the EA community”) which seem to be (noisily) measuring engagement more than satisfaction or positive evaluation. I think it’s ultimately unclear what any of those three items are measuring since they’re all just imperfectly correlated with each other, engagement and with satisfaction, so I think they are measuring a mix of things, some of which are unknown. Theoretically, one could simply run a larger suite of items, designed to measure satisfaction, engagement, and other things which we think might be related (such as what the bespoke measures are intended to measure) and tease out what the measures are tracking. But there’s not a huge amount we can do with just 5-6 items and 2-3 apparent factors they are measuring.
Benefits of multiple measures
As an aside, we put together some illustrations of the possible concrete benefits of using a composite measure of multiple items, rather than a single measure.
The plot below shows the error (differences between the measured value and the true value: higher values, in absolute terms, are worse) with a single item vs an average made from two or three items. Naturally, this depends on assumptions about how noisy each item is and how correlated each of the items are, but it is generally the case that using multiple items helps to reduce error and ensure that estimates come closer to the true value.
This next image shows the power to detect a correlation of around r = 0.3 using 1, 2 or 3 items. The composite of more items should have lower measurement error. When only a single item is used, the higher measurement error means that a true relationship between the measured variable and another variable of interest can be harder to detect. With the average of 2 or 3 items, the measure is less noisy, and so the same underlying effect can be detected more easily (i.e., with fewer participants). (The three different images just show different standards for significance.)
Cool! Glad to see this, I've been harping on about the NPS for some time (1, 2, 3, 4).
We usually do this because we don’t want to take people’s time up by asking three questions. I haven’t done a very rigorous analysis of the trade-offs here though, and it could be that we are making a mistake and should use ACSI instead.
As you may have considered, you could ask just one of the ACSI items, rather than asking the one NPS item. This would have lower reliability than asking all three ACSI items, but I suspect that one ACSI item would have higher validity than the one NPS item. (This is particularly the case when trying to elicit general satisfaction with the EA community, but maybe less so if you literally want to know whether people are likely to recommend an event to their friends).
The added value of using three items to generate a composite measure is potentially pretty straightforward to estimate, esp if you have prior data with the items. Happy to talk more about this.
I think this post mostly stands up and seems to have been used a fair amount.
Understanding roughly how large the EA community is seems moderately important, so I think this analysis falls into the category of 'relatively simple things that are useful to the EA community but which were nevertheless neglected for a long while'.
One thing that I would do differently if I were writing this post again, is that I think I was under-confident about the plausible sampling rates, based on the benchmarks that we took from the community. I think I was understandably uneasy, the first time we did this, basing estimates of sampling rates on the handful of points of comparison (EA Forum members, the EA Groups survey total membership, specific local groups, and an informal survey in the CEA offices), so I set pretty wide confidence intervals in my guesstimate model. But, with hindsight, I think this assigns too much weight to the possibility that the broader population of highly engaged EAs were taking the EA Survey at a higher rate than members of all of these specific highly engaged groups. As a result, the overall estimates are probably a bit too uncertain and, in particular, the smaller estimates of the size of the community are probably less likely.
One of the more exciting developments following this post is that, now that we have more than one year of data, we can use this method to estimate growth in the EA community (as discussed here and in the thread below). This method has since been used, for example, here and here. Estimating the growth of the EA community may be more important than estimating the size of the EA community, so this is a neat development. I put a guesstimate model for estimating growth here, which suggests around 14% growth in the number of highly engaged EAs (the number of less engaged EAs is much less certain). For simplicity of comparison, I left confidence intervals as wide as they were in 2019, even though, as discussed, I think this suggests implausible levels of uncertainty about the estimates.
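The core logic of the size-and-growth estimation can be sketched simply (the actual guesstimate model works with distributions over sampling rates rather than point values; the respondent counts and the 40% sampling rate below are invented for illustration):

```python
# Hypothetical point-estimate sketch of the sampling-rate method.
# Real model: distributions over sampling rates, not single numbers.

def estimated_size(respondents: int, sampling_rate: float) -> float:
    """Estimated population size, given how many members of that population
    took the survey and the assumed fraction of the population that takes it."""
    return respondents / sampling_rate

def growth(size_prev: float, size_now: float) -> float:
    """Proportional growth between two years' size estimates."""
    return size_now / size_prev - 1

# Invented inputs: 1,000 highly engaged respondents at an assumed 40%
# sampling rate in year 1; 1,140 respondents at the same rate in year 2.
year1 = estimated_size(1_000, 0.40)   # 2,500
year2 = estimated_size(1_140, 0.40)   # 2,850
print(f"{growth(year1, year2):.0%}")  # 14% growth under these assumptions
```

Note that if the sampling rate is assumed constant across years, it cancels out of the growth calculation entirely, which is part of why growth can be estimated more robustly than absolute size.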
It seems plausible that we should assign weight to what past generations valued (though one would likely not use survey methodology to do this), as well as what future generations will value, insofar as that is knowable.