Who do intellectual prizewinners follow on Twitter?
post by Ben_West, Aadil Kara · 2021-08-25T15:26:19.293Z · EA
In order to understand what types of content CEA’s target audience already consumes, we investigated which Twitter accounts are most followed by a proxy for that audience. We used intellectual prizewinners (IPWs) such as Rhodes Scholars as a proxy for especially promising students and young professionals. The hope is that by examining who is already successful at reaching IPWs, we can better reach them ourselves.
This document describes our results and we are also publishing the underlying data set. We hope that they are useful to others in the EA community, and that the data set itself can serve as a basis for research done by other parties. Our major findings are:
- Twitter usage is about as common amongst IPWs as in the general population, and is possibly more common amongst members of groups connected to party politics.
- About one third of IPWs follow at least one EA-adjacent influencer.
- Bill Gates is the only relatively popular EA-adjacent influencer (74 followers, making him the 8th most followed and 84th most liked). Ezra Klein (34 followers), Tim Urban (18 followers), Sam Harris (17 followers), and Max Roser (16 followers) are other somewhat popular EA-adjacent influencers.
- Amongst influencers who are more clearly connected to EA, the most popular are Julia Galef (16 followers), Eliezer Yudkowsky (12 followers), and Chris Olah (12 followers).
- The most followed and liked influencers are mostly leftist politicians (Barack Obama, AOC) and left-leaning news (New York Times, BBC).
- Some exceptions are Elon Musk, Greta Thunberg, and Nate Silver. These people have tweets which are frequently liked by IPWs (rather than just being followed), indicating meaningful engagement.
- Winners of STEM prizes (e.g. Intel Science Fair winners) follow notably different influencers than winners of non-STEM prizes (e.g. Rhodes Scholars). They still engage with some leftist politicians, but have substantial interaction with public scientists/science writers (Ed Yong, Michael Baym, Eric Topol) and scientific organizations (Nature [7th most followed influencer!], NASA, Science, SpaceX).
- The list of most popular EA-ish influencers didn’t change much between STEM and non-STEM IPWs.
CEA works to help especially promising students and young professionals (our "target audience") become highly engaged with EA. From the perspective of the EA Forum, EA Newsletter, CEA social media, and other content, we need to understand what types of communication attract/repel this target audience.
Many other organizations also support promising young people. For example, the Hertz Fellowship targets "the nation's most promising innovators in science and technology". There are numerous other intellectual scholarship and competition programs with similar audiences; in the rest of this document, I will refer to winners of these programs as "intellectual prizewinners" (IPWs). Most of the people CEA wants to support are not IPWs, but there is enough overlap to produce a useful convenience sample.
As an initial investigation step, we scraped the Twitter accounts of IPWs and identified the accounts they are most likely to follow. These are people who seem to already be good at reaching our target audience, which might indicate that we should learn from the way they communicate.
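To make the counting step concrete, here is a minimal sketch of how a "most followed" ranking could be derived from follower-followed pairs. The function name, the tuple layout, and the toy handles are all assumptions for illustration; this is not the pipeline actually used for the analysis.

```python
from collections import Counter


def most_followed(pairs, top_n=10):
    """Rank followed accounts by how many distinct sample accounts follow them.

    `pairs` is an iterable of (follower_handle, followed_handle) tuples.
    This layout is an assumption, not the published schema.
    """
    # De-duplicate so a pair that was scraped twice is only counted once.
    unique_pairs = set(pairs)
    counts = Counter(followed for _follower, followed in unique_pairs)
    # Sort by follower count (descending), then by handle so ties are stable.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical toy data, not taken from the real data set.
pairs = [
    ("ipw_a", "BarackObama"),
    ("ipw_b", "BarackObama"),
    ("ipw_c", "BarackObama"),
    ("ipw_a", "nytimes"),
    ("ipw_b", "nytimes"),
    ("ipw_c", "BillGates"),
    ("ipw_c", "BillGates"),  # duplicate row, ignored
]

ranking = most_followed(pairs, top_n=3)
print(ranking)
```

Counting distinct followers rather than raw rows keeps accidental duplicates in a scrape from inflating an account's rank.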
Paul Edward Dollete identified the accounts and Aadil Kara pulled and analyzed the Twitter and Goodreads data. This writeup is by Ben West.
Social Media Platform Choice
We originally looked at recent IMO participants in the US, UK, France, and Germany, using data from Goodreads, Twitter, Facebook, Instagram, LinkedIn, personal sites, and miscellaneous social media (e.g. Quora, Flickr, Snapchat). However, we had trouble with most of these sources:
- Facebook: Facebook profiles are usually private, so we were unable to gain much information about people from there.
- Instagram: We were unable to find many Instagram accounts. I believe that this is due to a genuine lack of Instagram use by IMO participants, but we didn’t closely investigate.
- LinkedIn: LinkedIn accounts were generally sparsely populated, and very few people shared links or followed influencers.
- Personal site: We were unable to find personal sites for most participants, and the ones we did find were frequently academic sites which only contained CV-like information.
- Other social media: none of the other social media sites were used frequently enough for us to believe it was worth investigating further.
This led us to focus on Twitter and Goodreads.
As quality control, I (Ben West) validated a randomly selected subset of the identified accounts. I found:
- Approximately one third of accounts were basically empty. This made it hard to validate that we had identified the correct account, but fortunately these accounts don’t affect the results very much (since they are empty).
- Approximately one quarter of accounts explicitly listed their prize in the account bio, or had a photo of the recipient receiving the prize, etc.
- Another quarter listed graduation dates which were in accordance with their estimated age from the prize they won, or mentioned research similar to what they won a prize for.
- The remaining 10% could only be matched on name. Again, these accounts tend to be less active, meaning that they affected our results less.
- My guess is that the results presented in this document are generally accurate, although it might be interesting to filter to only accounts we are confident in at some point in the future.
We only used data which is publicly available. We did not e.g. follow any of these users to get additional information about their accounts. The DataStudio dashboard we've made available from this research only displays aggregate information where nothing is personally identifiable.
|Total accounts identified||553|
|Number of accounts that liked at least one tweet||422|
|Total number of liked tweets||400,907|
|Number of accounts that followed at least one person||460|
|Number of follower-followed pairs||171,387|
Overall Twitter usage
|Data set||Number of IPWs||# Goodreads Identified||# Twitter Identified||% Twitter Identified|
|All except IMO||1547||65||517||33.42%|
|STEM except IMO||1098||34||273||24.86%|
The IMO data set had an especially low rate of accounts we could identify on Twitter. I suspect that this was driven by us looking further back for IMO – Twitter usage is lower amongst older Americans.
We were able to identify Twitter accounts for about 1/5 of IPWs, compared with the roughly 2/5 of Americans aged 18-29 who say they use Twitter. Excluding the IMO dataset, we were able to identify one third of IPWs. Given that many users have anonymous accounts, I would guess that this means IPWs (except IMO participants, maybe) use Twitter at about the same rate as the general population.
We were able to identify Twitter accounts for the majority of non-STEM IPWs, and as many as 86% of OUSU members had identifiable Twitter accounts. This fits with my stereotype that Twitter is unusually important for people who wish to pursue a career in party politics.
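As a worked check of the identification rates, the sketch below recomputes the percentages from the counts in the table. The non-STEM figure is not in the table. It is derived here on the assumption that the "All except IMO" row is exactly the STEM-except-IMO row plus the non-STEM data sets, so treat it as an estimate.

```python
def pct(identified, total):
    """Percentage of IPWs with an identified Twitter account, to 2 decimals."""
    return round(100 * identified / total, 2)


# Counts copied from the table above.
all_ex_imo = {"ipws": 1547, "twitter": 517}
stem_ex_imo = {"ipws": 1098, "twitter": 273}

# Assumption: non-STEM = (all except IMO) - (STEM except IMO).
non_stem = {key: all_ex_imo[key] - stem_ex_imo[key] for key in all_ex_imo}

rates = {
    "all_ex_imo": pct(all_ex_imo["twitter"], all_ex_imo["ipws"]),
    "stem_ex_imo": pct(stem_ex_imo["twitter"], stem_ex_imo["ipws"]),
    "non_stem_derived": pct(non_stem["twitter"], non_stem["ipws"]),
}
print(non_stem)
print(rates)
```

Under that assumption, 244 of 449 non-STEM IPWs were identified (about 54%), which is consistent with the "majority" claim above.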
|Data set||Number following at least one EA||Number following at least one person||Percent of accounts following at least one person that are following at least one EA|
EA has moderate penetration into IPWs – 30% of IPWs follow at least one EA-adjacent influencer. This is a bit higher than I would have guessed. Penetration does not seem to be substantially higher in the STEM data sets than in the non-STEM ones, which again surprises me.
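The penetration figure is a ratio of two account counts: accounts following at least one EA-adjacent influencer, divided by accounts following at least one person. A minimal sketch of that calculation follows. The pair layout, the function name, and the toy handles are assumptions for illustration.

```python
def penetration(pairs, ea_adjacent):
    """Share of accounts that follow anyone at all and also follow an EA-adjacent account.

    `pairs` holds (follower_handle, followed_handle) tuples (assumed layout).
    `ea_adjacent` is a collection of handles treated as EA-adjacent.
    """
    # Twitter handles are case-insensitive, so compare them in lower case.
    ea = {handle.lower() for handle in ea_adjacent}
    followers = {follower for follower, _followed in pairs}
    ea_followers = {
        follower for follower, followed in pairs if followed.lower() in ea
    }
    if not followers:
        return 0.0
    return len(ea_followers) / len(followers)


# Hypothetical toy data: four accounts, one of which follows an EA-adjacent handle.
pairs = [
    ("ipw_a", "BarackObama"),
    ("ipw_b", "nytimes"),
    ("ipw_c", "BillGates"),
    ("ipw_c", "NASA"),
    ("ipw_d", "BBC"),
]
share = penetration(pairs, ea_adjacent={"billgates"})
print(share)
```

Accounts that follow nobody never appear in `pairs`, so they are excluded from the denominator, matching the "following at least one person" column in the table header above.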
Rather than present a series of static tables, we've created a dashboard which lets users filter and manipulate the data themselves. We encourage you to look through the results and comment/post any interesting findings.
The underlying data is stored in BigQuery; if you need SQL access for more in-depth analysis, please contact me.
We are looking at the most followed accounts, which by definition are skewed towards the "lowest common denominator". It could be that the influencers IPWs mostly engage with are idiosyncratic, making the "most followed accounts" list representative of only a small fraction of their feeds.
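One way to gauge this limitation is to ask what share of all follow relationships the top N accounts cover. The summary statistics give 171,387 follower-followed pairs across 460 accounts, or roughly 373 followed accounts per IPW on average, so a short top list may cover only a small slice. The sketch below is illustrative only. The function name and toy data are assumptions.

```python
from collections import Counter


def top_n_share(pairs, n):
    """Fraction of all distinct follow pairs that point at the n most followed accounts."""
    counts = Counter(followed for _follower, followed in set(pairs))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _account, count in counts.most_common(n))
    return covered / total


# Hypothetical toy data: one shared account plus five idiosyncratic follows.
pairs = [
    ("ipw_a", "shared"), ("ipw_b", "shared"), ("ipw_c", "shared"),
    ("ipw_a", "niche_1"), ("ipw_b", "niche_2"), ("ipw_c", "niche_3"),
    ("ipw_a", "niche_4"), ("ipw_b", "niche_5"),
]
share = top_n_share(pairs, n=1)
print(share)
```

A low value for a reasonable N would suggest that the "most followed" list describes only a small part of what IPWs actually see.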
Additionally, Twitter usage may not be representative of IPW media consumption in general. I've heard anecdotal evidence that IPWs use social media less than their peers, and if this is true, the people we are measuring might be unrepresentative in some way.
There are various other confounding variables which might distort the data, some of which I’ve listed as things to explore in the “future research directions” section.
Future Research Directions
We have also collected the following datasets, which we may publish, depending on how useful people find this one:
- Goodreads profiles of IPWs
- Hashtags and URLs tweeted by IPWs
- Raw tweets from IPWs
- A comparable dataset to the one published here, but filtered only to those who have engaged with EA
My hope is that this dataset will be used by people to answer questions I haven't even thought of yet. That being said, here are some specific things I have wondered:
- How do IPWs differ from the general population? For example, maybe everyone follows Barack Obama, so the fact that IPWs follow him isn't that significant. Are there any people who are unheard of in the general population but are heavily followed by IPWs?
- Relatedly, are there influencers who have a small but heavily dedicated following amongst IPWs (e.g. as indicated by having a small number of followers but a large number of total likes)?
- To what extent is age a confounding variable? We looked further back for Putnam and IMO participants – is that confounding the results somehow? Do the ISEF 2015 winners differ in some meaningful way from the ISEF 2019 winners?
- I would be interested in people repeating this analysis, but for different definitions of “EA-adjacent influencer”. For example, what fraction of people are following an influencer who self-identifies as part of the EA community, or who has attended EA Global? What do the numbers look like if we include mainstream celebrities who sometimes talk about EA-related things (e.g. Andrew Yang)?
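The question above about small but dedicated followings could be explored with a likes-per-follower ratio. The sketch below is one possible approach, not something done in this analysis. The two input layouts, the `max_followers` cutoff, and the toy handles are all assumptions.

```python
from collections import Counter, defaultdict


def likes_per_follower(follow_pairs, like_events, max_followers=5):
    """Rank accounts with few followers in the sample by total likes per follower.

    `follow_pairs`: (follower_handle, followed_handle) tuples (assumed layout).
    `like_events`: one (liker_handle, tweet_author_handle) tuple per liked tweet
    (assumed layout).
    """
    followers = defaultdict(set)
    for follower, followed in follow_pairs:
        followers[followed].add(follower)
    likes = Counter(author for _liker, author in like_events)

    rows = []
    for account, follower_set in followers.items():
        # Keep only accounts with a small following inside the sample.
        if len(follower_set) <= max_followers:
            n_followers = len(follower_set)
            rows.append(
                (account, n_followers, likes[account], likes[account] / n_followers)
            )
    # Highest likes-per-follower first; handle breaks ties deterministically.
    return sorted(rows, key=lambda row: (-row[3], row[0]))


# Hypothetical toy data.
follow_pairs = [
    ("ipw_a", "niche"), ("ipw_a", "big"), ("ipw_b", "big"), ("ipw_c", "big"),
]
like_events = [("ipw_a", "niche")] * 4 + [("ipw_b", "big")] + [("ipw_c", "big")] * 2

result = likes_per_follower(follow_pairs, like_events, max_followers=5)
print(result)
```

Accounts that are liked but never followed are left out here, since the ratio is undefined for them. They could be reported separately.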
Possible Takeaways
- No influencer is followed by more than one third of IPWs. If we wish for EA content to come across the Twitter feeds of IPWs, we will probably need to pursue multiple channels.
- EA already has moderate penetration into IPWs. Influencers on the periphery of EA, like Ezra Klein and Max Roser, are more popular than those at the “core”. We may wish to work more with these peripheral influencers.
- IPWs may use Twitter for content which is stylistically different from most EA content, e.g. political discussions or short news stories. This may indicate that Twitter is not a good forum for EA discussion, or that we should produce EA content which better fits with Twitter style.
Appendix: EA-adjacent Influencers
The following accounts were defined as "EA-adjacent":
Elon Musk is very popular, and is arguably EA-adjacent, but his Twitter content seems divorced enough from EA priorities that he shouldn’t count.
In this and the rest of the document, I will consider “STEM” data sets to be: ISEF, Hertz, Putnam, and IMO. Non-STEM are Rhodes, Truman, OU, and OUSU.
Though note that many of these influencers are on the periphery of EA, like Bill Gates.
Though the non-STEM data set is skewed towards Oxford, which is a very EA-centric university.