This is a question post.
And for each question, I’m interested in how answers would vary depending on whether we’re talking about orgs or individuals, whether the orgs/individuals are explicitly EA or not, whether they’re in academia or not, and what their theory of change is.
Note that I’m focused on organisations or individuals whose main output is research or similar things (e.g., summaries or podcast episodes focusing on ideas from researchers). I think some other types of orgs (e.g., SHIC, EA London) fairly often run surveys on the quality and/or impact of their work, but that’s not my focus here.
answer by MichaelA
Overall, my independent impression is that more/most/all EA research orgs should run such surveys, and that it might be worth individual EA researchers experimenting with doing so as well. I also expect I’ll run another survey of this kind next year. This is all partly due to the potential upsides, but also partly about the seemingly low costs, as I think these surveys should usually take little time to run and reflect on. More details and caveats follow.
Indications of other people’s views
Rethink’s post about their survey says:
Two of our founding values at Rethink Priorities are transparency and impact assessment. Here we present the results of our stated intention to annually run a formal survey to discover if one of our target audiences, decision-makers and donors in the areas which we investigate, has read our work and if it has influenced their decision-making. Due to the small sample of 47 respondents, the disproportionate importance of some of these respondents, and the ability to highlight comments from only those who opted to share their responses publicly, the precise results should not be taken too seriously. It is also very important to note that this is just one of the many ways we are assessing our impact. Nevertheless, we will present the overall results and the results by cause area.
Max Daniel commented on that post:
Thanks for posting this! I'd really like to see more organizations evaluate their impact, and publish about their analysis.
(Though note that impact could also be evaluated without a survey of this kind.)
On the first page of their currently running annual impact survey, 80,000 Hours writes:
Your survey responses are extremely useful to us.
They help us understand whether we're doing any good, and if so what part of our work is actually helping you.
That means we can focus on doing more of the things that are having an impact, and deprioritise those that aren't valuable, or are even doing harm.
My views before running my survey
My views before running my survey were similar to my current view, though even more tentative and even less fleshed out. My basic reasoning was as follows:
First, I think that getting clear feedback on how well one is doing, and how much one is progressing, tends to be somewhat hard in general, but especially when it comes to:
- Actually improving the world compared to the counterfactual
  - Rather than, e.g., getting students’ test scores up, meeting an organisation’s KPIs, or publishing a certain number of papers
(I also think this applies especially to relatively big-picture/abstract research, rather than applied research, and to longtermism. This was relevant to my case, but isn’t central to the following points.)
Second, I think some of the best metrics by which to judge research are whether people:
- are bothering to pay attention to it
- think it’s interesting
- think it’s high-quality/rigorous/well-reasoned
- think it addresses important topics
- think it provides important insights
- think they’ve actually changed their beliefs, decisions, or plans based on that research
I think this data is most useful if these people have relevant expertise, are in positions to make especially relevant and important decisions, etc. But anyone can at least provide input on things like how well-written or well-reasoned some work seems to have been. And whoever the respondents are, whether the research influenced them probably provides at least weak evidence regarding whether the research influenced some other set of people (or whether it could, if that set of people were to read it).
Third, impact surveys are one way to gather data on these metrics. Such surveys aren’t the only way to do that, and these metrics aren’t the only ones that matter. But I expect it will usually be useful to gather more data than people would by default, and to gather data from a more diverse set of sources (each with their own, different limitations).
Fourth, a lot of the data I’d gotten was from people actively reaching out to me, unprompted and non-anonymously. I expect this data to be biased towards positive feedback, because:
- people who like my work are more likely to reach out to me
- a lack of anonymity may bias people towards being friendly / avoiding being “rude” / avoiding hurting my feelings.
Surveys face similar sampling and response biases, but perhaps to a smaller extent, because:
- people are at least prompted to participate (though they still choose at that point to opt in or out)
- respondents are anonymous.
With my survey in particular, I wanted to get additional inputs into my thinking about:
- whether EA-aligned research and/or writing is my comparative advantage (as I’m also actively considering a range of alternative pathways)
- which topics, methodologies, etc. within research and/or writing are my comparative advantage
- specific things I could improve about my research and/or writing (e.g., topic choice, how rigorous vs rapid-fire my approach should be, how concise I should be)
How I’ve updated my views based on further thought
Potential relevance of one’s theory of change
I’d guess that key components of Rethink Priorities and 80,000 Hours’ theories of change involve relatively directly influencing key decisions that are (a) made by people outside their organisations, and (b) not just about further research. This could include things like career and donation decisions.
There may be many research orgs for which that is not the case. For example, some orgs may view their research as being almost entirely intended to lay the groundwork for further research done by themselves or by others. (I expect Rethink and 80k also have this as one goal for their research, but that for them this goal isn’t as dominant.) Orgs in this category may include MIRI and most academic institutes (including GPI and FHI).
If this is true, it might mean that orgs like Rethink and 80k have an unusually large and hard-to-pin-down key audience for their work. Perhaps many other orgs can be satisfied simply with things like:
- seeing how many citations their papers get, and how the citing papers build on their work
- getting a sense of their reputation among the other few dozen relevant researchers at a conference for their field
Potential relevance of how diverse one’s topic choices are
Similarly, if an org/individual writes only on a handful of relatively narrow areas, it may be easier for them to identify a small set of particularly relevant people and get their feedback without running a survey. In contrast, if an org/individual’s writings span many areas, it may be more valuable to publicly post a survey in fora where their writings are read.
Potential relevance of one’s speed of output
Perhaps if an org/individual produces something like 1-5 outputs per year, it makes sense to just solicit input on individual pieces. In contrast, a larger amount of output per unit of time might increase the value of a survey gathering views on all of this output, including on which pieces were most widely read, seemed most useful, etc.
Potential reputational relevance of “EA vs not” and “academic vs not”
Perhaps outside of EA, and perhaps in academia, running this sort of survey would seem weird and somehow tarnish one’s reputation? (I have no real reason for believing this; it just seems plausible.)
I think all of those points except the last one would merely somewhat reduce how useful surveys are, rather than making them useless. It still seems to me that surveys would often provide relevant and useful data, and data with different limitations to data from other sources (which is useful because it means findings that come up consistently are more likely to accurately reflect reality).
And I think surveys could be made, promoted, analysed, and reflected on:
- in 2 hours if one really wants to go fast
- in fewer than 10 hours in most cases
  - Exceptions would be cases where one constructs a particularly large survey, gets text responses from a particularly large sample, or wants to reflect particularly rigorously/extensively on the results. E.g., I expect the process for 80,000 Hours’ impact survey this year will end up having taken substantially more than 10 hours.
So it seems to me that the expected value of a research org running such a survey will tend to outweigh the costs, given that it:
- will probably usually provide at least slightly useful info
- will probably have a nontrivial chance of providing very useful info
- will probably not take much time
I’m unsure if this is true for individual researchers, as they’ll tend to have less output and write on a smaller set of topics. But I do think it was worthwhile for me to run my survey. (Though note that I’ve written unusually many outputs and written on many different areas. This is in turn partly because I’ve been writing posts rather than papers.)
How I’ve updated my views based on the survey
I spent ~1 hour 10 minutes creating my survey; writing a post and comment to promote it and explain my rationale behind it; publishing that to the EA Forum and LessWrong; and later also publishing shortform comments promoting the survey. Replicating these steps next year would likely take closer to 20 minutes.
I spent ~2 hours analysing and reflecting on the results. I expect I could do this about twice as fast next year, though I may also get more responses, which would cause the reflection process to take longer.
I spent around 5 hours writing up my reflections publicly, as well as this post and comment. I think I was inefficient in how I did this. But in any case, other orgs/researchers could skip this step, or do a much smaller version of it. (The main reasons I did this step the way I did were that I’m interested in the question of whether and how others should run similar surveys, and because I think seeing my data, reflections, and thoughts might be useful for others.)
I think I benefitted noticeably, but not incredibly much, from the survey data. For details, see my reflections.
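As a rough tally of the time costs reported above, here is a minimal sketch. The dictionary keys and the helper name are illustrative, the figures are the approximate ones given in the text, and the repeat-run analysis time assumes a similar number of responses.

```python
# Approximate time costs reported above, in minutes.
first_run = {
    "create_and_promote": 70,    # ~1 hour 10 minutes
    "analyse_and_reflect": 120,  # ~2 hours
    "public_write_up": 300,      # ~5 hours; an optional step
}

# Rough estimates for a repeat run, per the text: ~20 minutes to create and
# promote, and analysis about twice as fast (assuming similar response numbers).
repeat_run = {
    "create_and_promote": 20,
    "analyse_and_reflect": first_run["analyse_and_reflect"] // 2,
}


def total_minutes(costs):
    """Sum a mapping of step name -> minutes."""
    return sum(costs.values())


first_total = total_minutes(first_run)
first_without_write_up = first_total - first_run["public_write_up"]
repeat_total = total_minutes(repeat_run)

for label, minutes in [
    ("First run, including write-up", first_total),
    ("First run, without write-up", first_without_write_up),
    ("Repeat run, without write-up", repeat_total),
]:
    print(f"{label}: {minutes // 60} h {minutes % 60} min")
```

On these figures, the first run came to roughly 8 hours including the optional write-up, or a little over 3 hours without it, which is consistent with the "fewer than 10 hours in most cases" estimate.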
↑ comment by HowieL ·
2020-09-09T16:42:43.899Z · EA(p) · GW(p)
[Not meant to express an overall view.] I don't think you mention the time of the respondents as a cost of these surveys, but I think it can be one of the main costs. There's also risk of survey fatigue if EA researchers all double down on surveys.
↑ comment by MichaelA ·
2020-09-09T19:18:37.514Z · EA(p) · GW(p)
Strong upvote for two good points that, in retrospect, I feel should've been obvious to me!
In light of those points as well as what I mentioned above, my new, quickly adjusted bottom-line view would be that:
- People considering running these surveys should take into account the cost and the risk you mention.
- I probably still think most EA research organisations should run such a survey at least once.
  - In many cases, it may make the most sense to send it only to some particular group of people, or to post it somewhere more targeted at the org's audience than the EA Forum as a whole. This would somewhat reduce the risk of survey fatigue, since not all of these surveys would be publicised to basically all EAs.
  - In many cases, it may make sense for the survey to be even shorter than mine.
  - In many cases, it may make sense to run the survey only once, rather than something like annually.
- Probably no/very few individual researchers working at organisations that are themselves running surveys should run their own, relatively publicly advertised individual surveys (even if it's at a different time to the org's survey).
  - This is because those individuals' surveys would probably provide relatively little marginal value, while still having roughly the same time costs and survey fatigue risk.
  - But maybe this doesn't hold if the org only does a survey once, and the researcher is considering running a survey more than a year later.
  - And maybe it doesn't hold for surveys sent out in a more targeted manner.
- Even among individual researchers who work independently, or whose org isn't running surveys, probably relatively few should run their own, relatively publicly advertised individual surveys.
  - The exceptions may tend to be those who wrote a large number of outputs, on a wide range of topics, for relatively broad audiences. (For the reasons alluded to in my parent comment.)
I could definitely imagine shifting my views on this again, though.
↑ comment by HowieL ·
2020-09-11T14:38:43.074Z · EA(p) · GW(p)
This all seems reasonable to me though I haven't thought much about my overall take.
I think the details matter a lot for "Even among individual researchers who work independently, or whose org isn't running surveys, probably relatively few should run their own, relatively publicly advertised individual surveys"
A lot of people might get a lot of the value from a fairly small number of responses, which would minimise costs and negative externalities. I even think it's often possible to close a survey after a certain number of responses.
A counterargument is that the people who respond earliest might be unrepresentative. But for a lot of purposes, it's not obvious to me you need a representative sample. "Among the people who are making the most use of my research, how is it useful" can be pretty informative on its own.
↑ comment by MichaelA ·
2020-09-11T17:52:36.984Z · EA(p) · GW(p)
A lot of people might get a lot of the value from a fairly small number of responses, which would minimise costs and negative externalities.
This sort of thing is part of why I wrote "relatively publicly advertised", and added "And maybe it doesn't hold for surveys sent out in a more targeted manner." But good point that someone could run a relatively publicly advertised survey and then just close it after a small-ish number of responses; I hadn't considered that option.