Posts

Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? 2020-04-28T10:10:30.266Z · score: 39 (12 votes)
Rethink Grants: an evaluation of Donational’s Corporate Ambassador Program 2019-07-23T23:53:20.274Z · score: 54 (19 votes)

Comments

Comment by derek on A counterfactual QALY for USD 2.60–28.94? · 2020-09-10T12:30:53.982Z · score: 5 (5 votes) · EA · GW

I'm sure there are many giving opportunities in global health that are better than the GiveWell top charities, and I'm pleased to see promising small or medium-sized projects like this being brought to the attention of EAs. 

However, I think you should try to get better estimates of QALYs gained (or DALYs averted)—especially if you're going to feature the cost-effectiveness ratio so prominently in your write-up. This should be possible by referring to the relevant literature. The current estimates don't seem all that plausible to me, e.g. an episode of "simple malaria" (by which you presumably mean there are no other complications like anaemia) tends to last a few weeks or less, so even if it could be immediately cured at the beginning, it wouldn't reach your lower estimate of 0.1 QALYs, let alone the upper of 5 QALYs. For life-threatening conditions, I don't think you should have the theoretical maximum of "save all lives" as the upper estimate, as that wouldn't happen in any context, and certainly not this one. If you must rely on your intuitive guesstimates, perhaps you should use 90% or 95% credible intervals.

Good luck with the project!

 

Comment by derek on EA Forum feature suggestion thread · 2020-07-04T20:51:33.300Z · score: 3 (2 votes) · EA · GW

"Next" and "Previous" arrows/buttons at the bottom of a post, to move to the next/previous post - useful when you haven't read the forum for a while and want to catch up. This would obviously have to assume a certain ordering (e.g. chronological vs karma) and selection (e.g. all or excluding Community/Questions), which could perhaps be adjusted in Settings.

Comment by derek on EA Forum feature suggestion thread · 2020-06-17T22:04:42.759Z · score: 26 (13 votes) · EA · GW

Level 3 headings should be supported. Unless it's changed recently, it currently jumps from Level 2 to Level 4, which makes it hard to logically format complex documents.

Comment by derek on Market-shaping approaches to accelerate COVID-19 response: a role for option-based guarantees? · 2020-04-29T18:46:50.986Z · score: 2 (2 votes) · EA · GW

Thanks for the comments!

1. The put could cover ~90% of the cost of the accelerated production, taking into account the additional costs.

2. Sales are likely to be higher if they move more quickly: the company with the first billion vaccines is likely to sell a lot more items than the company with the second, and this could more than offset any additional costs. (The second may not sell any, even if it’s a good product, if the first can meet all needs quickly enough.)

3. Some variants outlined in the brief, such as declining payouts, can further incentivise haste.

4. I’ve nothing against academic/PPP efforts, especially if they are under existing arrangements (since they normally take ages to negotiate), and put options will not always be the best approach. But in the current situation we need as many teams on this as we can get, and options-based guarantees may help generate new ideas or get existing ones to market more quickly.

Comment by derek on What posts do you want someone to write? · 2020-04-03T16:33:26.962Z · score: 5 (3 votes) · EA · GW

Should Covid-19 be a priority for EAs?

A scale-neglectedness-tractability assessment, or even a full cost-effectiveness analysis, of Covid as a cause area (compared to other EA causes) could be useful. I'm starting to look into this now – please let me know if it's already been done.

Comment by derek on What posts do you want someone to write? · 2020-04-03T16:30:39.643Z · score: 3 (2 votes) · EA · GW

"The longtermist case for animal welfare"

Have you seen this? https://forum.effectivealtruism.org/posts/W5AGTHm4pTd6TeEP3/should-longtermists-mostly-think-about-animals

Comment by derek on Coronavirus: how much is a life worth? · 2020-03-27T00:10:11.829Z · score: 3 (2 votes) · EA · GW

Suicide is a very poor indicator of the dead/neutral point, for a host of reasons.

A few small, preliminary surveys I've seen place it around 2/10, though it ranges from about 0.5 to 6 depending on whom and how you ask.

(I share your concerns in parentheses, and am doing some work along these lines - it's been sidelined in part due to covid projects.)

Comment by derek on What posts you are planning on writing? · 2020-03-26T18:39:32.875Z · score: 4 (3 votes) · EA · GW

Hah! I was working on them before getting sidelined with covid stuff.

I can send you the drafts if you send me a PM. The content is >80% done (I've decided to add more, so the % complete has dropped) but they need reorganising into ~10 manageable posts rather than 3 massive ones.

Comment by derek on Founders Pledge Charity Recommendation: Action for Happiness · 2020-03-19T22:00:33.464Z · score: 2 (2 votes) · EA · GW

Thanks Aidan! Hope you're feeling better now.

Most of your comments sound about right.

On retention rates: Your general methods seem to make sense, since one would expect gradual tapering off of benefits, but your inputs seem even more optimistic than I originally thought.

I'm not sure Strong Minds is a great benchmark for retention rates, partly because of the stark differences in context (rural Uganda vs UK cities), and partly because IIRC there were a number of issues with SM's study, e.g. a non-randomised allocation and evidence of social desirability bias in outcome measurement, plus of course general concerns related to the fact it was a non-peer-reviewed self-evaluation. Perhaps retention rates of effects from UK psychotherapy courses of similar duration/intensity would be more relevant? But I haven't looked at the SM study for about a year, and I haven't looked into other potential benchmarks, so perhaps yours was a sensible choice.

Also not a great benchmark in a UK context, but Haushofer and colleagues recently did a study* of Problem Management+ in Uganda that found no benefits at the end of a year (paper forthcoming), even though it showed effectiveness at the 3 month mark in a previous study in Kenya.

*Haushofer, J., Mudida, R., & Shapiro, J. (2019). The Comparative Impact of Cash Transfers and Psychotherapy on Psychological and Economic Well-being. Working Paper. Available upon request.

Comment by derek on AMA: Elie Hassenfeld, co-founder and CEO of GiveWell · 2020-03-19T15:43:02.106Z · score: 3 (2 votes) · EA · GW

Do you think GiveWell top charities are the best of all current giving opportunities? If so, what is the next best opportunity?

Comment by derek on AMA: Elie Hassenfeld, co-founder and CEO of GiveWell · 2020-03-19T13:01:06.802Z · score: 15 (5 votes) · EA · GW

Do you think adopting subjective wellbeing as your primary focus would materially affect your recommendations?

In particular:

(a) Would using SWB as the primary outcome measure in your cost-effectiveness analysis change the rank ordering of your current top charities in terms of estimated cost-effectiveness?

(b) If it did, would that affect the ranking of your recommendations?

(c) Would it likely cause any of your current top charities to no longer be recommended?

(d) Would it likely cause the introduction of other charities (such as ones focused on mental health) into your top charity list?

Comment by derek on AMA: Elie Hassenfeld, co-founder and CEO of GiveWell · 2020-03-18T21:46:20.442Z · score: 7 (4 votes) · EA · GW

How likely is it that GiveWell will ultimately (e.g. over a 100-year or 10,000-year period) do more harm than good? If that happens, what is the most likely explanation?

Comment by derek on AMA: Elie Hassenfeld, co-founder and CEO of GiveWell · 2020-03-18T21:37:21.469Z · score: 2 (2 votes) · EA · GW

A recent post on this forum (one of the most upvoted of all time) argued that "randomista" development projects like GiveWell's top charities are probably less cost-effective than projects to promote economic growth. Do you have any thoughts on this?

Comment by derek on Founders Pledge Charity Recommendation: Action for Happiness · 2020-03-07T16:51:45.831Z · score: 24 (10 votes) · EA · GW

I like your general approach to this evaluation, especially:

  • the use of formal Bayesian updating from a prior derived in part from evidence for related programmes
  • transparent manual discounting of the effect size based on particular concerns about the direct study
  • acknowledgement of most of the important limitations of your analysis and of the RCT on which it was based
  • careful consideration of factors beyond the cost-effectiveness estimate.

I'd like to see more of this kind of medium-depth evaluation in EA.

I don't have time at the moment for a close look at the CEA, but aside from limitations acknowledged in your text, 3 aspects stand out as potential concerns:

1. The "conservative" and "optimistic" results are quite extreme. This seems to be in part because "conservative" and "optimistic" values for several parameters are multiplied together (e.g. DALYs gained, yearly retention rate of benefits, % completing the course, discount rates...). As you'll know, it is highly improbable that even, say, three independent parameters would simultaneously obtain at, say, the 10th percentile: 0.1*0.1*0.1 = 0.001. Did you consider making a probabilistic model in Guesstimate, Causal, Excel (with macros for Monte Carlo simulation), R, etc in order to generate confidence intervals around the final results? (I appreciate there are major advantages to using Sheets, but it should be fairly straightforward to reproduce at least the "Main CEA" and "Subjective CEA inputs" tabs in, for example, Guesstimate. This would also enable a rudimentary sensitivity analysis.)

2. The inputs for "Yearly retention rate of benefits" (row 10) seem pretty high (0.30, 0.50, and 0.73 for conservative, best guess, and optimistic, respectively) and the results seem fairly sensitive to this parameter. IIRC the study this was based on only had an 8-week follow-up, which would be about half your "conservative" figure (8/52 = 0.15). Even their "extended" follow-up (without a control group) was only for another 2 months. It is certainly plausible that the benefits endure for several months, but I would say that estimates of about 0.1, 0.3, and 0.7 are more reasonable. With those inputs, the cost per DALY increases to about $47,000, $4,500, or $196. That central figure is roughly on a par with CBT for depression in high-income countries, i.e. pretty good but not comparable with developing-country interventions. (And I wouldn't take the "optimistic" figure seriously for the reasons given in (1) above.)

3. I haven't seen the "growth model" on which the cost estimates are based, but my guess is that it doesn't account for the opportunity cost of facilitators' (or participants') time. IIRC each course is led by two "skilled" volunteers who may otherwise do another pro-social activity.
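
To illustrate point (1), here is a minimal Python sketch of the kind of probabilistic check I have in mind. It is not part of the original evaluation: the function names are hypothetical, and the three triangular input ranges are made-up placeholders rather than figures from the CEA. The first part confirms that three independent parameters all landing at or below their 10th percentile has probability 0.1^3 = 0.001. The second part shows that a simulated 90% interval for a product of parameters is much narrower than the range you get by multiplying all the "conservative" or all the "optimistic" values together.

```python
import random


def joint_tail_probability(p: float, n_params: int) -> float:
    """Chance that n independent parameters all fall at or below their p-quantile."""
    return p ** n_params


def simulate_joint_tail(p: float, n_params: int, n_draws: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability, using uniform percentile ranks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # Each rng.random() is one parameter's percentile rank in [0, 1).
        if all(rng.random() < p for _ in range(n_params)):
            hits += 1
    return hits / n_draws


def simulate_product(n_draws: int = 100_000, seed: int = 0):
    """Return the 5th, 50th and 95th percentiles of a product of three inputs.

    The (low, high, mode) ranges are placeholder assumptions for illustration only.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        a = rng.triangular(0.5, 2.0, 1.0)  # hypothetical effect-size multiplier
        b = rng.triangular(0.1, 0.7, 0.3)  # hypothetical yearly retention rate
        c = rng.triangular(0.4, 0.9, 0.7)  # hypothetical completion rate
        draws.append(a * b * c)
    draws.sort()
    return draws[int(0.05 * n_draws)], draws[n_draws // 2], draws[int(0.95 * n_draws)]


if __name__ == "__main__":
    print("analytic joint tail:", joint_tail_probability(0.1, 3))
    print("simulated joint tail:", simulate_joint_tail(0.1, 3))
    low, median, high = simulate_product()
    print("all-conservative product:", 0.5 * 0.1 * 0.4)
    print("simulated 5th / 50th / 95th percentiles:", low, median, high)
    print("all-optimistic product:", 2.0 * 0.7 * 0.9)
```

Guesstimate or a spreadsheet Monte Carlo add-in would do the same job. The point is only that the interval should come from sampling the inputs jointly rather than from stacking the extremes.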

Comment by derek on Founders Pledge Charity Recommendation: Action for Happiness · 2020-03-05T22:36:21.531Z · score: 7 (5 votes) · EA · GW
There is also evidence that health problems have a much smaller effect on subjective well-being than one might imagine.

This is only the case for (some) physical health problems, especially those associated with reduced mobility. People tend to underestimate the SWB impact of (at least some) mental health problems. See e.g. Gilbert & Wilson, 2000; De Wit et al., 2000; Dolan & Kahneman, 2007; Dolan 2008; Pyne et al., 2009; Karimi et al., 2017

Comment by derek on Poverty in Depression-era England: Excerpts from Orwell's "Wigan Pier" · 2020-02-12T01:26:34.067Z · score: 1 (1 votes) · EA · GW

You might want to mention the publication date (1937)

Comment by derek on A Local Community Course That Raises Mental Wellbeing and Pro-Sociality · 2020-01-31T23:05:48.949Z · score: 2 (2 votes) · EA · GW

Thanks - I missed that on my skim. But the "extended" follow-up is only for another two months. It does seem to indicate that effects persist for at least that period, without any trend towards baseline, which is promising (though without a control group the counterfactual is impossible to establish with confidence). I wonder why they didn't continue to collect data beyond this period.

Comment by derek on A Local Community Course That Raises Mental Wellbeing and Pro-Sociality · 2020-01-31T22:45:52.944Z · score: 6 (5 votes) · EA · GW

Thanks - "trained facilitator" might be a bit misleading. Still, it looks like there were two volunteer course leaders for each course, selected in part for their unspecified "skills", who were given "on-going guidance and support" to facilitate the sessions, and who have to arrange a venue etc themselves, then go through a follow-up process when it's over. So it's not a trivial amount of overhead for an average of 13 participants.

Comment by derek on A Local Community Course That Raises Mental Wellbeing and Pro-Sociality · 2020-01-31T14:49:13.690Z · score: 32 (11 votes) · EA · GW

I don't have much time to spend on this, but here are a few thoughts based on a quick skim of the paper.

The study was done by some of the world's leading experts in wellbeing and the study design seems okay-ish ('waitlist randomisation'). The main concern with internal validity, which the authors acknowledge, is that changes in the biomarkers, while mostly heading in the right direction, were far from statistically significant. This could indicate that the effects reported on other measures were due to some factor other than actual SWB improvement, e.g. social desirability bias. But biomarkers are not a great metric, and measures were taken to address these concerns, so I find it plausible that the effects in the study population were (nearly) as large as reported.

However:
- The participants were self-selected, largely from people who were already involved with Action for Happiness ("The charity aims to help people take action to create more happiness, with a focus on pro-social behaviour to bring happiness to others around them"), and largely in the UK. They also had to register online. It's unclear how useful it would be for other populations.
- It's quite an intensive program, involving weekly 2–2.5 hour group meetings with two volunteer facilitators. ("Each of these sessions builds on a thematic question, for example, what matters in life, how to find meaning at work, or how to build happier communities.") This may limit its scalability and accessibility to certain groups.
- Follow-up was only for 2 months, the duration of the course itself. (This limitation seems to be due to the study design: the control group was people taking the course 8 weeks later.)
- The effect sizes for depression and anxiety were smaller than for CBT, so it may still not be the best option for mental health treatment (though the CBT studies were done in populations with a diagnosed mental disorder, so direct comparison is hard; and subgroup analyses showed that people with lower baseline wellbeing benefited most from the program).
- For clarity, the average effect size for life satisfaction was about 1 point on a 10-point scale. This is good compared to most wellbeing interventions, but that might say more about how ineffective most other interventions are than about how good this one is.

So at the risk of sounding too negative: it's hardly surprising that people who are motivated enough to sign up for and attend a course designed to make them happier do in fact feel a bit happier while taking the course. It seems important to find out how long these effects endure, and whether the course is suitable for a broader range of people.

Comment by derek on The EA Hotel is now the Centre for Enabling EA Learning & Research (CEEALAR) · 2020-01-29T18:05:45.588Z · score: 7 (6 votes) · EA · GW

But I really think the whole name should be reconsidered.

Comment by derek on The EA Hotel is now the Centre for Enabling EA Learning & Research (CEEALAR) · 2020-01-29T18:05:21.971Z · score: 6 (7 votes) · EA · GW

You could keep the name but drop the first 'A': CEELAR. Excluding the 'A' of Altruism isn't great, but I think you're allowed to take major liberties with acronyms. And really, almost anything is better than CEEALAR.

Comment by derek on AMA: Rob Mather, founder and CEO of the Against Malaria Foundation · 2020-01-28T11:56:42.157Z · score: 3 (3 votes) · EA · GW

Thanks Rob!

As you've said, in addition to averting deaths it looks like AMF considerably improves lives, e.g. by improving economic outcomes and reducing episodes of illness. Have you considered collecting data on subjective wellbeing in order to help quantify these improvements? Could that be integrated into your program without too much expense/difficulty?

On the other side of the coin, one possible negative impact of programs that increase wealth and/or population size is the suffering of animals farmed for food (since better-off people tend to eat more meat). Do you have any data on dietary changes resulting from bed net distribution (or similar programs)? Would it be feasible to collect that data in future?

Comment by derek on AMA: Rob Mather, founder and CEO of the Against Malaria Foundation · 2020-01-24T09:51:20.571Z · score: 22 (10 votes) · EA · GW

A recent post on this forum (the fourth most popular ever, at the time of writing) argued that "randomista" development projects like AMF are probably less cost-effective than projects to promote economic growth. Do you have any thoughts on this?


Comment by derek on AMA: Rob Mather, founder and CEO of the Against Malaria Foundation · 2020-01-24T09:48:03.462Z · score: 9 (6 votes) · EA · GW

What are your thoughts on the indirect ("flow-through") effects of AMF? For example:

1. What do you think are the main positive and negative indirect impacts of the program, both long- and short-term? (E.g. increasing productivity and economic growth, increasing/decreasing total population, strengthening health systems, greenhouse gas emissions, consumption of factory-farmed meat...) Do you have any data on these? Are you planning to gather data on any of them?

2. What proportion of the long-term benefit from the program is due to short-term direct effects such as saving lives and averting unpleasant episodes of malaria, relative to indirect benefits?

3. Do you hold a particular view of population ethics (totalism, averagism, person-affecting, etc)?

4. What is your response to critics who claim we are ultimately "clueless" about the long-run magnitude or even sign of interventions like this? (I think the basic argument is that e.g. averting deaths has a wide range of knock-on effects, both good and bad, and that we may not be justified in being confident that ultimately – say, over the next few hundred years – the impact will be net positive. See e.g. here, here, and here for a better explanation)

Comment by derek on Katriel Friedman: The Benefits of Starting Your Own Charity · 2020-01-24T08:56:50.547Z · score: 3 (2 votes) · EA · GW
It's a core part of the research ethics that they teach you when you're being trained to run an RCT — whether you can run them if you have equipoise (i.e., are certain that an intervention works).

You might want to clarify this. Equipoise is uncertainty about whether the intervention works, and is often considered a pre-requisite for an RCT. I'm sure Katriel understands this but the phrasing here is misleading.

Comment by derek on AMA: Rob Mather, founder and CEO of the Against Malaria Foundation · 2020-01-22T21:58:55.798Z · score: 18 (17 votes) · EA · GW

Can you explain your '20 minute rule'?

Comment by derek on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2020-01-20T15:26:15.978Z · score: 2 (2 votes) · EA · GW

Do you have any thoughts on whether valenced experience is asymmetrical, i.e. whether the most negative experiences (e.g. 10/10 on some suitable pain scale) are more bad than the most positive ones (e.g. 10/10 on some suitable pleasure scale) are good?

My hunch is that the worst experiences are more intense, at least if you exclude weird/rare things like Jhanas and 5-MeO-DMT trips, e.g. I'd give up days or weeks of 'maximum happiness' to avoid being burned alive for a minute. But not everyone shares this intuition, and I'm not sure how to settle the debate (at least until you prove and operationalise your symmetry theory of valence).


Comment by derek on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2020-01-20T15:12:14.612Z · score: 1 (1 votes) · EA · GW

Thanks for this - very interesting.

Do you think your claims would apply to broader measures of subjective wellbeing, e.g. questions like "Overall, how satisfied are you with your life?" and "Overall, how happy were you yesterday?" (often on a 0-10 scale)? Or even to more specific measures of valenced experience, like depression (e.g. PHQ-9)?

Because I've been wondering whether:

(a) the Weber-Fechner law is limited to perception of clear physical stimuli (weight, pain, spiciness, etc), as distinct from 'internal' states and cognitive evaluations (though the internal/external distinction may not make sense here).

(b) a log scale is less useful/accurate when considering long periods of time (a day, a year, a lifetime), over which the variance in average wellbeing in a population will be lower than the variance in the intensity of specific events.




Comment by derek on Physical Exercise for EAs – Why and How · 2020-01-12T19:39:25.313Z · score: 13 (8 votes) · EA · GW

This is very good, but I think busy (or unmotivated) EAs without much exercise experience would benefit from even more specific recommendations, especially for resistance exercises (i.e. strength training).

I found the Start Bodyweight program useful when beginning resistance training at home with no equipment other than a pull-up bar. An EA recommended the book Overcoming Gravity for more detailed information on bodyweight exercises.

I now prefer to use the gym. At a glance, the following (which I just found with a quick Google search) seem like sensible gym-based* options for beginners, but maybe you have better ideas.

https://www.muscleandfitness.com/workouts/workout-routines/complete-mf-beginners-training-guide-plan

https://stronglifts.com/5x5/ [I'd add some core exercises to this, like situps and planks]

https://www.shape.com/fitness/workouts/strength-training-beginners

When I'm too busy to do the full range of strength and cardio (or when I'm travelling), I sometimes do moderate/high-intensity interval classes at home using YouTube videos. The Body Coach is pretty good - he has videos with a range of difficulty (beginner to advanced), duration (10 min+), and muscle focus (legs, upper body, abs, full-body, etc). There are also videos meeting specific needs, e.g. low-impact routines so you don't disturb your neighbours or hurt your knees, and ones designed for small spaces. This kind of thing is perhaps the most efficient form of exercise: you can do it anywhere, it doesn't require any equipment, it's free, it covers both cardio and strength, and it doesn't take much time.

When travelling, I also take a resistance band. If you choose the weight carefully, a single band (which folds up to the size of a cigarette packet) can arguably substitute for any dumbbell that you'd use in the gym, and some of the machines as well. (The main thing you're lacking is the ability to do deadlifts, but there are ways around that too.)

I've heard some EAs recommend GymPass, especially if you travel a lot and don't like to exercise alone.

Feel free to correct me on any of this – I don't have any relevant expertise.

*They could obviously be done at home if you buy the equipment. The last one just needs dumbbells or resistance bands, which are pretty cheap.

Comment by derek on 2019 AI Alignment Literature Review and Charity Comparison · 2020-01-03T13:44:38.972Z · score: 10 (6 votes) · EA · GW

Why isn't there a GiveWell-style evaluator for longtermist (or specifically AI safety) orgs?

Comment by derek on New research on moral weights · 2020-01-02T20:50:19.827Z · score: 2 (2 votes) · EA · GW

Section 4 on subjective wellbeing is interesting.

• Across poor respondents in Kenya and Ghana, the average life satisfaction ladder score is 2.8 (where 0 is the lowest and 10 is the highest score).
• Respondents with higher consumption have higher life satisfaction ladder scores; doubling consumption is associated with being 0.4 steps higher on the ladder.
• When describing different points on the ladder respondents most often referred to levels of money and material goods. In contrast, health states were mentioned much less often with regards to life satisfaction. Having a health condition was associated with being 0.3 steps lower on the ladder.
• Overall, taken alone, these findings suggest that consumption is of greater relative importance to wellbeing of respondents than their preferences (described in Section 1-3) indicate.

I notice they only measured life satisfaction. Can you tell me why they didn't also include at least one measure of hedonic wellbeing, such as those used in the evaluations of GiveDirectly? It is really important to understand whether potential GiveWell top charity beneficiaries are actually unhappy (i.e. generally feel bad) or just dissatisfied with their material circumstances when someone with a clipboard asks them about it. (Life satisfaction is much more sensitive to relative wealth and status than is pleasure/misery.) For instance, this may be the critical factor when choosing between life-extending and life-improving interventions.

Comment by derek on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2020-01-01T18:23:25.917Z · score: 2 (2 votes) · EA · GW

Did you ever get round to running the analysis with your best guess inputs?

If that revealed substantial decision uncertainty (and especially if you were very uncertain about your inputs), I'd also like to see it run with GiveWell's inputs. They could be aggregated distributions from multiple staff members, elicited using standard methods, or in some cases perhaps 'official' GiveWell consensus distributions. I'm kind of surprised this doesn't seem to have been done already, given obvious issues with using point estimates in non-linear models. Or do you have reason to believe the ranking and cost-effectiveness ratios would not be sensitive to methodological changes like this?
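
For anyone unsure what the issue with point estimates in non-linear models is, here is a minimal Python sketch. It is illustrative only: the cost figure and the uniform range for the effect are made-up assumptions, not GiveWell inputs, and the function name is hypothetical. A cost-effectiveness ratio divides by an uncertain quantity, so the ratio evaluated at the mean effect is not the mean of the ratio. For an effect drawn from Uniform(0.5, 1.5), the mean of 1/effect is ln 3 (about 1.10) rather than 1, so the point estimate understates the expected cost per unit of effect by roughly 10%.

```python
import random
import statistics


def ratio_at_mean_vs_mean_of_ratios(cost: float = 100.0, n_draws: int = 100_000, seed: int = 0):
    """Compare cost / E[effect] with E[cost / effect] under an uncertain effect.

    The Uniform(0.5, 1.5) effect distribution is a placeholder assumption.
    """
    rng = random.Random(seed)
    effects = [rng.uniform(0.5, 1.5) for _ in range(n_draws)]
    # Point-estimate approach: plug the mean effect into the ratio.
    ratio_at_mean = cost / statistics.fmean(effects)
    # Probabilistic approach: compute the ratio per draw, then average.
    mean_of_ratios = statistics.fmean([cost / e for e in effects])
    return ratio_at_mean, mean_of_ratios


if __name__ == "__main__":
    point, expected = ratio_at_mean_vs_mean_of_ratios()
    print("cost per unit at the mean effect:", point)
    print("mean cost per unit across draws:", expected)
```

The gap grows with the width of the input distribution, and the same effect can compound when several uncertain inputs are multiplied or divided.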

Comment by derek on Is mindfulness good for you? · 2019-12-30T21:02:22.844Z · score: 7 (5 votes) · EA · GW

This is very useful – thanks for writing it up.

This heterogeneity across intervention types means that we should be cautious about broad claims about the efficacy of mindfulness for depression and anxiety.

True, but that applies equally to claims of null or small effect sizes, e.g. some forms of mindfulness could be very effective even if 'on average' it's not. Did any of the meta-analyses contain useful subgroup analyses?

(For what it's worth, a few years ago I used the Headspace app ~5x/week for 3 months and found it to be actively detrimental to my mood. Anecdotally, this seems fairly common: https://www.theguardian.com/lifeandstyle/2016/jan/23/is-mindfulness-making-us-ill)

Comment by derek on Learning to ask action-relevant questions · 2019-12-28T20:09:21.962Z · score: 1 (1 votes) · EA · GW

I guess "action-relevant" has a better noun form, which could be a non-trivial advantage.

Comment by derek on Learning to ask action-relevant questions · 2019-12-28T20:06:33.065Z · score: 1 (1 votes) · EA · GW
Ask yourself: “If I imagine a world in which I have answered this question, what would look different?”

This sounds like the "importance" part of the ITN framework. From EA Concepts:

If all problems in the area could be solved, how much better would the world be?
Comment by derek on Learning to ask action-relevant questions · 2019-12-28T20:02:00.678Z · score: 1 (1 votes) · EA · GW
I'm not sure if "action-relevant" is accepted terminology

About a year ago I heard the term "action guiding", which I guess is the same?

Comment by derek on We're Rethink Priorities. AMA. · 2019-12-17T23:27:38.555Z · score: 9 (8 votes) · EA · GW

I’ve asked several academics with domain expertise to review draft posts, or sections of posts, or advise on specific issues. Some have been very useful, but they understandably do not have time to engage fully (if at all). As a consequence, I often worry that I’m making dumb mistakes, or just reinventing the wheel, and there are often substantial delays while waiting for expert input. I think the lack of access to academic networks and infrastructure is perhaps the biggest weakness of RP as a research organisation, and it is related to the youth and inexperience of EA as a whole.

I'm not sure it can be fully solved – some fields only have half a dozen people in the world working on them, so it may be impossible to find someone with enough free time to help out. But I suspect a lot of progress could be made, e.g. I bet there are a lot of statisticians and economists who would be willing and able to help if only they knew we needed it. At the mid- and late career professionals’ meetup at EAG San Francisco last June, it was suggested that retired academics, professional groups, and LinkedIn might be good sources of mentors/advisors. Someone mentioned https://taprootfoundation.org/ as well – perhaps not for academic advice, but for support in other areas where EA orgs tend to be lacking, such as management. I'd be interested to see an effort to systematically connect experts with EA projects, perhaps through the EA Hub or 80,000 Hours.


Comment by derek on We're Rethink Priorities. AMA. · 2019-12-15T12:37:34.183Z · score: 24 (12 votes) · EA · GW

The following is a tidy, oversimplified version of what happened.

I learned about Bentham and Mill in A-level history class (aged 17) and I think I read a Peter Singer book. I was very left-wing at the time but I remember being really frustrated that all the other altruistically-minded kids in my class supported standard leftist policies for ideological reasons even when they harmed disadvantaged people. This influenced me to study philosophy at undergrad level, where I defended utilitarianism.

Unfortunately EA hadn’t been invented at the time so I spent the first year after graduation working in warehouses and call centers, followed by about nine years of direct development work in low-income countries. I got frustrated by the inefficiency of most development orgs and decided to switch fields into either law (‘earning to give’ before I'd heard of the concept) or public health (to do direct work with more quantifiable impacts).

Around the same time I was searching online for information about charity evaluation and came across GiveWell, then the Singer TED Talk and the wider EA community. This may have influenced me to choose public health, though there were other factors (e.g. the 2008 financial crash made it even harder than usual to pursue a lucrative law career). I spent 18 months in Australia doing whatever work I could find – mostly farm labouring – to pay for my master’s course.

During the course I became more involved in EA, and got interested in health economics, especially methods for cost-effectiveness analysis. But I couldn’t get a job or PhD in health economics with a general public health background, so to save up for a second master's I spent two more years doing mostly sub-minimum wage temp jobs, or saving dole money when I couldn’t find work (though I also got a bit of contract work with GiveWell towards the end of this period). Halfway through that course I ran out of money and had some health issues, so I took a leave of absence, during which time I worked on the 2019 Global Happiness Policy Report (Chapter 3), then got the Rethink job.

My reasons for continuing to work in EA are some mixture of those given by my colleagues.


Comment by derek on We're Rethink Priorities. AMA. · 2019-12-15T11:41:49.903Z · score: 9 (4 votes) · EA · GW

Most likely academic research related to the use of subjective wellbeing in prioritisation systems (healthcare, central government, maybe EA orgs, etc). Might have applied for researcher positions in other EA orgs.

Comment by derek on We're Rethink Priorities. AMA. · 2019-12-13T23:15:46.443Z · score: 8 (8 votes) · EA · GW

I’ve become a bit more longtermist in outlook and more uncertain of the sign/effect size of most interventions/projects, mostly due to issues around indirect effects/cluelessness.

Comment by derek on We're Rethink Priorities. AMA. · 2019-12-13T23:08:56.992Z · score: 16 (9 votes) · EA · GW

I’m not a philosopher, but to the extent I have opinions on such things they are about the same as Moss’s, i.e. classical hedonistic utilitarianism with quite a lot of moral uncertainty. I have somewhat suffering-focused intuitions but (a) I’ve never seen a remotely convincing argument for a suffering-focused ethic, and (b) I think my intuitions – and, I suspect, those of many people who identify as suffering-focused – can be explained by other factors. In particular, I think there are problems with the scales people use to measure valence/wellbeing/value of lives, both in reality and in thought experiments, e.g. it seems common for philosophers to assume a symmetrical scale like -10 to +10, whereas it seems pretty obvious to me that the worst lives – or even, say, the 5th percentile of lives – are many times more bad than the best lives are good. So if the best few percent of lives are 10/10 and 0 is equivalent to being dead, the bottom few percent of any large population are probably somewhere between -100 and -100,000. (It is not widely appreciated just how awful things are for so many people.) If true, classical utilitarianism may have policy implications similar to prioritarianism and related theories, e.g. more resources for the worst off (assuming tractability). But I haven’t seen much literature on these scale issues so I’m not confident this is correct. If you know of any relevant research, preferably peer-reviewed, I’d be very interested.

Comment by derek on How do cash transfers impact the people who don’t receive them? · 2019-12-04T17:18:52.324Z · score: 20 (6 votes) · EA · GW

[EDIT: I no longer endorse all of this comment. After looking more closely at the papers, I'm more confident that the spillover effects of the latest version of the program are neutral to positive (at least on humans – growth in meat consumption is an important caveat).]

Thanks for posting this.

Though not reported here, I was pleased to see that non-market effects were also recorded in the study, and that these were neutral or positive for both recipients (‘treated households’) and non-recipients.

For treated households, we find positive and significant effects for four of the six indices: psychological well-being, food security, education and security [i.e. crime rates]. Estimated effects are close to zero and not significant for the health index and female empowerment index. When looking at total effects including spillovers for the treated, we find a similar pattern for all but the security index. For untreated households, we find no significant effects of local cash transfers except for the education index, which is higher by 0.1 SD (p < 0.10). Importantly, we do not find evidence of adverse spillover effects for untreated households on any of the indices, with point estimates positive for all but the security index, which is indistinguishable from zero (-0.02 SD, SE 0.07).

I’m particularly interested in the “psychological wellbeing” index, which Appendix C1 says comprises a “weighted, standardized average of depression (10 question CES-D scale), happiness, life satisfaction, and perceived stress (PSS-4)”. I would like to know: (a) what measures were used for “happiness” and “life satisfaction”; (b) how the components of the index were weighted; and most of all (c) a breakdown of scores for each measure. I can’t find this information in the paper.

I’m asking because there is a fair amount of research suggesting that one person's income increase causes wellbeing declines among other members of the community (i.e. people feel worse when their neighbour gets richer), at least for some accounts of wellbeing. For instance, Haushofer, Reisinger, & Shapiro (2019) found that neighbours of GiveDirectly cash recipients experienced a decline in psychological wellbeing (seemingly measured by a similar index to the one used in the most recent study) about half as great as the psychological wellbeing benefit to the recipient. Depending on how many neighbours are affected by each transfer, this would seem to indicate that GiveDirectly may have a net negative effect on aggregate wellbeing. However, this effect was driven entirely by life satisfaction, an ‘evaluative’ or ‘cognitive’ measure; there were no negative spillovers on measures of ‘hedonic’ wellbeing, namely “happiness”, “stress”, and “depression”. As the authors note:

This result is intuitive: the wealth of one’s neighbors may plausibly affect one’s overall assessment of life, but have little effect on how many positive emotional experiences one encounters in everyday life. This result complements existing distinctions between these different facets of well-being, e.g. the finding that hedonic well-being has a “satiation point” in income, whereas evaluative well-being may not (Kahneman and Deaton, 2010).

Without seeing the disaggregated scores from the new study, it seems possible that there were non-trivial and statistically significant harms (or benefits) according to some components of the index. This matters to those with a preferred moral theory or conception of wellbeing, e.g. a classical utilitarian probably cares more about hedonic states than life evaluations, and a prioritarian more about severe states like depression than positive ones like happiness.

Comment by derek on Does climate change deserve more attention within EA? · 2019-11-17T19:08:02.264Z · score: 1 (1 votes) · EA · GW

It's an interesting idea. Often the resource fungibility won't be huge so it may not make much difference, but in some cases it might.

It also seems to assume that it will use fewer total resources than working on both problems less intensively for a longer period. I would guess that it would usually be more efficient to divide resources and work on problems simultaneously, in part due to diminishing returns to investment. E.g. shifting all AI researchers to climate change would greatly hinder AI research but perhaps not contribute much to climate change mitigation, even assuming good personal fit of researchers, since there are already lots of talented people working on the issue.

But I've thought about this for less than 5 minutes so it might deserve a deeper dive. I'm not likely to do it, though.

Comment by derek on Concrete next steps for ageing-based welfare measures · 2019-11-17T18:01:57.742Z · score: 2 (2 votes) · EA · GW

This seems like a promising approach to comparing hedonic wellbeing across species of nonhuman animals. Comparing human with nonhuman wellbeing is also important for prioritising interventions, and there doesn't seem to be a good way of doing this at the moment. Do you think this could be part of a solution?

Comment by derek on Does climate change deserve more attention within EA? · 2019-10-13T11:39:01.629Z · score: 2 (2 votes) · EA · GW

"[AI safety] is currently taking up a large amount of attention from competent altruistic people. If the issue were to be solved or its urgency reduced, some of those resources might flow into [climate change mitigation]"

So hurry up, Toon ;)

Comment by derek on List of ways in which cost-effectiveness estimates can be misleading · 2019-08-26T16:57:35.799Z · score: 1 (1 votes) · EA · GW

I wish I'd spent more time reviewing this before publication as I failed to mention some key points. I'll add some of them as comments.

Comment by derek on List of ways in which cost-effectiveness estimates can be misleading · 2019-08-26T16:53:50.770Z · score: 19 (9 votes) · EA · GW

Most cost-effectiveness analyses by EA orgs (and other charities) use a ratio of costs to effects, or effects to costs, as the main - or only - outcome metric, e.g. dollars per life saved, or lives affected per dollar. This is a good start, but it can be misleading as it is not usually the most decision-relevant factor.

If the purpose is to inform a decision of whether to carry out a project, it is generally better to present:

(a) The probability that the intervention is cost-effective at a range of thresholds (e.g. there is a 30% chance that it will avert a death for less than my willingness-to-pay of $2,000, 50% at $4,000, 70% at $10,000...). In health economics, this is shown using a cost-effectiveness acceptability curve (CEAC).

(b) The probability that the option with the highest expected net benefit (a term that is roughly equivalent to 'net present value') is in fact the most cost-effective, which can be shown with a cost-effectiveness acceptability frontier (CEAF). It's a bit hard to get one's head around, but sometimes the intervention most likely to be cost-effective has lower expected net benefit than an alternative, because the distribution of net benefits is skewed.

(c) A value of information analysis to assess how much value would be generated by a study to reduce uncertainty. As we found in our evaluation of Donational, sometimes interventions that have a poor cost-effectiveness ratio and a low probability of being cost-effective nevertheless warrant further research; and the same can be true of interventions that look very strong on those metrics.

See Briggs et al. (2012) for a general overview of uncertainty analysis in health economics, Barton et al. (2008) for CEACs, CEAFs and expected value of perfect information, and Wilson (2014) for a practical guide to VOI analyses (including the value of imperfect information gathered from studies).
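For anyone who wants to see roughly what (a) and (c) involve, here is a minimal sketch in Python. It assumes a single intervention compared against doing nothing, and the input distributions (a lognormal for cost, a gamma for deaths averted) are made up purely for illustration; the function names `ceac` and `evpi` are my own and do not come from any of the papers cited above.

```python
import random

def ceac(cost_samples, effect_samples, thresholds):
    """Share of simulations in which net benefit is positive, per threshold.

    Net benefit here is (willingness-to-pay * effect) - cost, compared with
    a 'do nothing' option whose net benefit is zero.
    """
    n = len(cost_samples)
    curve = {}
    for wtp in thresholds:
        wins = sum(
            1 for c, e in zip(cost_samples, effect_samples) if wtp * e - c > 0
        )
        curve[wtp] = wins / n
    return curve

def evpi(cost_samples, effect_samples, wtp):
    """Expected value of perfect information at one threshold (two options).

    EVPI = E[max(NB, 0)] - max(E[NB], 0): the expected gain from choosing
    the better option in every simulation rather than once on average.
    """
    nb = [wtp * e - c for c, e in zip(cost_samples, effect_samples)]
    mean_of_best = sum(max(x, 0.0) for x in nb) / len(nb)
    best_of_mean = max(sum(nb) / len(nb), 0.0)
    return mean_of_best - best_of_mean

random.seed(0)
N = 10_000
# Hypothetical inputs, not estimates for any real programme.
costs = [random.lognormvariate(8.0, 0.3) for _ in range(N)]
deaths_averted = [random.gammavariate(2.0, 0.5) for _ in range(N)]

thresholds = [2_000, 4_000, 10_000]
curve = ceac(costs, deaths_averted, thresholds)
for wtp in thresholds:
    print(f"P(cost-effective at ${wtp:,} per death averted) = {curve[wtp]:.2f}")
print(f"EVPI at $4,000 = {evpi(costs, deaths_averted, 4_000):,.0f}")
```

Plotting `curve` against the thresholds gives the CEAC. A real analysis would normally compare several options and draw the inputs from distributions fitted to evidence, so treat this as a toy version only.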

Of course, these require probabilistic analyses that tend to be more time-consuming and perhaps less transparent than deterministic ones, so simpler models that give a basic cost-effectiveness ratio may sometimes be warranted. But it should always be borne in mind that they will often mislead users as to the best course of action.

Comment by derek on List of ways in which cost-effectiveness estimates can be misleading · 2019-08-26T15:58:25.230Z · score: 13 (8 votes) · EA · GW

Any deterministic analysis (using point estimates, rather than probability distributions, as inputs and outputs) is unlikely to be accurate because of interactions between parameters. This also applies to deterministic sensitivity analyses: by only changing a limited subset of the parameters at a time (usually just one) they tend to underestimate the uncertainty in the model. See Claxton (2008) for an explanation, especially section 3.
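A toy illustration of one mechanism behind this, under made-up assumptions: when parameters combine non-linearly (here a simple cost/effect ratio), plugging in the mean of each input does not give the mean of the output. The distributions and the `cost_per_daly` function are hypothetical and are not taken from Claxton (2008) or from any real model.

```python
import random
import statistics

random.seed(1)
N = 20_000

# Hypothetical uncertain inputs, drawn independently.
cost = [random.lognormvariate(7.0, 0.5) for _ in range(N)]  # cost per person
effect = [random.uniform(0.2, 1.0) for _ in range(N)]       # DALYs averted per person

def cost_per_daly(c, e):
    """Toy model: a single non-linear combination of two parameters."""
    return c / e

# Deterministic analysis: run the model once on the point estimates (means).
deterministic = cost_per_daly(statistics.fmean(cost), statistics.fmean(effect))

# Probabilistic analysis: run the model on every draw, then average the outputs.
probabilistic = statistics.fmean(cost_per_daly(c, e) for c, e in zip(cost, effect))

print(f"Deterministic cost per DALY: {deterministic:,.0f}")
print(f"Probabilistic mean cost per DALY: {probabilistic:,.0f}")
```

With these inputs the probabilistic mean comes out noticeably higher than the deterministic figure, because dividing by an uncertain quantity gives extra weight to the draws where the effect is small. Models with many interacting parameters can diverge in either direction.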

This is one reason I don't take GiveWell's estimates too seriously (though their choice of outcome measure is probably a more serious problem).

Comment by derek on List of ways in which cost-effectiveness estimates can be misleading · 2019-08-26T15:12:59.050Z · score: 1 (1 votes) · EA · GW

This was my fault, sorry. I was travelling and ill so I was slow giving feedback on the draft. I belatedly sent Saulius some comments without realising it had just been published, so he took it down in order to incorporate some of my suggestions.

Comment by derek on Rethink Grants: an evaluation of Donational’s Corporate Ambassador Program · 2019-08-19T05:26:11.411Z · score: 6 (2 votes) · EA · GW

>TLYCS only endorses 22 charities, all of which work in the developing world on causes that are plausibly cost-effective on the level of some GiveWell interventions (even though evidence is fairly weak on some of them...)

It's plausible that some of these are as cost-effective as the GW top charities, but perhaps not that they are as cost-effective on average, or in expectation.

>This selection only looks narrow if your point of comparison is another EA-aligned evaluator like GiveWell, ACE, or Founder's Pledge.

You mean only looks broad?

Anyway, I would agree TLYCS's selection is narrow relative to some others; just not the EA evaluators that seem like the most natural comparators.