Just chiming in here as HLI was mentioned - although this definitely isn't the most important part of the post. I certainly see us as randomista-inspired - wait, should that be 'randomista-adjacent' - but I would say that what we do feels very different from what other EAs, notably longtermists, do. Also, we came into existence about 5 years after Doing Good Better was published.
I also share Habryka's doubts about how EA's original top interventions were chosen. The whole "scale, neglectedness, tractability" framework strikes me as a confusing, indeterminate methodology that was developed post hoc to justify the earlier choices. I moaned about the SNT framework at length in chapter 5 (p. 171) of my PhD thesis.
[Writing from my hotel room at EAG at 5am because my body is on UK time and I can't sleep. Hopefully my reasoning isn't too wonky]
Hello Ozzie. Thanks very much for writing this. It brings lots of nuance. I agree this conversation is easier to have at an abstract level. I wanted to make a few points.
One early reviewer critiqued this post, saying that they didn't believe that discomfort was a problem.
I've been struck by how often I've seen or heard people say something like this, i.e., that people do feel free to make critiques on important issues. For a community that prides itself on avoiding cognitive biases, this seems a real blind spot. Some people seem to mistakenly infer, from the fact that they don't feel uncomfortable making critiques and that they see other people doing it, that no one feels awkward and that everything important gets said. In fact, I strongly suspect there are unrecognised power dynamics at play. If you're in a position of power, e.g. you control funding, and work with people who mostly agree with you, those people may feel psychologically safe enough to give you pushback. However, others - who may have more important disagreements with you - might not feel comfortable saying anything. This would falsely create the impressions both that people in general feel free to make critiques and that everyone agrees with you, leading to overconfidence.
Second, you ask who is uncomfortable critiquing whom. This raises the further question: why? Again, I suspect it relates to power and interpersonal awkwardness. It's much easier to object to global health and wellbeing interventions, because you can focus on the evidence; it's less personal. But for longtermist work, it's more about people, their ideas, and how well they seem to be running a project. When you add in the small, interconnected funder ecosystem, the incentives to criticise longtermist work are pretty weak: there is little to gain, but potentially much to lose, from objecting, so you'd expect less criticism. I speak to lots of people who don't find longtermism particularly plausible but conclude (I think rationally) that it's not smart for them to say anything.
Third, as a personal note, I've found, and find, critiquing other bits of EA deeply uncomfortable. People might be surprised by this, because (1) I've done quite a bit of it and (2) I may give off the impression of being very confident and enjoying disagreement (I'm a 6ft5 male with a philosophy PhD) but (even?) I consistently find it really difficult and stressful. I do it because I think the issues are too important. But it's often psychologically unpleasant. And it's genuinely very difficult to do without annoying people, even when you really don't want to (I don't think I've been great at this in the past but hope I'm improving). Doing good better relies on us challenging our current approaches, which is why it's so important to recognise how hard it is to make critiques and to think about what could be tweaked to improve this.
This comment implies the only relevant division is over wokery. I'm not sure why you focused only on that, but there are other ways people can practically disagree about what to do...
This is an interesting idea, but I don't know how feasible or realistic it is. I find it really helpful to think of EA as a marketplace for maximum impact goods and services. Like your local farmers' market, but instead of selling fruit and vegetables, people are offering charities, charity recommendations, and so on. (I've been reflecting on this for a while, and might write it up as a standalone post). The analogy doesn't have to be perfect to be informative. In this framing, what we'd want, presumably, is for there to be lots of competition and choice, so that consumers get a better deal. You don't want there to be just one guy that sells fruit, and only sells oranges, for example.
The challenge for EA is that one buyer dominates the market: Open Philanthropy accounts for, I think, 50+% of total funding. In effect, you're going to get about as much variety as OP wants when they do their own 'shopping': if OP wants to buy your fruit, you're in business; if they don't, it's much harder to survive in the market. If they want there to be a greater variety of sellers, they can create demand for them - equivalent to saying "hey, if you start selling apples in the market, we'll buy 10 crates" or whatever. But it's not clear if it's a good use of their money, by their lights, to create a market for products they don't really want and other people may not want. It puts the new vendors in an odd, risky position too if they only exist because they have one, questionably enthusiastic customer. They could see it as a good use of money: they'd be subsidising the creation of products in the hope of creating demand and bringing other people into the market. Hence, it's not obvious the main buyer in a marketplace would want to create variety vs just seek out the specific things they want.*
*Mutatis mutandis, there are the same issues if major funders engage in regranting, which is like giving your friends some of your money and asking them to buy stuff in the market. If your friends have the same preferences as you, they'll buy the same stuff, so there's no point. If they buy stuff you'd hate, then you'll think you shouldn't have given them money. Either way, unless you keep giving them money, it's only an artificial spike in consumption.
Ah. I should probably have flagged I'm pointing to an ideal world. As goes (almost) without saying, all this is easier said than done. You could say that EA's challenge is that it has only one mega-donor. It seems much better that it has one than none, but having just one creates distinctive governance challenges compared to if there was no mega-donor but lots of small donors; this latter scenario is a bit closer to a democracy in terms of how power is distributed.
But your comment does rather make my point. In a world where there is only one major donor - that is, effectively a monopsony (a market with a single buyer) - you could say "look, all this talk about good governance is nice, but it's practically irrelevant; the only question to ask is 'what does the big funder want?'".
I don't think donors can or should be forced to defer to the democratic will. People, qua citizens in their democracy, can be forced to do things, but charity is a necessarily private act. I think it would be good if EVF had some democratic elements and also that someone paid for it to keep going. But, who am I to tell people what they ought to do, a moral philosopher, or something? :)
What is the main issue in EA governance then, in your view? It strikes me [I'm speaking in a personal capacity, etc.] that the challenge for EA is the combination of resources being quite centralised and the trustees of charities being (as you say) not accountable to anyone. Either one by itself might be fine. Both together is tricky. I'm not sure where this fits in with your framework, sorry.
There's one big funder (Open Philanthropy), many of the key organisations are really just one organisation wearing different hats (EVF), and these are accountable only to their trustees. What's more, as Buck notes here, all the dramatis personae are quite friendly ("lot of EA organizations are led and influenced by a pretty tightly knit group of people who consider themselves allies"). Obviously, some people will be in favour of centralised, unaccountable decision-making - those who think it gets the right results - but it's not the structure we expect to be conducive to good governance in general.
If power in effective altruism were decentralised - that is, if there were lots of 'buyers' and 'sellers' in the 'EA marketplace' - then you'd expect competitive pressure to improve governance: poorly run organisations would be wracked by the "gales of creative destruction" as donors go elsewhere.
If leaders in effective altruism were accountable - for instance, if EVF became a membership organisation and the board were elected by its (paying?) members - that would provide a different sort of check and balance. I don't think it's reasonable for individual donors, e.g. Dustin Moskovitz and Cari Tuna, or cause-specific organisations, to submit their money to the democratic will, but it seems more sensible for central organisations - those that are something like natural monopolies and ostensibly serve the whole community - to have democratic elements.
As it is, the governance structure across EA is, essentially, for its leaders to police themselves - and wait for media stories to break. Particularly in light of recent events, it's unclear if this is the optimal approach. I am reminded of the following passage in Pratchett.
“Quis custodiet ipsos custodes? Your Grace.”
“I know that one,” said Vimes. “Who watches the watchmen? Me, Mr. Pessimal.”
“Ah, but who watches you, Your Grace?” said the inspector with a brief little smile.
“I do that, too. All the time,” said Vimes. “Believe me.”
-Terry Pratchett, Thud!
Ah, but then in what ways is this different from EA funds? (No need to reply to this now, happy to see something in the next update; just raising that it's still a concern.)
Kudos on making a concrete proposal - there's been lots of discussion of problems, not so much of potential solutions.
I'm one of those (many) people who is not at all familiar with DAOs, so I'm not sure I've got my head around this. One model is a donor lottery, where many people put money in, and one randomly-chosen person decides. Another is the EA funds, where lots of people put in money, and a few experts decide. This is ... like a donor democracy, where many people pool their money, and then there's a collective decision-making process? In which case, I'm curious for details on how the decision-making process works.
I'm reminded of this recent popular post The EA community does not own its donors' money. I imagine that, if people don't like where the DAO ends up donating, they'll simply stop giving their money to it. So the DAO would end up functioning much like an EA fund, where people give to it in the expectation they know where their money will go and are happy with that outcome.
Which raises the question: would a better solution be to set up more and/or differently run funds?
Hmm. I guess I was thinking about this in general, rather than my own case. That said, I don't think there's any contradiction between there being visible financial prizes for criticism and people still rationally thinking that (some forms of) criticism will get you in trouble. Costly signals may reduce fears, but that doesn't mean they will remove or reverse them. It seems worth noting that there has just been a big EA critiques prize and people are presently using burner accounts.
Hello Ivy! I think you've missed at least one scenario, which is where you use your real name, your criticism is not well received, you have identified yourself as a troublemaker, and those in positions of power torpedo you. Surely this is a possibility? Unless people think it's a serious possibility, it's hard to make sense of why people write things anonymously, or just stay silent.
Buck seems to be consistently missing the point.
Although leaders may say "I won't judge or punish you if you disagree with me", listeners are probably correct to interpret that as cheap talk. We have abundant evidence from society and history that those in positions of power can and do act against those who challenge them. A few remarks to the contrary should not convince people they are not at risk.
Someone who genuinely wanted to be open to criticism would recognise and address the fears people have about speaking up. Buck's comment that "the fact that people want to hide their identities is not strong evidence they need to" struck me as highly dismissive. If people do fear something, saying "well, you shouldn't be scared" doesn't generally make them less scared, but it does convey that you don't care about them - you won't expend effort to address their fears.
One other framing of the same thing, which might be more intuitive, is between the idea of effective altruism (use reason to do good better) vs the EA movement (the current group of people doing it and their priorities).
You could be in favour of the former, but not the latter - "I believe in the IDEA of effective altruism, but I think the EA movement is barking up the wrong tree" etc.
Another phrasing would be "Big EA" vs "small ea", like how we in the UK differentiate between people who are "Big C" Conservative, i.e. support the Conservative Party, and "small c" conservative, which means you have a conservative disposition but don't necessarily support the Conservative Party.
Just throwing these out in case they are more useful!
There may not have been extended discussions, but there was at least one more recent warning. “E.A. leadership” is a nebulous term, but there is a small annual invitation-only gathering of senior figures, and they have conducted detailed conversations about potential public-relations liabilities in a private Slack group.
I don't know about others, but I find it deeply uncomfortable that there's an invite-only conference and a private Slack channel where, amongst other things, reputational issues are discussed. For one, there's something weird about, on the one hand, saying "we should act with honesty and integrity" and, on the other, "oh, we have secret meetings where we discuss whether other people are going to make us look bad".
This strikes me as weirdly one-sided. You're against leaking, but presumably you're in favour of whistleblowing - people being able to raise concerns about wrongdoing. Would you have objected to someone leaking/whistleblowing that e.g. SBF was misusing customer money? If someone had done so months ago, that could have saved billions, but it would have been a breach of (SBF's) trust.
The difference between leaking and whistleblowing is ... I'm actually not sure. One is official, or something?
Thanks for saying that. Yeah, I couldn't really understand where you were coming from (and honestly ended up spending 2+ hours drafting a reply).
On reflection, we should probably have done more WELLBY-related referencing in the post, but we were trying to keep the academic side light. In fact, we probably need to combine our various scratchings on the WELLBY and put them onto a single page on our website - it's been a lower priority than the object-level charity analysis work.
If you're doing the independent impression thing again, then, as a recipient, it would have been really helpful to know that. Then I would have read it more as a friendly "I'm new to this and sceptical about X and Y - what's going on with those?" and less as "I'm sceptical; you clearly have no idea what you're talking about" (which was more-or-less how I initially interpreted it... :) )
Both comments by this author seemed in bad faith and I'm not going to engage with them.
Hello William, thanks for this. I’ve been scratching my head about how best to respond to the concerns you raise.
First, your TL;DR is that this post doesn’t address your concerns about the WELLBY. That’s understandable, not least because that was never the purpose of this post. Here, we aimed to set out our charity recommendations and give a non-technical overview of our work, not get into methodological and technical issues. If you want to know more about the WELLBY approach, I would send you to this recent post instead, where we talk about the method overall, including concerns about neutrality, linearity, and comparability.
Second, on scientific validity: a measure is valid if it successfully captures what you set out to measure. See e.g. Alexandrova and Haybron (2022) on the concept of validity and its application to wellbeing measures. I'm not going to give you chapter and verse on this.
Regarding linearity and comparability, you’re right that people *could* be using the scales in different ways. But are they? And would it matter if they did? You always get measurement error, whatever you do. An initial response is to point out that if differences are random, they will wash out as ‘noise’. Further, even if something is slightly biased, that wouldn't make it useless - a bent measuring stick might be better than nothing. The scales don’t need to be literally exactly linear and comparable to be informative. I’ve looked into this issue previously, as have some others, and at HLI we plan to do more on it: again, see this post. I’m not incredibly worried about these things. Some quick evidence: if you look at a map of global life satisfaction, it’s pretty clear there’s a shared scale in general. It would be an issue if e.g. Iraq gave themselves 9/10.
Equally, it's pretty clear that people can and do use words and numbers in a meaningful and comparable way.
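To illustrate the "washes out as noise" point, here's a toy simulation with entirely made-up numbers: each person applies a random personal offset to the 0-10 scale, yet the difference between group averages still recovers the true underlying difference.

```python
import random

random.seed(0)

# Each respondent reports their true score plus an idiosyncratic
# scale-use offset (random, mean zero).
def report(true_score):
    offset = random.gauss(0, 1)
    return true_score + offset

# Hypothetical groups: true averages of 6 and 5 on a 0-10 scale.
treated = [report(6.0) for _ in range(10_000)]
control = [report(5.0) for _ in range(10_000)]

diff = sum(treated) / len(treated) - sum(control) / len(control)
print(round(diff, 2))  # close to the true difference of 1.0
```

The point of the sketch: random differences in scale use add variance, not bias, so group comparisons survive them. Systematic (non-random) differences are the harder case, which is what the cited work on linearity tries to check.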
In your MacAskill quotation, MacAskill is attacking a straw man. When people say something is, e.g., "the best", we don't mean the best it is logically possible to be. That wouldn't be helpful. We mean something more like "the best that's actually possible", i.e. possible in the real world. That's how we make language meaningful. But yes, in another recent report, we stress that we need more work on understanding the neutral point.
Finally, the thing I think you've really missed in all this is: if we're not going to use subjective wellbeing surveys to find out how well or badly people's lives are going, what are we going to use instead? Indeed, MacAskill himself says, in the same chapter of What We Owe The Future that you quote from:
You might ask, Who am I to judge what lives are above or below neutral? The sentiment here is a good one. We should be extremely cautious when trying to figure out how good or bad others' lives are, as it's so hard to understand the experiences of people with lives very different to one's own. The answer is to rely primarily on self-reports
Hello Richard. I'm familiar with the back-and-forths between McMahan and others over the nature and plausibility of TRIA, e.g. those in Gamlund and Solberg (2019), which I assume is still the state of the art (if there's something better, I would love to know). However, I didn't want to get into the details here, as it would require the introduction of lots of conceptual machinery for very little payoff. (I even attended a whole term of seminars by Jeff McMahan on this topic when I was at Oxford.)
But seeing as you've raised it ...
As Greaves (2019) presses, there is an issue of which person-stages count:
Are the relevant time-relative interests, for instance, only those of present person-stages (“presentism”)? All actual person-stages (“actualism”)? All person-stages that will exist regardless of how one resolves one’s decision (“necessitarianism”)? All person-stages that would exist given some resolution of one’s decision (“possibilism”)? Or something else again?
Whichever choice the TRIA-advocate makes, they will inherit structurally the same issues for those as one finds for the equivalent theories in population ethics (for those, see Greaves (2017)).
The version of TRIA you are referring to is, I think, the actualist person-stage version: if so, then the view is not action-guiding (the issue of normative invariance). If you save the child, it will have those future stages, so it'll be good that it lived; if you don't save the child, it won't, so it won't be bad that it didn't. Okay, should you save the child? Well, the view doesn't tell you either way!
The actualist version can't be the one at hand, as it doesn't say that it's good (for the child) if you save it (vs the case where you don't).
I am, I think, implicitly assuming a present-stage-interest version of TRIA, as that's the one that generates the value-of-death-at-different-ages curve that is relevantly different from the deprivationist one.
Serious question: Tanae, what else would you like to see? We've already displayed the results of the different ethical views, even if we don't provide a means of editing them.
Hello Rhyss. We actually hadn't considered incorporating a suicide-reducing effect of talk therapy into our model. I think suicide rates in e.g. Uganda, one place where StrongMinds works, are pretty low - I gather they are pretty low in low-income countries in general.
Quick calculation. I came across these Danish numbers, which found that "After 10 years, the suicide rate for those who had therapy was 229 per 100,000 compared to 314 per 100,000 in the group that did not get the treatment." Very naively, then, that's a difference of 85 per 100,000 - roughly one life saved via averted suicide per 1,000 people treated, or about $150k to save a life via therapy (vs $3-5k for AMF) - so it probably wouldn't make much difference. But that is just looking at suicide. We could look at the all-cause mortality effects of treating depression (mental and physical health are often comorbid, etc.).
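For what it's worth, here's that back-of-the-envelope sum written out. The cost per person treated is an assumed round number for illustration, not an HLI figure:

```python
# Danish study figures quoted above: 10-year suicide rates per 100,000
rate_with_therapy = 229 / 100_000
rate_without_therapy = 314 / 100_000

# Lives saved per person treated, very naively
lives_saved_per_person = rate_without_therapy - rate_with_therapy  # 0.00085
people_treated_per_life = 1 / lives_saved_per_person               # ~1,176, i.e. roughly 1 in 1,000

# Assumed cost per person treated (illustrative only)
cost_per_treatment = 150  # USD
cost_per_life = cost_per_treatment * people_treated_per_life

print(round(people_treated_per_life))  # ~1,176 people treated per life saved
print(round(cost_per_life))            # ~$176k; ~$150k if you round to 1 in 1,000
```

Obviously this ignores transferability from Denmark to Uganda, counterfactual deaths, and everything else; it's only meant to show the order of magnitude.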
And I read this as you planning to continue evaluating everything in WELLBYs, which in turn I thought meant ruling out evaluating research - because it isn't clear to me how you'd evaluate something like psychedelics research using WELLBYs.
If we said we plan to evaluate projects in terms of their ability to save lives, would that rule out us evaluating something like research? I don't see how it would. You'd simply need to think about the effect that doing some research would have on the number of lives that are saved.
I feel very uncomfortable multiplying, dividing, and adding up these WELLBYs as if they are interchangeable numbers.
Yeah, it is a bit weird, but you get used to it! No weirder than using QALYs, DALYs etc. which people have been doing for 30+ years.
Re grief, here's what we say in section A.2 of the other report
We do a shallow calculation for grief in the same way we did in Donaldson et al. (2020). The best estimate we found is from Oswald and Powdthavee (2008): a panel study in the UK which finds the effect on life satisfaction due to the death of a child in the last year as being -0.72 (adjusted for a 0-10 scale). According to Clark et al. (2018), the duration of grief is ~5 years. Based on data from the UNDP, we calculate that the average household size across the beneficiary countries (excluding the recipient of the nets) is 4.03 people (row 16). Hence, an overall effect of grief per death prevented is (0.72 x 5 x 0.5) x 4.03 = 7.26 WELLBYs. However, we think this is an upper bound because it doesn’t account for the counterfactual grief averted. If you avert the death of someone, they will still die at some point in the future, and the people who love them will still grieve.
I'm not sure what you find weird about those numbers.
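If it helps, the quoted arithmetic is just the following (all figures come from the passage above; the 0.5 adjustment factor is reproduced as given in the quoted formula):

```python
# Reproducing the grief calculation quoted above
effect_per_year = 0.72   # life-satisfaction drop per bereaved person, 0-10 scale
duration_years = 5       # duration of grief (Clark et al. 2018)
adjustment = 0.5         # adjustment factor used in the quoted formula
household_size = 4.03    # average household members, excluding the net recipient

grief_wellbys = effect_per_year * duration_years * adjustment * household_size
print(round(grief_wellbys, 2))  # ~7.25, reported as 7.26 in the passage
```

And, as the quote says, this is an upper bound, since it doesn't net off the grief that would have occurred at the person's later (counterfactual) death.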
Do you mean a pure time discount rate, or something else?
I think a pure-time discount rate would actually boost the case for StrongMinds, right? Regarding cash vs therapy, the benefits from therapy happen more at the start. Regarding saving lives vs improving lives, the benefit of a saved life presumably accrues over the many extra years the person lives for.
Hello Henry. It may look like we’re just leaning on 2 RCTs, but we’re not! If you read further down in the 'cash transfers vs treating depression' section, we mention that we compared cash transfers to talk therapy on the basis of a meta-analysis of each.
The evidence base for therapy is explained in full in Section 4 of our StrongMinds cost-effectiveness analysis. We use four direct studies and a meta-analysis of 39 indirect studies (n > 38,000). You can see how much weight we give to each source of evidence in Table 2, reproduced below. To be clear, we don’t take the results from StrongMinds' own trials at face value. We basically use an average figure for their effect size, even though they find a high figure themselves.
Also, what's wrong with the self-reports? People are self-reporting how they feel. How else should we determine how people feel? Should we just ignore them and assume that we know best? Also, we're comparing self-reports to other self-reports, so it's unclear what bias we need to worry about.
Regarding the issues of comparing saving lives to improving lives, we've just written a whole report on how to think about that. We're hoping that, by bringing these difficult issues to the surface - rather than glossing over them, which is what normally happens - people can make better-informed decisions. We're very much on your side: we think people should be thinking harder about which does more good.
Thanks for the feedback Julian. I've changed the title and added a 'just' that was supposed to have been added to the final version but somehow got lost when we copied the text across. I don't know how much that mollifies you...
We really ummed and erred about the title. We concluded that it was cheeky, but not unnecessarily adversarial. Ultimately, it does encapsulate the message we want to convey: that we should compare charities by their overall impact on wellbeing, which is what we think the WELLBY captures.
I don’t think many people really understand how GiveWell compares charities, they just trust GiveWell because that’s what other people seem to do. HLI’s whole vibe is that, if we think hard about what matters, and how we measure that, we can find even more impactful opportunities. We think that’s exactly what we’ve been able to do - the, admittedly kinda lame, slogan we sometimes use is ‘doing good better by measuring good better’.
To press the point, I wouldn’t even know how to calculate the cost-effectiveness of StrongMinds on GiveWell’s framework. It has two inputs: (1) income and (2) additional years of life. Is treating depression good just because it makes you richer? Or because it helps you live longer? That really seems to miss the point. Hence, the WELLBY.
Unless GiveWell adopt the WELLBY, we will inevitably be competing with them to some extent. The question is not whether we compete - the only way we could not compete would be by shutting down - but how best to do it. Needless antagonism is obviously bad. We plan to poke and prod GiveWell in a professional and good-humoured way when we think they could do better - something we’ve been doing for several years already - and we hope they'll do the same to us. Increased competition in the charity evaluation space will be better for everyone (see GWWC’s recent post).
I am super interested in this problem, but I'm finding it a bit tricky to follow your analysis - in part because it's so spread out and lots of the information seems buried in large paragraphs. Is there any chance you could put it in a spreadsheet?
For instance, I'm still unsure what your hypothetical intervention is - providing training to clinics to better diagnose mental health issues and then prescribe SSRIs, is that right?
Hi! Thanks for engaging. Yes, the issue you raise does get discussed quite a bit, and is much worried about (by effective altruists, though not in general). I've got a working paper here where I review the theory and evidence and tentatively conclude that people probably do interpret the difference between each unit as having the same value (i.e. they interpret the scales linearly).
My colleague, Casper Kaiser, who is also an HLI trustee, has a more recent paper which shows an approximately linear relationship between reports and later behaviour.
Internally at HLI we're working on doing our own survey on this stuff too!
Why AMF is the best under deprivationism followed by TRIA (AC=5), then TRIA (AC=25), then Epicureanism.
Um, because these are literally the results these views are structured to give! To me, your question is akin to asking "why does consequentialism care more about consequences than deontology?" Sorry, maybe I've misunderstood.
Why StrongMinds is generally better than AMF (almost regardless of what your philosophical view is).
To be clear, there is no intuition here! These are the outputs of an empirical analysis. There's absolutely no reason it has to be true that the purported best life-extending intervention is better, under a range of different philosophical assumptions, than the purported best life-improving one. In a nearby possible world, AMF* could have been very many times more cost-effective on the assumptions most generous to saving lives.
It is worth noting the portfolio approach corresponding to worldview diversification applies to the allocation of resources of the community as a whole, as far as I understand.
It does? Says who? And why? Given that there have been, as far as I can tell, almost no attempts to think through the worldview diversification approach - despite its being appealed to in decision-making for many years - it strikes me as an open question how it should be understood. I see moral uncertainty as asking a first-personal question: what should I do, given my beliefs about morality?
I also wonder whether GiveWell's moral weights being majorly determined by its donors (60%) has the intention of capturing other effects besides those directly related to the death of the person.
Ah, I too used to spend many hours wondering what GiveWell really thought about things. But now I am a man, I have put away such childish things.
Well, if we think feelings matter, we should try to quantify them in a sensible way. That's what we try to do.
But I share the sentiment that you really do miss something if you try to quantify feelings by measuring something other than feelings, such as income.
Just skimmed this, but note that there is an existing literature in philosophy on whether non-consequentialist theories can or should be 'consequentialised', that is, turned into consequentialist theories, e.g. Portmore (2009), Brown (2011), Sinnott-Armstrong (2019), Schroeder (2019), Muñoz (2021). I found these in 5 minutes, so there's probably loads more.
A very general problem with the move, as Sinnott-Armstrong (2019) points out, is that if all theories can be re-presented as consequentialist, then it means little to label a theory as consequentialist. Even if successful, we would then have many different 'consequentialisms' that suggest, in practice, different things: should you kill the one and harvest their organs to save the five?
Of course, all minimally plausible versions of deontology and virtue ethics must be concerned in part with promoting the good. As John Rawls, not a consequentialist himself, famously put it in A Theory of Justice: “All ethical doctrines worth our attention take consequences into account in judging rightness. One which did not would simply be irrational, crazy.” That does not, however, make everyone a consequentialist.
I don't understand this conversation. The common take is that SBF has been caught admitting he's a Machiavellian nihilist, but then ... he did contact a journalist and chose to say this. He cannot have reasonably thought it wouldn't be shared. What was he hoping to achieve? It seems like bizarre self-sabotage. I wondered if it was some contorted attempt to make himself the fall guy rather than have EA thinking take the blame.
This seems to be a false equivalence. There's a big difference between asking "did this writer, who wrote a bit about ethics and whom this person read, influence this person?" and "did this philosophy and social movement, which focuses on ethics and which this person explicitly said inspired them, influence this person?"
I agree with you that the question
Who's at fault for FTX's wrongdoing?
has the answer FTX.
But the question
Who else is at fault for FTX's wrongdoing?
is nevertheless sensible and cannot have the answer FTX.
I, in general, share your sentiments, but I wanted to pick up on one thing (which I also said on twitter originally)
For years, the EA community has emphasised the importance of integrity, honesty, and the respect of common-sense moral constraints. [...] A clear-thinking EA should strongly oppose “ends justify the means” reasoning. I hope to write more soon about this. In the meantime, here are some links to writings produced over the years. [emphases added]
While it might sound good to say people should be honest, have integrity, and reject 'ends justify the means' reasoning, I don't see how you can expect people to do all three simultaneously: many people - including many EAs and almost certainly you, given your views on moral uncertainty - do accept that the ends sometimes justify the means. Hence, to go around saying "the ends don't justify the means" when you think that, sometimes - perhaps often - they do, smacks of dishonesty and a lack of integrity. So, I hope you do write something further to your statements above.
It seems like the better response is to accept that, in theory, the ends can sometimes justify the means - it would be right to harm one person to save *some* number more - but then say that, in practice, defrauding people of their money is really not a time when this is true.
Just skimmed this, but I notice there seems to be something inconsistent between this and the usual AI doomerism stuff. For instance, above you claim that we should be worried about value lock-in because we will be able to align AI - cf. doomerism, which says alignment won't work; equally, above you state that value drift could be prevented by 'turning the AGI off and on again' - which is, again, at odds with the doomerist claim that we can't do this. I'm unsure what to make of this tension.
For what it's worth, I'm a philosopher and I've not only offered to help GiveWell improve its moral weights, but repeatedly pressed them to do so over the years. I'm not sure why, but they've never shown any interest. I've since given up. Perhaps others will have more luck.
QALYs and DALYs are intended to measure health, and their weights are found by asking individuals to make various trade-offs. There are some subtleties between them, but nothing important for this discussion.
WELLBYs are intended to measure overall subjective wellbeing, and do so in a way that allows quality and quantity of life to be traded off. Subjective wellbeing is measured via self-reports, primarily of happiness and life satisfaction (see World Happiness Report; UK Treasury). I should emphasise that HLI did not invent either the idea of measuring feelings or the WELLBY itself - we're transferring and developing ideas from social science. How much difference various properties make to subjective wellbeing - e.g. income, relationships, employment status - is inferred from the data, rather than by asking people for their hypothetical preferences. Kahneman et al. draw an important distinction between decision utility (what people choose, aka preferences) and experienced utility (how people feel, aka happiness). The motivation for the focus on subjective wellbeing is that there is often a difference between the two (due to, e.g., mispredictions of the future) and, if there is, we should focus on the latter.
Hence, when you say
Saying 'wellbeing-adjusted life years' suggests this involves a metric where people would trade off ('adjusted') an additional year of life spent at one level of happiness versus something else.
I'm puzzled. The WELLBY is 'adjusted' just like the QALY and the DALY: you're combining a measure of quality of life with a measure of time, not just measuring time. On the QALY scale, a year at 0.5 health utility is worth half as much as a year at 1.0 health utility, because of the adjustment for quality.
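To make the 'adjustment' concrete, here is a minimal toy sketch (my own illustration, not HLI code): both metrics multiply a quality weight by a duration, rather than counting life-years alone.

```python
# Toy illustration (not HLI's implementation) of 'quality-adjustment':
# both QALYs and WELLBYs are a quality weight multiplied by a duration.

def qalys(health_utility: float, years: float) -> float:
    """QALYs: health utility (0-1 scale) multiplied by years lived."""
    return health_utility * years

def wellbys(life_satisfaction: float, years: float) -> float:
    """WELLBYs: life satisfaction (0-10 scale) multiplied by years lived."""
    return life_satisfaction * years

# A year at 0.5 health utility is worth half a year at full health...
assert qalys(0.5, 1) == 0.5 * qalys(1.0, 1)
# ...and a year at 6/10 life satisfaction yields 6 WELLBYs.
assert wellbys(6, 1) == 6
```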
Yes, thanks for this. I'll be more careful when I say this in future. Providing a $1,000 transfer costs a bit more than $1,000 in total, once you factor in the costs of delivering it.
Thanks for spotting and including this, Mo! Yes, Dan, at HLI we're trying to develop and deploy the WELLBY approach and work out how much difference it makes vs the 'business as usual' income and health approaches. We're making progress, but not as fast as we'd like!
Feel free to reach out if you'd like to chat. Michael@happierlivesinstitute.org
I agree that this methodology should be better explained and justified. In a previous blog post on Open Philanthropy's cause prioritisation framework, I noted that Open Philanthropy defers to GiveWell for what the 'moral weights' about the badness of death should be, but that GiveWell doesn't provide a rationale for its own choice (see image below).
My colleagues at HLI and I are working on a document that looks specifically at the philosophical issues in comparing improving lives to saving lives. We hope to publish it within 4 weeks.
To ask what should be the obvious question: how do you plan to compare the different causes to each other? What metrics will you use? My obligatory plug is for the Wellbeing-Adjusted Life Year, aka WELLBY, approach.
I don't think this makes sense, no, sorry. The HLI meta-analysis results are from cash transfers, which make a few individuals happier over time; they don't tell us about the average of an entire society. It's well-established that people care about their relative income, not just their absolute income. So we should be particularly wary of extrapolating from what works for individuals to what works for societies - especially where we think the benefit to the individual could come from comparisons. Hence, I don't think it's justified to start from the HLI numbers.
IIRC, in the HLI cash transfer meta-analysis, we found that cash transfers had no effect on those in nearby villages (the 'across-village' effect). In other words, there was, on average, no relative income effect. I found this hard to believe - there's such consistent evidence of a relative income effect in rich countries - but our CEA does, despite my disbelief, assume there are no negative spillovers from cash transfers. To put this in context, imagine a bunch of households down the road from you each get given $40,000. Would you expect that to have no effect on you? It wouldn't make you envious? Or excited that the same might happen to you? I'd expect the effect of income to be (almost) wholly relative in rich countries, but I wouldn't have guessed there was no relative income effect at all among the very poor. However, there wasn't much across-village data in the HLI meta-analysis, so I didn't update much. It would be good to have a bigger and better analysis of the relative income effect in very poor contexts.
On the summary: I'd have found this summary more useful if it had made the ideas in the paper simpler, so it was easier to get an intuitive grasp on what was going on. This summary has made the paper shorter, but (as far as I can recall) mostly by compressing the complexity, rather than lessening it!
On the paper itself: I still find Tarsney's argument hard to make sense of (in addition to the above, I've read the full paper itself a couple of times).
AFAICT, the setup is that the longtermist wants to show that there are things we can do now that will continually make the future better than it would have been ('persistent-difference strategies'). However, Tarsney takes the challenge to be that there are things that might happen that would stop these positive states from occurring ('exogenously nullifying events'). What does all the work is the assumption that the human population will expand really fast (the 'cubic growth model') because it has fled to the stars, while the nullifying events occur at a constant rate; if so, longtermism looks good.
I think what bothers me about the above is this: why think that we could ever identify and do something that would, in expectation, make a persistent positive difference, i.e. a difference for ever and ever? Isn't Tarsney assuming the existence of the thing he seeks to prove, i.e. 'begging the question'? I think the sceptic is entitled to respond with a puzzled frown - or an incredulous stare - about whether we can really expect to knowingly change the whole trajectory of the future; that, after all, presumably is the epistemic challenge. That challenge seems unmet.
I've perhaps misunderstood something. Happy to be corrected!
A couple of comments.
(1) I found this post quite hard to understand - it was quite jargon-heavy.
(2) I'd have appreciated it if you'd located this in what you take to be the relevant literature. I'm not sure if you're making an argument about (A) why you might want to diversify resources across various causes, even if certain in some moral view (for instance because there are diminishing marginal returns, so you fund option X up to some point and then switch to Y) or (B) why you might want to diversify because you are morally uncertain.
(3) Because of (2), I'm not sure what your objection to 'argmax' is. You say 'naive argmax' doesn't work. But isn't that a reason to do 'non-naive argmax' rather than something else? Cf. debates where people object to consequentialism by claiming it implies you ought to kill people and harvest their organs, and the consequentialist replies that that's naive and not actually what consequentialism recommends.
Fwiw, the standard approaches to moral uncertainty ('my favourite theory' and 'maximise expected choiceworthiness') provide no justification in themselves for splitting your resources. In contrast, the 'worldview diversification' approach does do this. You say that worldview diversification is ad hoc, but I think it can be justified by a non-standard approach to moral uncertainty, one I call 'internal bargaining' and have written about here.
I agree that we can and should try to be Bayesian but, if we do, we still don't get a slam-dunk result that economic growth will increase average happiness (at least, in already rich countries).
The story that often gets told to explain why the Easterlin Paradox holds refers to hedonic adaptation, social comparison, and evolution. We are very good at getting used to lots of things but we do continue to notice our status relative to others. How much material prosperity do we really need, given humans are basically naked apes who evolved to live in the savannah? We might imagine getting richer would make a difference to us, but think about the last thing you were really excited to buy, then think about how you've stopped paying attention to it. Therefore, we can explain both why money would matter in the cross-section and why it wouldn't matter in the time-series. So noticing that money makes individuals happier at a time does not, by itself, require us to conclude that economic growth would increase average happiness.
What's more, there are some reasons to worry that modernity is not good for humans. As I said in my earlier post:
Notably, Hidaka (2012) argues that depression is rising as a result of modernity, and points to the fact that “modern populations are increasingly overfed, malnourished, sedentary, sunlight-deficient, sleep-deprived, and socially-isolated”.
In other words, you can't just assume that economic growth increases happiness - that's exactly the point. If you're going to already take it as given, then there's no purpose in having the debate.
Vadim, thanks very much for writing this. I'm really pleased to see this debate moving forward. I've discussed this quite a bit with HLI colleagues over the last few days and wanted to share where we've got to so far.
TL;DRs are (1) We should probably now conclude we don't have enough data to know if the Easterlin Paradox is true; (2) even if economic growth increases wellbeing, the effects are likely so small we should be sceptical about prioritising it.
I'll break this into several smaller points.
1. Easterlin and O'Connor (and others) do not claim there is no relationship between growth rates and happiness. They claim there is one - it's what you pick up on - but that it's not statistically significant (we don't know if it's more than chance) or economically significant. The latter term is a bit vague, but the sense is that it's so small we shouldn't make increasing economic growth a priority.
2. What I take to be your key observation is that, just taking the coefficients at face value, they suggest that (A) a doubling of national income over time has about the same effect as (B) doubling income for an individual at a time. Hence, there is no paradox to explain: wealth makes nations as much happier as it does individuals. This observation is important and, I think, new.
3. One issue, however, is that we lack the statistical power to check whether this effect holds. This is a rather large update and I thank Caspar Kaiser for it: Caspar points out the relevant coefficient is 0.001, three times smaller than the standard error. To elaborate, the problem is that we're looking for a really small change over time but have only a few years of available data. By rough analogy, this is a bit like trying to detect whether climate change is happening with only 100 years of data - because the effect is so small, you'd struggle to detect it even if it's there. The effect of long-run growth might be more positive, or even negative, but we cannot tell.
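To see the power problem numerically, here is a back-of-the-envelope sketch (my own illustration, not HLI's or Kaiser's actual calculation) using the figures just mentioned: a coefficient of 0.001 with a standard error roughly three times larger.

```python
# Back-of-the-envelope sketch (illustrative, not the actual analysis) of why
# a coefficient a third the size of its standard error is undetectable.

coef = 0.001  # estimated effect of long-run growth on happiness (per the discussion)
se = 0.003    # standard error, roughly three times the coefficient

t_stat = coef / se               # ~0.33, far below any conventional threshold
detectable = abs(t_stat) > 1.96  # conventional 5% two-sided cutoff

# If the standard error shrinks like 1/sqrt(T) with T years of data, you'd
# need roughly (1.96 * se / coef)^2 times the current data span before the
# estimate could reach statistical significance - about 35x here.
years_multiple = (1.96 * se / coef) ** 2

print(round(t_stat, 2), detectable, round(years_multiple))  # 0.33 False 35
```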
4. As far as I know, no one has raised this issue regarding the Easterlin Paradox either: namely, because the (cross-sectional) relationship between income and happiness is so small, would we actually have enough data to prove or disprove the effects over time? I think this merits further investigation and it would be worth calculating when there would be enough data to tell.
5. One thing that's worth (re)emphasising is how small the relationship between income and happiness is. If a doubling of income increases subjective wellbeing by 0.1 on a 0-10 scale - which is what HLI's cash transfer numbers suggest - then you need 10 doublings to go up 1 point. However, that means you need to be over 1,000 times richer. If we're thinking on a global scale, extrapolating this far starts to look weird: what would the world be like if global GDP were 1,000x what it is right now? Is that even possible? Relatedly, we should worry about how reasonable it is to take the 2-3 decades of data we have on economic growth and extrapolate forward 500-1,000 years.
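The arithmetic behind that "over 1,000 times richer" claim can be sketched in a couple of lines (the 0.1-points-per-doubling figure is taken from the text above; the variable names are my own):

```python
# Toy arithmetic behind point 5. The 0.1-points-per-doubling figure comes
# from the discussion above; everything else follows from compounding.

points_per_doubling = 0.1  # SWB gain (0-10 scale) per doubling of income
doublings_needed = round(1 / points_per_doubling)  # doublings for a 1-point gain
income_multiple = 2 ** doublings_needed            # how much richer that makes you

print(doublings_needed, income_multiple)  # 10 1024
```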
6. On the basis of 5, you can see why Easterlin and others have claimed the relationship is economically insignificant: in short, a little bit more economic growth is barely going to move the collective needle, particularly if you're thinking about improving lives over (just) the next 50-70 years (rather than the longterm).
7. A potential response to the claim that it's economically insignificant is the one Vadim makes: actually, a small change to a lot of people is a big change, and we should (in principle) be prepared to pay quite a lot to make it happen.
8. I think the correct response to 7. is to agree to the principle that if we could (say) raise economic growth by 1 percentage point for 30 years, that would be quite big, but then to point out there doesn't seem to be a large magic wand we can wave that will make this happen. More generally, I'm not a fan of claims along the lines that "we should be excited about unspecified action X, even if it costs an arbitrarily large sum of money Y, because it's a great deal even if it only has an arbitrarily small chance of success Z". I don't take these seriously until more evidence is provided.
9. Moving on to Vadim's second and third claims: it's not really the case that small differences in methodology make big differences to the results. Caspar Kaiser also pointed out that all of these estimates are very imprecise anyway, so the particular results from adding or removing one element are basically luck.
10. Finally, on the comparisons of increasing GDP vs other things: we really want to get into the details of cost-effectiveness analyses and the likelihood of achieving particular policy goals, rather than just looking in crude terms at how big various changes would be.
Yeah, imo the most natural thing is to make "global priorities research" a discipline. Maybe there should also be other (sub)disciplines that are more cause-specific, e.g. related to (human) wellbeing, existential risks, etc.
FWIW, I'm finding the forum less useful and enjoyable than before and I'm less motivated to contribute. I think the total number of posts has gone up, whereas the number of posts I want to read is about the same or a bit lower.
When I log on I see lots of (1) recycled topics, that is, things that have been discussed many times before (admittedly new users can't really be blamed for this, but still) and (2) hot(ter) takes, where people are sharing something without having really thought or researched it. Clearly, there is some overlap between (1) and (2).
Very much agree with this. The fact that people seem to interpret "mental health is a problem" as "mental health for effective altruists is a problem" illustrates my point about the odd attitude.
Great question, thanks for this! Part of the motivation for global desire theories is something like Parfit's addiction case, which I mention in section 3 of the paper and will now quote at length
Parfit illustrates this with his famous case of Addiction:
I shall inject you with an addictive drug. From now on, you will wake each morning with an extremely strong desire to have another injection of this drug. Having this desire will be in itself neither pleasant nor painful, but if the desire is not fulfilled within an hour it will then become very painful. This is no cause for concern, since I shall give you ample supplies of this drug. Every morning, you will be able at once to fulfil this desire. The injection, and its after‐effects, would also be neither pleasant nor painful. You will spend the rest of your days as you do now.
Parfit points out that on a summative desire theory—on which all your desires count and how your life goes overall is the product of the extent to which each desire is fulfilled and the intensity of each desire—your life goes better in Addiction. But it is hard to believe one’s life would go better in the Addiction case.
Parfit draws a distinction between local and global desires, where a desire is “global if it is about some part of one’s life considered as a whole, or is about one’s whole life”. A global desire theory (GDT) counts only global desires. On this theory, we can say being addicted is worse for us: when we think about how our lives go overall, we do not want to become addicted.
The appeal of a global theory is that, in some sense, you get to make a cognitive choice about which desires count. If you weren't able to choose which desires count, then Addiction would be better for you (once you were actually addicted, anyway).
You might think that getting addicted really is good for me, in which case you've presumably abandoned the global account in favour of the summative one. Which is fine, but doesn't take away from the fact that automaximisation is still a problem for the global view.