Comment by gregory_lewis on EA Mental Health Survey: Results and Analysis. · 2019-06-13T08:10:19.087Z · score: 19 (9 votes) · EA · GW

Thanks for this. A statistical note:

As best as I can tell, linear regressions are used throughout. This is not an appropriate method when the dependent variable is binary (e.g. 'mentally ill or not'). Logistic regression should be used instead - and may find some further associations, given linear methods will intuitively be underpowered.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-06-04T04:01:17.453Z · score: 6 (4 votes) · EA · GW

1) Generally my probability mass is skewed to the lower ends of the intervals I'm noting. Thus the 70% band is more 'with multiple caveats' rather than just one (e.g. a bit like - as Scott describes it - ketamine: only really useful for depression, and even then generally modest effects even as second-line therapy). Likewise the 3% is mostly 'around SSRIs and maybe slightly better', with subpercentile mass on the dramatic breakthrough I think you have in mind.

2) Re. updates: There wasn't a huge update on reading the studies (not that I claim to have examined them closely), because I was at least dimly aware since medical school of psychedelics having some promise in mental health.

Although this was before I appreciated the importance of being quantitative, I imagine I would have given higher estimates back then, with the difference mainly accounted for by my appreciation of how treacherous replication has proven in both medicine and psychology.

Seeing that at least some of the studies were conducted reasonably given their limitations has attenuated this hit, but I had mostly priced this in, as it was what I expected to see (i.e. I wasn't expecting to find the body of psychedelic work was obviously junk science etc.).

3) Aside: GiveWell's view doesn't appear to be "1-2% that deworming effects are real", but:

The “1-2% chance” doesn’t mean that we think that there’s a 98-99% chance that deworming programs have no effect at all, but that we think it’s appropriate to use a 1-2% multiplier compared to the impact found in the original trials – this could be thought of as assigning some chance that deworming programs have no impact, and some chance that the impact exists but will be smaller than was measured in those trials.

I.e. their central estimate prices in a range across 'no effect', 'modest effect', and 'as good as the index study advertised', but weighted towards the lower end.

One could argue whether, if applied to psychedelics, the discount factor they suggest should be higher or lower than this (multiple studies would probably push to a more generous discount factor, but an emphasis on quality might point to a more pessimistic one, as the Kremer index study has, I think, a stronger methodology - and a lot more vetting - than the work noted here). But even something like a discount of ~0.1 would make a lot of the results noted above considerably less exciting (e.g. the Carhart-Harris effect size drops to d~0.3, which is good but puts it back into the ranges seen with existing interventions like CBT).
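The back-of-envelope here is just a multiplier applied to the headline effect size; the d ≈ 3.0 input for the Carhart-Harris result is inferred from the surrounding text (a ~0.1 discount yielding d ≈ 0.3), so treat it as illustrative:

```python
# GiveWell-style discount: scale the reported effect size by a sceptical
# multiplier covering non-replication and regression to the mean.
reported_d = 3.0     # headline effect size (inferred from the text above)
discount = 0.1       # sceptical replication/regression multiplier
expected_d = reported_d * discount
print(f"expected effect size: d ~ {expected_d:.1f}")
```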

VoI is distinct from this best guess (analogously, a further deworming RCT to reduce uncertainty may have higher or lower value than 'exploiting' based on current uncertainty), but I'd return to my prior remarks to suggest the likelihood of ending up with something '(roughly) as good as initial results advertise' is low/negligible enough not to make it a good EA buy.

4) Further aside: Given the OP was about psychedelics generally (inc advocacy and research) rather than the particular points on whether confirmatory research was a good idea, I'd take other (counter-) arguments addressed more generally than this to be in scope.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-06-03T09:12:13.189Z · score: 4 (2 votes) · EA · GW

Not sure how helpful percentages are given effect sizes and interventions are varied.

Per my OP, I'd benchmark getting results similar to or better than SSRIs (i.e. modestly effective for a few mental illnesses) as the top 3% ish of what I'd expect research to confirm. I'd give 25% for essentially nothing replicating and it going the way of power poses, priming, or other psych dead ends Scott mentions.

The remaining 70% is smeared across much less impressive results (and worth noting SSRIs are hardly a miracle cure): maybe sort-of helpful for one condition, maybe helpful but only for a subset of motivated individuals, etc. etc.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-06-03T06:12:13.832Z · score: 2 (1 votes) · EA · GW

Gregory didn't close out his argument except to say that he thinks EA shouldn't fund most kinds of research, including confirmatory research about psychedelics. (In his initial post, he pointed to some reasons why he thinks the results of the initial studies won't hold up under further scrutiny, but he doesn't think funding more scrutiny should be an EA priority, and I don't follow why not.)

My views are the same as Carl's, hence I didn't make a further reply. (i.e. Low enough base rates imply the yield on chasing replications does not reach the - high - bar for EA).

Comment by gregory_lewis on A Framework for Thinking about the EA Labor Market · 2019-05-16T12:02:05.145Z · score: 10 (3 votes) · EA · GW

To be clear: I don't merely think suppressing pay is a suboptimal way to foster a strong culture. I think driving salaries down is sign-negative for this goal.

Comment by gregory_lewis on A Framework for Thinking about the EA Labor Market · 2019-05-16T12:01:26.599Z · score: 6 (2 votes) · EA · GW

Hello Jon,

I agree with you that "if you have low/suppressed pay, you harm your recruitment". I think we disagree on how prevalent the antecedent is: I think the 80k stat you cite elsewhere is out of date - although I think some orgs still are paying in a fairly flat band around 'entry level graduate salary', I think others do pay more (whether enough to match market isn't clear, but I think the shortfall is less stark than it used to be).

Comment by gregory_lewis on A Framework for Thinking about the EA Labor Market · 2019-05-16T11:46:28.642Z · score: 26 (9 votes) · EA · GW

The latter seems substantially better than the former by my lights (well, substituting 'across the board' for 'let the market set prices'.)

The standard econ-101 story for this is (in caricature) that markets tend to efficiently allocate scarce resources, and you generally make things worse overall if you try to meddle in them (although you can anoint particular beneficiaries).

The mix of strategies to soft-suppress (i.e. short of frank collusion/oligopsony) salaries below market rate will probably be worse than not doing so - the usual predictions are a labour supply shortfall, with the most able applicants preferentially selecting themselves out (if I know I'll only realistically get $X at an EA org but I can get $1.5X in the private sector, that's a disincentive, and one that costs the EA org if they value my marginal labour more than $X), and probably misallocation issues (bidding up wages gives a mechanism for the highest performers to match into the highest-performing orgs).

It's also worth stressing that "Have a maximum, but ask the applicant to make a first suggestion; don't disclose wages; discourage employees sharing their salary with other employees" isn't an EA innovation - these are pretty standard practices in salary negotiation on the employer side, and they conspire to undermine the employee's bargaining position. Canny employees confronted with 'how much do you need?' may play along with the charade ("I need $10k more for leisure and holidays which are - promise! - strictly necessary to ensure my peak productivity!") or roll their eyes at the conceit ("So I need $X in the sense that you need to offer $X or I won't take the job").


  • 'Paying by necessity' probably gets into legal trouble in various jurisdictions. Paying Alice more than (similarly situated) Bob because (e.g.) she has kids is unlikely to fly. (More generally, the perverse incentives on taking 'pay by necessity' at face value are left as an exercise to the reader).
  • Heavily obfuscating compensation disadvantages poorer/less experienced/less willing negotiators. I think I remember some data suggesting there are demographic trends in these factors - insofar as it does, it seems likely to lead to unjust bias in compensation.
  • Typical sentiment is that employees rather than employers are the weaker party, at greater risk of exploitative or coercive practice. I don't understand why in EA contexts we are eager to endorse approaches that systematically benefit the latter at the expense of the former.
  • Not trying to push down the ceiling doesn't mean you have to elevate the floor. People can still offer their services 'at a discount' if they want to. Although this is still a bad idea, one could always pay at market and hope employees give their 'unnecessary' money back to you.
  • I'm a big fan of having some separation between personal and professional life (and I think a lot of EA errs in eliding the two too much). Insofar as these aren't identical - insofar as "Greg the human" isn't "Greg the agent of his EA employer's will" - the interests of (EA) employer and (EA) employee won't perfectly converge: my holiday to Rome or whatever isn't a 'business expense'; the most marginal activity of my org isn't likely to be the thing that I consider the best use of my funds. Better to accept this (and strike mutually beneficial deals) rather than pretending otherwise.
Comment by gregory_lewis on A Framework for Thinking about the EA Labor Market · 2019-05-14T13:27:40.980Z · score: 19 (6 votes) · EA · GW

[Obvious conflicts of interest given I work for an EA org - that said, I have argued similar points before that was the case]


I'm also extremely sceptical of (in caricature) 'if people aren't willing to take a pay-cut, how do we know they really care?' reasoning - as you say, one doesn't see many for-profit companies use the strategy of 'We need people who believe in our mission, so we're going to offer market -20% to get a stronger staff cohort'. In addition to the points made (explicitly) in the OP on this, there's an adverse selection worry: low salaries may filter for dedication, but also for lower performers without better 'exit options'.

(Although I endorse it anyway, I have related 'EA exceptionalism' worries about the emphasis on mission alignment etc. Many non-profits (and most for profits) don't or can't rely on being staffed with people who passionately invest in their brand, and yet can be very successful.)

That said, my impression is the EA community is generally learning this lesson. Although benchmarking is hard, most orgs that can afford to now offer competitive(ish) compensation. It is worth noting the reverse argument: if lots of EA roles are highly over-subscribed, this doesn't seem a reason to raise salaries for at least these roles - it might suggest EA orgs can afford to drop them(!)

A lot has been written trying to explain why EA orgs (including ones with a lot of resources) say they struggle to find the right people, whilst a lot of EA people say they really struggle to find work for an EA org. What I think may explain this mismatch is that the EA community can 'supply' lots of generally able and motivated people, whilst EA org demand skews more to those with particular specialised skills. Thus jobs looking for able generalists have lots of applicants, yet the 'person spec' for other desired positions has few or zero appointable candidates.

This doesn't give a clear 'upshot' in terms of setting compensation: it's possible that orgs which set a premium on chasing up the tail of their best generalist applicants may find increasing salary still pays dividends even when they have more than enough appointable candidates to choose from now; supply of specialised people might depend sufficiently on non-monetary considerations to be effectively inelastic.

My overall impression agrees with the OP. It's probably more economically efficient to set compensation at or around market, rather than approximating this with a mix of laggy and hard-to-reallocate transfer contributions of underpaid labour. Insofar as less resource-rich orgs cannot afford to do this, they are fortunate that there are a lot of able people who are also willing to make de facto donations of their earning power to them. Yet this should be recognised as sub-optimal, rather than being lionised as a virtue.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-05-11T17:44:24.201Z · score: 18 (9 votes) · EA · GW

The latter. EA shouldn't fund most research, but whether it is confirmatory or not is irrelevant. Psychedelics shouldn't make the cut if we expect (as I argue above) a lot of failure to replicate and regression, and the true effect to be unexceptional in the context of existing mental health treatment.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-05-11T16:22:20.027Z · score: 15 (8 votes) · EA · GW

It does, but although that's enough to make it worthwhile on the margin of existing medical research, that is not enough to make it a priority for the EA community.

Comment by gregory_lewis on Cash prizes for the best arguments against psychedelics being an EA cause area · 2019-05-10T21:49:32.163Z · score: 58 (35 votes) · EA · GW

[Own views]

0) I don't know where the bar for calling something a 'cause area' or 'EA interest' should be, but I think this bar should be above (e.g.) 'promising new drug treatment for bipolar disorder', even though this is unequivocally a good thing. Wherever exactly this bar falls (I don't think it needs to be 'as promising as global health'), I don't think psychedelics meet it.

1) My scepticism on the mental health benefits of psychedelics mainly relies on second-order causes for concern, namely:

1.1) There's some weak wisdom of nature prior that blasting one of your neurotransmitter pathways for a short period is unlikely to be helpful. This objection is pretty weak, given existing psychiatric drugs are similarly crude (although one of their advantages by the lights of this consideration is they generally didn't come to human attention by previous recreational use).

1.2) I get more sceptical as the number of (fairly independent) 'upsides' of a proposed intervention increases. The OP notes psychedelics could help with anxiety and depression and OCD and addiction and PTSD, which looks remarkably wide-ranging and gives suspicion of a 'cure looking for a disease'. (That they are often mooted as having still other benefits on people without mental health issues such as improving creativity and empathy deepens my suspicion). Likewise, a cause that is proposed to be promising on long-termism and its negation pings suspicious convergence worries.

1.3) (Owed to Scott Alexander's recent post). The psychedelic literature mainly comprises small studies generally conducted by 'true believers' in psychedelics and often (but not always) on self-selected and motivated participants. This seems well within the territory of scientific work vulnerable to replication crises.

1.4) Thus my impression is that although I wouldn't be shocked if psychedelics are somewhat beneficial, I'd expect them to regress at least as far down as the efficacies observed in existing psychopharmacology, probably worse, and plausibly to zero. Adding to the armamentarium of therapy for mental illness (in expectation) is worthwhile, but not enough for a big slice of EA opinion: it being a promising candidate for further exploration relies on 'neartermism' and (conditional on this) the belief that mental health is similarly promising to standard global health interventions on NTDs etc.

2) On the 'longtermism' side of the argument, I agree it would be good - and good enough to be an important 'cause' - if there were ways of further enhancing human capital. (I bracket here the proposed mental health benefits, as my scepticism above applies even more strongly to the case that psychedelics are promising based on their benefits to EA community members' mental health alone).

My impression is most of the story for 'how do some people perform so well?' will be a mix of traits/'unmodifiable' factors (e.g. intelligence, personality dispositions, propitious upbringing); very boring advice (e.g. 'Sleep enough', 'exercise regularly'); and happenstance/good fortune. I'd guess there will be some residual variance left on the table after these have taken the lion's share, and these scraps would be important to take. Yet I suspect a lot of this will be pretty idiographic/reducible to boring advice (e.g. anecdotally, novelists have their own peculiar habits for writing: IIRC Nabokov used index cards, Pullman has a writing shed, Gaiman a 'novel writing pen' - maybe 'having a ritual for dedicated work' matters, but which one is a matter of taste).

The evidence for psychedelic 'enhancement' is even thinner than psychedelic therapy, and labours under a more adverse prior. I agree the case for psychedelics here is comparable to CFAR/Paradigm/rationality training, but I would rule both out, not in.

3) I agree with agdfoster that psychedelics have reputational costs. This 'bad rap' looks unfair to me (notwithstanding the above, I'm confident that an 'MDMA habit' is much better for you than an alcohol, smoking, extreme sports, or social media one, none of which attract similar opprobrium), but it is decision-relevant all the same. If the upside was big enough, these costs would be worth paying, but I don't think they are.

Comment by gregory_lewis on Aging research and population ethics · 2019-04-29T21:20:53.421Z · score: 2 (3 votes) · EA · GW

Good post. Some further considerations on the total view side of things (mostly culled from a very old working paper I have here where I suggest life extension may be bad - but N.B. besides its age and a few errors, my overall view is now tentatively pro rather than tentatively con).

0. LEV or not seems to be a distraction. The population ethics concerns don't really change much either way if the offer on the table is LEV or merely 'L' (e.g. there's a new drug which guarantees lifespan to 150 but no more).

1. As the contours of your argument imply, I think the core ethical issue on totalist-y lights would be whether there is a 'packaging constraint' on how one should allocate available lifetime to persons (e.g. better one 800-year life versus ten 80-year lives, or vice versa), versus a broad cloud of empirical considerations and second-order effects (although I think these probably dominate the calculus).

2. I don't buy the story that life extension can be a free lunch. If it is better to 'package' lifespan into 80-year chunks versus millennia-sized chunks, whether or not to pursue this will have great impact across the future, so any initial 'free benefit' will probably be outweighed by ongoing misallocation across the future. (I suppose the story could be 'LEV, even if bad, is inevitable, and doing it sooner at least gets a bigger free lunch' - but it seems in such a world there are bigger-scale problems to target.)

3. On pure aggregation, the key seems to be whether lifespan has accelerating or diminishing marginal returns. As you say, intuitive survey by time-tradeoff gives conflicting recommendations: most would be averse to gambles like "Would you rather 5% chance of 2000 years (and 95% of dying right now) versus keeping your life expectancy?", yet we'd also be averse to 'Logan's run' (or Logan's sprint) cases of splitting 80 year lives into 16 5-year lives (or, indeed, millions of 2 minute ones).
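The aggregation question here can be made concrete with a toy model. Suppose (purely for illustration) that a life of length L is worth L**a; then whether to package 800 years as one life or ten is settled by whether a is above or below 1:

```python
# Toy model of the 'packaging constraint': one 800-year life vs ten
# 80-year lives, if a life of length L is worth L**a. The power-law
# form and the exponents chosen are purely illustrative.
def total_value(years_per_life, n_lives, a):
    return n_lives * years_per_life ** a

for a in (0.5, 1.0, 1.5):   # diminishing, linear, accelerating returns
    one_long = total_value(800, 1, a)
    ten_short = total_value(80, 10, a)
    if one_long > ten_short:
        verdict = "one long life"
    elif one_long < ten_short:
        verdict = "ten short lives"
    else:
        verdict = "indifferent"
    print(f"a={a}: 1x800y={one_long:.0f}, 10x80y={ten_short:.0f} -> {verdict}")
```

Diminishing returns (a < 1) favour many short lives, accelerating returns (a > 1) favour one long life, and pure linearity leaves the totalist indifferent, which is why the intuitive survey evidence pulling in both directions matters.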

3.1 One natural reply to defuse 'Logan's run'-type reductios is to suggest it is confounded with human development. One might say our childhood and adolescence is in part an investment to enjoy the greater goods of adulthood. So perhaps we would take lifespan to have accelerating returns up to a point commensurate with this, but maybe not for the interval from 20-ish to infinity (so if the returns diminish, there will usually be a break-even point whereby the 'investment cost' is matched by the diminishing-returns loss, making the ideal tiling of lives across time not 'as long as possible').

(We should probably be pretty surprised if the morally 'optimal' lifespan just-so-happened to match our actual lifespan which emerged from a mix of contingent biological facts. Of course, it could be the 'optimal' lifespan is shorter, not larger, than the one we can typically expect.)

3.2 There's a natural consideration for diminishing returns on the idea that people may naturally prioritise the best things to do with their life first, and so extending their lives gives them opportunity (borrowing a bit from Bernard Williams) to engage in further projects which, although good, are not as good as those they prioritised before then. So packaging into smaller chunks offers the ability for the population over time to complete more 'most valuable' projects.

3.3 On the other hand, there's a murkier issue about whether a much longer life 'unlocks' opportunities which are better than those shorter lives can access. In the same way 'living each day as your last' when taken literally is terrible advice (many things people want to do take much longer than a day to accomplish), perhaps (say) observing changes over cosmological or geological timescales offers much better experiences than what one can do in decades. This looks fairly speculative/weak to me.

What seems more persuasive on the 'increasing marginal returns' side is the idea of positive interaction terms between experience moments. Some good things could be even better if they resonate with other previous moments, and so a longer prior life seems to provide further opportunity for this (e.g. insofar as 'watching the grandchildren grow up' is joyful, a longer life better ensures this occurs, among many other examples).

4 Egalitarianism, 'justicy'-considerations, or prioritarianism will generally push towards packaging in shorter blocks rather than longer ones: the one which best gets around tricky different number cases is prioritarianism. Insofar as you are sympathetic to these views, these will seem to push against life extension.

4.1: I'm pretty sympathetic to Parfitian/deflationary accounts of personal identity, which would take the wind out of the sails of this line of argument (as there isn't much remaining sense of a given person being better or worse off than another, nor of an index to which there's a 'you' that accrues person-moments which may have diminishing returns). Such a view also takes the wind out of the sails of a pro-life-extension case (as we should be relatively indifferent to whether future moments are linked to our present ones or otherwise), although there might be second-order considerations (beyond those mentioned above, if most experience moments simply prefer to be linked up to more future ones, this is a pro tanto consideration in favour).

5 It seems the second-order impacts are best distinguished from the 'pure axiological' issue above. It could be that very long lives are an imperfect allocation, but still best all things considered if (for example) they allow people to develop much greater skill and ability and (say) produce works of even greater artistic genius. A challenge to trying to disentangle this is that plausible scenarios which offer (radical) life extension likely involve other radical changes to the human condition: maybe we can also enhance ourselves in various ways too (and maybe these aren't separable, so maybe the moral cost we pay for improperly long lives is a price worth paying for the other benefits).

5.1 If we separate these and imagine some naive 'eternal (or extended) youth' scenario (e.g. people essentially like themselves, with a period of morbidity similar to what we'd expect now, but their period of excellent health extended by a long time), I'd agree this leans positive. Beyond skill building benefits, I'd speculate longer lives probably prompt less short-sightedness in policy and decision making.

Comment by gregory_lewis on Legal psychedelic retreats launching in Jamaica · 2019-04-19T01:42:09.615Z · score: 16 (11 votes) · EA · GW

My impression agrees with Issa's: in EA, psychedelic use seems to go along with a cluster of bad epistemic practice (e.g. pseudoscience, neurobabble, 'enlightenment', obscurantism).

This trend is a weak one, with many exceptions; I also don't know about direction of causation. Yet this is enough to make me recommend that taking psychedelics to 'make one a better EA' is very ill-advised.

Comment by gregory_lewis on Reducing EA job search waste · 2019-04-17T07:47:16.439Z · score: 29 (12 votes) · EA · GW

Although private industry and EA organisations may have different incentives, a lot of law for the former will apply to the latter. Per Khorton, demanding the right to publish successful applicants' CVs would probably be illegal in many places, and some 'coordination' between EA orgs (e.g. a draft system) seems likely to run afoul of competition law.


  • The lowest hanging fruit here (which seems a like a good idea) is to give measures of applicant:place ratios for calibration purposes.
  • Independent of legal worries, one probably doesn't need to look at resumes to gauge applicant pool - most orgs have team pages, and so one can look at bios.
  • More extensive feedback to unsuccessful applicants is good, but it is easier said than done, as explained by Kelsey Piper here.
  • I don't think EA employers are 'accountable to the community' for how onerous their hiring process is, provided they make reasonable efforts to inform potential applicants before they apply. If they've done this, then I'd default to leaving it to market participants to make decisions in their best interest.
Comment by gregory_lewis on Is visiting North Korea effective? · 2019-04-05T09:45:37.283Z · score: 12 (4 votes) · EA · GW

'Getting experience in North Korea' is perhaps one of the worst things you can do if you want to work as a diplomat (or in government more broadly).

Taking US diplomats in particular (although this generalises well to other government roles, and to other countries): people in these roles - ditto ~half the federal government - require a security clearance. Going on your own initiative to a hostile foreign power (circumventing State Department attempts, on safety grounds, to prevent US citizens going without its express dispensation, whilst you are at it) concisely demonstrates you are a giant security risk.

This impression gets little better (and plausibly even worse) if the explanation you offer for your visit is a (probably misguided) attempt to conduct tacit economic warfare against the NK government.

Comment by gregory_lewis on Apology · 2019-03-23T19:02:22.769Z · score: 37 (25 votes) · EA · GW

I don't see that as surprising/concerning. Suppose someone approaches you with (e.g.) "Several people have expressed concerns about your behaviour - they swore us to secrecy about the details, but they seemed serious and credible to us (so much so we intend to take these actions)."

It looks pretty reasonable, if you trust their judgement, to apologise for this even if you lack precise knowledge of what the events in question are.

(Aside: I think having a mechanism which can work in confidence between the relevant parties is valuable for these sorts of difficult situations, and this can get undermined if lots of people start probing for more information and offering commentary.

This doesn't mean this should never be discussed: these sorts of mechanisms can go wrong, and should be challenged if they do (I can think of an example where a serious failing would not have come to light if the initial 'behind closed doors' decision was respected). Yet this seems better done by people who are directly affected by and know the issue in question.)

Comment by gregory_lewis on EA jobs provide scarce non-monetary goods · 2019-03-21T06:46:34.242Z · score: 8 (4 votes) · EA · GW

Right, I (mis?)took the OP to be arguing "reducing salaries wouldn't have an effect on labour supply, because it is price inelastic", instead of "reducing salaries wouldn't have enough of an effect to qualitatively change oversupply".


I'd expect a reduction but not a drastic one. Like I'd predict Open Phil's applicant pool to drop to 500-600 from 800 if they cut starting salary by $10k-$15k.

This roughly cashes out to an income elasticity of labour (/applicant) supply of 1-2 (i.e. you reduce applicant supply by ~20% by reducing income ~10%). Although a crisp comparison is hard to find, in the labour market you generally see figures <1, so this expectation slightly goes against the OP, given it suggests EA applicants are more compensation-sensitive than typical.
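The implied elasticity is just the percentage drop in applicants over the percentage drop in pay. It is sensitive to the assumed baseline salary, which the comment doesn't state - the ~$100k figure below is a hypothetical placeholder, and the applicant/cut figures are midpoints of the ranges quoted above:

```python
# Rough elasticity check. Baseline salary is an assumption for
# illustration; applicant numbers and pay cut come from the comment.
baseline_salary = 100_000   # hypothetical starting salary
pay_cut = 12_500            # midpoint of the $10k-$15k cut
applicants_before, applicants_after = 800, 550   # midpoint of 500-600

pct_income_drop = pay_cut / baseline_salary
pct_applicant_drop = (applicants_before - applicants_after) / applicants_before
elasticity = pct_applicant_drop / pct_income_drop
print(f"implied elasticity ~ {elasticity:.1f}")
```

With these illustrative inputs the figure lands somewhat above the 1-2 range quoted; a higher assumed baseline salary (hence a larger percentage pay cut) pulls it back down.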

Comment by gregory_lewis on EA jobs provide scarce non-monetary goods · 2019-03-20T22:55:07.058Z · score: 27 (12 votes) · EA · GW

(Obvious CoI/own views, but in my defence I've been arguing along these lines long before I had - or expected to have - an EA job.)

I agree 'EA jobs' provide substantial non monetary goods, and that 'supply' of willing applicants will likely outstrip available positions in 'EA jobs'. Yet that doesn't mean 'supply' of potential EA employees is (mostly) inelastic to compensation.

In principle, money is handy to all manner of interests one may have, including altruistic ones. Insofar as folks are not purely motivated by altruistic ends (and in such a way they're indifferent to having more money to give away themselves) you'd expect them to be salary-sensitive. I aver basically everyone in EA is therefore (substantially) salary-sensitive.

In practice, I know of cases (including myself) where compensation played a role in deciding to change job, quit, not apply etc. I also recall on the forum remarks from people running orgs which cannot compensate as generously as others that this hurts recruitment.

So I'm pretty sure if you dropped salaries you would reduce the number of eager applicants (albeit perhaps with greater inelasticity than many other industries). As (I think) you imply, this would be a bad idea: from the point of view of an org, controlling the overall 'supply' of applicants shouldn't be their priority (rather, they set salaries as necessary to attract the most cost-effective employees). From the wider community's point of view, you'd want to avoid 'EA underemployment' in other ways than pushing to distort the labour market.

Comment by gregory_lewis on EA Hotel with free accommodation and board for two years · 2019-03-20T02:23:17.829Z · score: 2 (1 votes) · EA · GW

The inconvenience I had in mind is not in your list, and comprises things in the area of, "Prefer to keep the diet I'm already accustomed to", "Prefer omnivorous diets on taste etc. grounds to vegan ones", and so on. I was thinking of an EA who is omnivorous and feels little/no compunction about eating meat (either because they aren't 'on board' with the moral motivation for animal causes in general, or doesn't find the arguments for veganism persuasive in particular). I think switching to a vegan diet isn't best described as a minor inconvenience for people like these.

But to be clear, this doesn't entail any moral obligation whatsoever on the hotel to serve meat - it's not like they are forcing omnivorous guests to be vegan, just not cooking them free (non-vegan) food. If a vegan offers to let me stay at their house a) for free, b) offers vegan food for free too, c) welcomes me, if I'm not a fan of vegan food, to get my own food to cook at their house whenever I like - which seems basically the counterfactual scenario if I wasn't staying with them in the first place - and d) explains all of this before I come, they've been supererogatory in accommodating me, and it would be absurd for me to say they've fallen short in not serving me free omnivorous food which they morally object to.

Yet insofar as 'free food' is a selling point of the hotel, 'free vegan food' may not be so enticing to omnivorous guests. Obviously the offer is still generous by itself, leave alone combined with free accommodation, but one could imagine it making a difference on the margin to omnivores (especially if they are cost-sensitive).

Thus there's a trade-off between these people and vegans who would be put off if the hotel served meat itself (even if vegan options were also provided). It's plausible to me the best option here (leaving aside any other considerations) is the more 'vegan-friendly' policy. But that isn't because the trade-off is in fact illusory, with the 'vegan-friendly' policy having only minimal/minor costs to omnivores after all.

[Empirically though, this doesn't seem to amount to all that much given (I understand) the hotel hasn't been struggling for guests.]

Comment by gregory_lewis on Sharing my experience on the EA forum · 2019-03-20T01:22:31.567Z · score: 7 (5 votes) · EA · GW

Beyond the 'silent downvote -> anon feedback' substitution (good, even if 'public comment' is even better), there could also be a 'public comment -> anon feedback' one (less good).

That said, I'm in favour of an anon feedback option: I see karma mostly serving as a barometer of community sentiment (so I'm chary of disincentivizing downvotes as this probably impairs resolution). It isn't a good way of providing feedback to the author (a vote is only a bit or two of information). Text is better - although for me, the main reasons I don't 'explain my downvotes' are mostly time, but occasionally social considerations. An anon option at least removes the latter disincentive.

Comment by gregory_lewis on The Importance of Truth-Oriented Discussions in EA · 2019-03-16T02:44:03.880Z · score: 17 (9 votes) · EA · GW

I think I get the idea:

Suppose (heaven forbid) a close relative has cancer, and there's a new therapy which fractionally improves survival. The NHS doesn't provide it on cost-effectiveness grounds. If you look around and see the NHS often provides treatment it previously ruled out if enough public sympathy can be aroused, you might be inclined to try to do the same. If instead you see it is pretty steadfast ("We base our allocation on ethical principles, and only change this when we find we've made a mistake in applying them"), you might not be - or at least you'd change your strategy to show the decision the NHS has made for your relative is unjust rather than merely unpopular.

None of this requires you to be acting in bad faith, looking for ways to extort the government - you're just trying to do everything you can for a loved one (the motivation of pharmaceutical companies that sponsor patient advocacy groups may be less unalloyed). Yet (ideally) the government wants to encourage protest that highlights a policy mistake, and discourage protest mounted when it has done the right thing for its population but against the interests of a powerful/photogenic/popular constituency. 'Caving in' to the latter type pushes in the wrong direction.

(That said, back in EA-land, I think a lot of things that are 'PR risks' for EA look bad because they are bad (e.g. in fact mistaken, morally abhorrent, etc.), and so although PR considerations aren't sufficient on their own to discourage something, they can further augment concern.)

Risk Communication Strategies for the Very Worst of Cases

2019-03-09T06:56:12.480Z · score: 26 (7 votes)
Comment by gregory_lewis on What to do with people? · 2019-03-08T01:26:06.048Z · score: 8 (2 votes) · EA · GW

Related: David Manheim's writing on network theory and scaling organisations.

Comment by gregory_lewis on What skills would you like 1-5 EAs to develop? · 2019-03-07T20:53:31.808Z · score: 3 (2 votes) · EA · GW

A bit of both:

I'd like to see more forecasting skills/literacy 'in the water' of the EA community, in the same way statistical literacy is commonplace. A lot of EA is about making the world go better, and so a lot of (implicit) forecasting is done when deciding what to do. I'd generally recommend most people consider things like opening a Metaculus account, reading Superforecasting, etc.

This doesn't mean everyone should be spending (e.g.) 3 hours a day on this, given the usual story about opportunity costs. But I think (per the question topic) there's also a benefit of a few people highly developing this skill (again, a bit like stats: it's generally harder to design and conduct statistical analysis than to critique one already done, but you'd want some folks in EA who can do the former).

Comment by gregory_lewis on What skills would you like 1-5 EAs to develop? · 2019-03-07T00:19:14.160Z · score: 9 (4 votes) · EA · GW


This is more a 'skill I'd like to see more of in the EA community', rather than a career track. It seems a generally valuable skill set for a lot of EA work, and having some people develop expertise/very high performance in it (e.g. becoming a superforecaster) looks beneficial to me.

Comment by gregory_lewis on What skills would you like 1-5 EAs to develop? · 2019-03-07T00:12:06.196Z · score: 12 (5 votes) · EA · GW

[Not one of the downvoters]

The leading rationale of "Learn a trade --> use it for EA projects that need it" looks weak to me:

  • There's not a large enough density of 'EA' work in any given place to take up more than a small fraction of a tradesperson's activity. So this upside should be discounted by the (substantial) time to learn the trade, after which most of one's 'full-time job' as (say) an electrician will not be spent on EA work.
  • It looks pretty unlikely there will be 'nomadic' tradespeople travelling between EA hubs, as the added cost of flights etc. suggests it would be more efficient just to try and secure good tradespeople by (e.g.) offering above-market rates.

As you say, it could be a good option for some due to good earning power (especially for those with less academic backgrounds, cf. kbog's guide), but the leading rationale doesn't seem a substantial reason to slant recommendations (e.g. if you could earn X as a plumber but 1.1X in something else, the fact you could occasionally help out on EA projects shouldn't outweigh this).

Comment by gregory_lewis on Making discussions in EA groups inclusive · 2019-03-04T20:30:23.843Z · score: 15 (13 votes) · EA · GW

[I didn't downvote.] I fear the story is that this is something of a 'hot button' issue, and people in either 'camp' have sensitivities about publicly speaking out on one side or the other for fear of how others in the opposing 'camp' may react towards them. (The authors of this document are anonymous; previous conversations on this area in this forum have had detractors also use anon accounts or make remarks along the lines of, 'I strongly disagree with this, but I don't want to elaborate further'.) Hence people who might be opposed to this (for whatever reason) prefer anonymous (albeit less informative) feedback via downvoting.

There are naturally less charitable explanations along the lines of tribalism, brigading, etc.

Comment by gregory_lewis on EA Survey 2018 Series: How welcoming is EA? · 2019-03-03T11:29:02.823Z · score: 10 (4 votes) · EA · GW

Thanks for your reply, and the tweaks to the post. However:

[I] decided to keep the discussion short because the regression seemed to offer very limited practical significance (as you pointed out). Had I decided to give it more weight in my analysis then it certainly would be appropriate to offer a fuller explanation. Nonetheless, I should have been clearer about the limited usefulness of the regression, and noted it as the reason for the short discussion.

I think the regression having little practical significance makes it the most useful part of the analysis: it shows that the variation in the dependent variable is poorly explained by all/any of the variables investigated, that many of the associations found by bivariate assessment vanish when controlling for others, and it gives better estimates of the effect size (and thus relative importance) of those which still exert a statistical effect. Noting, essentially, "But the regression analysis implies a lot of the associations we previously noted are either confounded or trivial, and even when we take all the variables together we can't predict welcomeness much better than taking the average" at the end buries the lede.

A worked example. The summary notes, "EAs in local groups, in particular, view the movement as more welcoming than those not in local groups" (my emphasis). If you look at the t-test between members and nonmembers there's a difference of ~ 0.25 'likert levels', which is one of the larger effect sizes reported.

Yet we presumably care about how much of this difference can be attributed to local groups themselves. If the story is "EAs in local groups find EA more welcoming because they skew (say) male and young", it seems better to focus attention on those things instead. Regression isn't a magic wand to remove confounding (cf.), but it tends to be better than not doing it at all (which is essentially what is being done when you test the association between a single variable and the outcome).

As I noted before, the 'effect size' of local group membership when controlling for other variables is still statistically significant, but utterly trivial. Again: it is ~ 1/1000th of a likert level; the upper bound of the 95% confidence interval would only be ~ 2/1000th of a likert level. By comparison, the effect of gender or year of involvement are two orders of magnitude greater. It seems better in the conclusion to highlight results like these, rather than results the analysis demonstrates have no meaningful effect when other variables are controlled for.

A few more minor things:

  • (Which I forgot earlier). If you are willing to use means, you probably can use standard errors/confidence intervals, which may help in the 'this group looks different, but small group size' points.
  • Bonferroni makes a rod for your back, given it is conservative (cf.); an alternative approach is false discovery rate control instead of family-wise error rate control. Although minor, if you are going to use this to get your adjusted significance threshold, it should be mentioned early, and the result which 'doesn't make the cut' should simply be reported as non-significant.
  • It is generally a bad idea to lump categories together (e.g. countries, cause areas) for regression as this loses information (and so statistical power). One of the challenges of regression analysis is garden of forking path issues (even post-hoc - some coefficients 'pop into' and out of statistical significance depending on which model is used, and once I've seen one, I'm not sure how much to discount subsequent ones). It is here where an analysis plan which pre-specifies this is very valuable.
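The false discovery rate approach mentioned above can be made concrete. Below is a minimal sketch of the Benjamini-Hochberg step-up procedure (this is illustrative only - it is not the survey's actual analysis code, and the p-values are made up):

```python
# Benjamini-Hochberg false discovery rate control: a minimal sketch.
def benjamini_hochberg(p_values, q=0.05):
    """Return (sorted) indices of hypotheses rejected at FDR level q."""
    m = len(p_values)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k/m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * q:
            k_max = rank
    # Reject the k_max smallest p-values.
    return sorted(order[:k_max])

# With m = 10 tests, Bonferroni demands p <= q/m = 0.005 for each
# (rejecting only the first two below); BH is less conservative when
# several hypotheses are truly non-null.
ps = [0.001, 0.004, 0.012, 0.020, 0.30, 0.45, 0.60, 0.70, 0.85, 0.95]
print(benjamini_hochberg(ps))  # -> [0, 1, 2, 3]
```

The trade-off: BH controls the expected proportion of false discoveries among rejections, rather than the chance of any false rejection at all, which is usually the more useful guarantee in exploratory survey analysis.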
Comment by gregory_lewis on After one year of applying for EA jobs: It is really, really hard to get hired by an EA organisation · 2019-03-01T07:36:14.444Z · score: 9 (5 votes) · EA · GW

FWIW: I think I know of another example along these lines, although only second hand.

Comment by gregory_lewis on EA Survey 2018 Series: How welcoming is EA? · 2019-03-01T07:28:07.030Z · score: 50 (17 votes) · EA · GW

Thanks for this - the presentation of results is admirably clear. Yet I have two worries:

1) Statistics: I think the statistical methods are frequently missing the mark. Sometimes this is a minor quibble; other times more substantial:

a) The dependent variable (welcomeness - assessed by a typical Likert scale) is ordinal data (i.e. 'very welcoming' > 'welcoming' > 'neither', etc.). The write-up often treats this statistically either as categorical data (e.g. chi2) or as interval data (e.g. t-tests, the use of 'mean welcomeness' throughout). The latter is generally fine (the data look pretty well-behaved, t-tests are pretty robust, and I recall controversy about when to use non-parametric tests). The former isn't.

chi2 tests against the null that (in essence) the proportion in each 'row' of a table is the same between columns: it treats the ordered scale as a set of 5 categories (e.g. like countries, ethnicities, etc.). Statistical significance here is not specific to 'more or less welcoming': two groups with identical 'mean welcomeness' yet a different distribution across levels could 'pass statistical significance' by chi2. Tests for 'ranked dependent by categorical independent' data exist (e.g. Kruskal-Wallis) and should be used instead.

Further, chi2 assumes the independent variable is categorical too. Usually it is (e.g. where you heard about EA) but sometimes it isn't (e.g. age, year of joining, ?political views). For similar reasons to the above, a significant chi2 result doesn't demonstrate a (monotonic) relationship between welcomeness and time in EA. There are statistical tests for trend which can be used instead.

Still further, chi2 (ditto K-W) is an 'omnibus' test: it tells you your data is surprising given the null, but not what is driving the surprise. Thus statistical significance 'on the test' doesn't indicate whether particular differences (whether highlighted in the write-up or otherwise) are statistically significant.
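The chi2 vs rank-test distinction can be demonstrated with a toy example (my own construction, not from the survey data): two groups with identical mean 'welcomeness' but different spreads. chi2 flags a difference; Kruskal-Wallis - correctly, for the question of 'more vs less welcoming' - does not.

```python
# Toy demonstration: chi2 is sensitive to any distributional difference,
# Kruskal-Wallis only to shifts in rank (i.e. 'more/less welcoming').
from scipy.stats import chi2_contingency, kruskal

group_a = [1] * 10 + [5] * 10   # polarised: half 'very unwelcoming', half 'very welcoming'
group_b = [3] * 20              # uniformly 'neither'; same mean (3) as group A

# Contingency table over the three observed levels (1, 3, 5).
table = [[10, 0, 10],   # group A counts
         [0, 20, 0]]    # group B counts

chi2_stat, chi2_p, dof, expected = chi2_contingency(table)
kw_stat, kw_p = kruskal(group_a, group_b)

print(f"chi2 p = {chi2_p:.2g}")            # tiny: distributions clearly differ
print(f"Kruskal-Wallis p = {kw_p:.2g}")    # ~1: neither group ranks 'more welcoming'
```

So a significant chi2 here would tell you the groups differ somehow, but not that one finds EA more welcoming than the other.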

b) The write-up also seems to be switching between the descriptive and the inferential in an unclear way. Some remarks on the data are accompanied with statistical tests (implying an inference from the sample to the population), whilst similar remarks are not: compare the section on 'time joining EA' (where there are a couple of tests to support a 'longer in EA - finding it more welcoming'), versus age (which notes a variety of differences between age groups, but no statistical tests).

My impression is the better course is the former, so differences highlighted for the reader's interest should be accompanied by whether they are statistically significant. This uniform approach also avoids 'garden of forking paths' worries (e.g. 'Did you not report p-values for the age section because you didn't test, or because they weren't significant?').

c) The ordered regression is comfortably the 'highest yield' bit of statistics performed, as it is appropriate to the data, often more sensitive (e.g. lumping the data into two groups by time in EA and t-testing is inferior technique to regression), and helps answer questions of confounding sometimes alluded to in the text ("Welcoming seems to go up with X, but down with Y, which is weird because X and Y correlate"), but uniformly important ("People in local groups find EA more welcoming - but could that driven by other variables between those within and without local groups?")

It deserves a much fuller explanation (e.g. how did 'country' and 'top priority cause' become single variables with a single regression coefficient - is the 'lumping together' implied in the text post-hoc? How was variable selection/model choice decided? Model 1 lacks only 'top priority cause', so presumably 'adding in political spectrum didn't improve explanatory power' is a typo?). Where its results differ from the univariable analysis, I would prefer the former over the latter. That fb membership, career shifting (in model 2), career type, and politics aren't significant predictors means their relationships to welcomingness, even if statistically significant in univariable analysis, are probably confounding rather than true association.

It is unfortunate some of these are highlighted in the summary and conclusion, even more so when a crucial negative result from the regression is relatively unsung. The ~3% R^2 and very small coefficients (with the arguable exception of sex) imply very limited practical significance: almost all the variation in whether an EA finds EA welcoming or not is not predicted by the factors investigated; although EAs in local groups find EA more welcoming, this effect - albeit statistically significant - is (if I interpret the regression right) around 0.1% of a single likert level.

2) Selection bias: A perennial challenge to the survey is selection bias. Although happily noted frequently in discussion, I still feel it is underweighted: I think it is large enough to make the results all but uninterpretable.

Facially, one would expect those who find EA less welcoming to be less likely to join. So how welcoming people already in EA think it is probably tells us little about how good it is at welcoming people into EA (caricatured example: I wouldn't be that surprised if members of something like the KKK found it generally welcoming). As mentioned in the 'politics' section, the relative population size seems a far better metric (although the baseline is hard to establish), to which welcomingness adds very little.

Crucially, selection bias imposes some nigh-inscrutable but potentially sign-inverting considerations to any policy 'upshot'. A less welcoming subgroup could be cause for concern, but alternatively cause for celebration: perhaps this subgroup offers other 'pull factors' that mean people who find EA is less welcoming nonetheless join and stick around within it (and vice versa: maybe subgroups whose members find EA very welcoming do so because they indirectly filter out everyone who doesn't). Akin to Wald and the bombers in WW2, it is crucial to work out which. But I don't think we can here.

Comment by gregory_lewis on After one year of applying for EA jobs: It is really, really hard to get hired by an EA organisation · 2019-02-27T00:45:08.529Z · score: 38 (17 votes) · EA · GW

I think the reason the OP had a high fraction of 'long' processes had more to do with him being a strong applicant who would get through a lot of the early filters. I don't think a typical 'EA org' hiring round passes ~50% of its applicants to a work test.

This doesn't detract from your other points re. the length in absolute terms. (The descriptions from the OP and others read uncomfortably reminiscent of more senior academic hiring, with lots of people getting burned competing for really attractive jobs.) There may be some fundamental trade-offs (the standard argument about '*really* important to get the right person, so we want to spend a lot of time assessing plausible candidates to pick the right one; false negatives at intermediate stages cost more than false positives, etc. etc.'), but an easy improvement (mentioned elsewhere) is to communicate as best one can the likelihood of success (perhaps broken down by stage) so applicants can make a better-informed decision.

Comment by gregory_lewis on Has your "EA worldview" changed over time? How and why? · 2019-02-26T00:22:08.344Z · score: 20 (9 votes) · EA · GW
If you're Open Phil, you can hedge yourself against the risk that your worldview might be wrong by diversifying. But the rest of us are just going to have to figure out which worldview is actually right.

Minor/Meta aside: I don't think 'hedging' or diversification is the best way to look at this, whether one is an individual or a mega-funder.

On standard consequentialist doctrine, one wants to weigh things up 'from the point of view of the universe', and be indifferent as to 'who is doing the work'. Given this, it looks better to act in the way which best rebalances the humanity-wide portfolio of moral effort, rather than a more narrow optimisation of 'the EA community', 'OPs grants', or ones own effort.

This rephrases the 'neglectedness' consideration. Yet I think people don't often think enough about conditioning on the current humanity-wide portfolio, or see their effort as part of this wider whole, and this can mislead into moral paralysis (and, perhaps, insufficient extremising). If I have to 'decide what worldview is actually right', I'm screwed: many of my uncertainties I'd expect to be resilient to a lifetime of careful study. Yet I have better prospects of reasonably believing that "This issue is credibly important enough that (all things considered, pace all relevant uncertainties) in an ideal world humankind would direct X people to work on it - given in fact there are Y, Y << X, perhaps I should be amongst them."

This is a better sketch for why I work on longtermism, rather than overall confidence in my 'longtermist worldview'. This doesn't make worldview questions irrelevant (there are lot of issues where the sketch above applies, and relative importance will be one of the ingredients that goes in the mix of divining which one to take), but it means I'm fairly sanguine about perennial uncertainty. My work is minuscule part of the already-highly-diversified corporate effort of humankind, and the tacit coordination strategy of people like me acting on our best guess of the optimal portfolio looks robustly good (a community like EA may allow better ones), even if (as I hope and somewhat expect) my own efforts transpire to have little value.

The reason I shouldn't 'hedge' but Open Phil should is not so much because they can afford to (given they play with much larger stakes, better resolution on 'worldview questions' has much higher value to them than to I), but because the returns to specialisation are plausibly sigmoid over the 'me to OP' range. For individuals, there's increasing marginal returns to specialisation: in the same way we lean against 'donation splitting' with money, so too with time (it seems misguided for me to spend - say - 30% on bio, 10% on AI, 20% on global health, etc.) A large funder (even though it still represents a minuscule fraction of the humanity-wide portfolio) may have overlapping marginal return curves between its top picks of (all things considered) most promising things to work on, and it is better placed to realise other 'portfolio benefits'.

Comment by gregory_lewis on Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post · 2019-02-16T18:44:01.262Z · score: 6 (5 votes) · EA · GW

Excellent. This series of interviews with superforecasters is also interesting. [H/T Ozzie]

Comment by gregory_lewis on EA Survey 2018 Series: Donation Data · 2018-12-10T23:45:59.445Z · score: 5 (3 votes) · EA · GW

Thanks. I should say that I didn't mean to endorse stepwise when I mentioned it (for reasons Gelman and commenters note here), but that I thought it might be something one might have tried given it is the variable selection technique available 'out of the box' in programs like STATA or SPSS (it is something I used to use when I started doing work like this, for example).

Although not important here (but maybe helpful for next time), I'd caution against using goodness of fit estimators (e.g. AIC going down, R2 going up) too heavily in assessing the model as one tends to end up with over-fitting. I think the standard recommendations are something like:

  • Specify a model before looking at the data, and caveat any further explanations as post-hoc. (which sounds like essentially what you did).
  • Split your data into an exploration and confirmation set, where you play with whatever you like on the former, then use the model you think is best on the latter and report these findings (better, although slightly trickier, are things like k-fold cross validation rather than a single holdout).
  • LASSO, Ridge regression (or related regularisation methods) if you are going to select predictors 'hypothesis free' on your whole data.
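The over-fitting worry behind the exploration/confirmation split can be shown with a small simulation (my own illustration, not the survey analysis): regressing pure noise on many junk predictors gives an impressive in-sample R^2 that evaporates on held-out data.

```python
# Why goodness-of-fit on the fitting data misleads: 30 junk predictors
# fitted to 50 observations 'explain' noise in-sample but not out-of-sample.
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 30
X = rng.normal(size=(n, p))
y = rng.normal(size=n)          # pure noise: no predictor actually matters

# Half for exploration, half held out for confirmation.
X_train, X_test = X[:50], X[50:]
y_train, y_test = y[:50], y[50:]

# Ordinary least squares fit on the exploration half only.
beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def r_squared(X_, y_):
    resid = y_ - X_ @ beta
    return 1 - (resid @ resid) / ((y_ - y_.mean()) @ (y_ - y_.mean()))

print(f"train R^2 = {r_squared(X_train, y_train):.2f}")   # large, purely by over-fit
print(f"test  R^2 = {r_squared(X_test, y_test):.2f}")     # ~0 or negative
```

Regularisation (LASSO/ridge) or cross-validation attack the same problem: they penalise or directly measure how much of the apparent fit is noise-chasing.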

(Further aside: Multiple imputation methods for missing data might also be worth contemplating in the future, although it is a tricky judgement call).

Comment by gregory_lewis on Giving more won't make you happier · 2018-12-10T23:13:58.822Z · score: 27 (13 votes) · EA · GW

Neither of your examples backs up your point.

The 80000 hours article you cite notes in its summary only that:

Giving some money to charity is unlikely to make you less happy, and may well make you happier. (My emphasis)

The GWWC piece reads thus:

Giving 10% of your income to effective charities can make an incredible difference to some of the most deprived people in the world. But what effect will giving have on you? You may be concerned that it will damage your quality of life and make you less happy. This is a perfectly reasonable concern, and there is no shame in wanting to live a full and happy life.

The good news is that giving can often make you happier.... (My emphasis)

As I noted in prior discussion, not only do these sources not claim 'giving effectively will increase your happiness', I'm not aware of this being claimed by any major EA source. Thus the objection "This line of argument confuses the effect of donating at all with the effect of donating effectively" targets a straw man.

Comment by gregory_lewis on Open Thread #43 · 2018-12-09T19:17:24.830Z · score: 13 (6 votes) · EA · GW

My impression FWIW is that the 'giving makes you happier' point wasn't/isn't advanced to claim that the optimal portfolio for one's personal happiness would include (e.g.) 10% of charitable donations (to effective causes), but that doing so isn't such a 'hit' to one's personal fulfilment as it appears at first glance. This is usually advanced in conjunction with the evidence on diminishing returns to money (i.e. even if you just lost - say - 10% of your income, if you're a middle class person in a rich country, this isn't a huge loss to your welfare - and given this evidence on the wellbeing benefits to giving, the impact is likely to be reduced further).

E.g. (and with apologies to the reader for inflicting my juvenilia upon them):

[Still being in the a high global wealth percentile post-giving] partly explains why I don’t feel poorly off or destitute. There are other parts. One is that giving generally makes you happier, and often more happier than buying things for yourself. Another is that I am fortunate in non-monetary respects: my biggest medical problem is dandruff, I have a loving family, a wide and interesting circle of friends, a fulfilling job, an e-reader which I can use to store (and occasionally read) the finest works of western literature, an internet connection I should use for better things than loitering on social media, and so on, and so on, and so on. I am blessed beyond all measure of desert.
So I don’t think that my giving has made me ‘worse off’. If you put a gun to my head and said, “Here’s the money you gave away back. You must spend it solely to further your own happiness”, I probably wouldn’t give it away: I guess a mix of holidays, savings, books, music and trips to the theatre might make me even happier (but who knows? people are bad at affective forecasting). But I’m pretty confident giving has made me happier compared to the case where I never had the money in the first place. So the downside looks like, “By giving, I have made myself even happier from an already very happy baseline, but foregone opportunities to give myself a larger happiness increment still”. This seems a trivial downside at worst, and not worth mentioning across the scales from the upside, which might be several lives saved, or a larger number of lives improved and horrible diseases prevented.
Comment by gregory_lewis on EA Survey 2018 Series: Donation Data · 2018-12-09T13:16:07.099Z · score: 3 (2 votes) · EA · GW

Thanks for these interesting results. I have a minor technical question (which I don't think was covered in the methodology post, nor in the Github repository from a quick review):

How did you select the variables (and interaction term) for the regression model? A priori? Stepwise? Something else?

Comment by gregory_lewis on EA Survey 2018 Series: Community Demographics & Characteristics · 2018-11-27T19:44:30.141Z · score: 2 (1 votes) · EA · GW

Minor: I'd say the travel times in 'Loxbridge' are somewhat longer than an hour.

Time from (e.g.) Oxford train station to London train station is an hour, but adding on the travel time from 'somewhere in Oxford/London to the train station' would push this up to ~2 hours. Oxford to Cambridge takes 3-4 hours by public transport.

The general topic looks tricky. I'd guess if you did a kernel density map over the bay, you'd get a (reasonably) even gradient over the 3k square miles. If you did the same over 'Loxbridge' you'd get very strong foci over the areas that correspond to London/Oxford/Cambridge. I'd also guess you'd get reasonable traffic between subareas in the bay area, but in Loxbridge you'd have some Oxford/London and Cambridge/London (a lot of professionals make this sort of commute daily) but very little Oxford/Cambridge traffic.

Any criteria one uses to chunk large conurbations into natural-language labels will necessarily be imprecise. I'd guess if you had the ground truth and ran typical clustering algos on it, you'd probably get a 'bay area' cluster though. What might be more satisfying is establishing whether the bay acts like a single community: if instead there is a distinguishable (e.g.) East Bay and South Bay community, where people in one or the other group tend to go to (e.g.) events in one or the other and visit the other occasionally (akin to how an Oxford EA like me may mostly attend Oxford events but occasionally visit London ones), this would justify splitting it up.

Comment by gregory_lewis on Cross-post: Think twice before talking about ‘talent gaps’ – clarifying nine misconceptions, by 80,000 Hours. · 2018-11-20T08:29:15.217Z · score: 6 (3 votes) · EA · GW

Although orgs tacitly colluding with one another to pay their staff less than they otherwise would may also have an adverse effect on recruitment and retention...

Comment by gregory_lewis on William MacAskill misrepresents much of the evidence underlying his key arguments in "Doing Good Better" · 2018-11-17T20:55:52.930Z · score: 27 (13 votes) · EA · GW


I don't take, "[DGB] misrepresents sources structurally, and this is a convincing sign it is written in bad faith." to be either:

  • True. The OP strikes me as tendentiously uncharitable and 'out for blood' (given earlier versions called for Will to be disavowed by EA per Gleb Tsipursky, trust in Will to drop to 0, etc.), and the very worst that should be inferred, even if we grant all the matters under dispute in its favour - which we shouldn't - would be something like "sloppy, and perhaps with a subconscious finger on the scale tilting the errors to be favourable to the thesis of the book", rather than deceit, malice, or other 'bad faith'.
  • Helpful. False accusations of bad faith are obviously toxic. But even true ones should be made with care. I was one of the co-authors on the Intentional Insights document, and in that case (with much stronger evidence suggestive of 'structural misrepresentation' or 'writing things in bad faith') we refrained as far as practicable from making these adverse inferences. We were criticised for this at the time (perhaps rightly), but I think this is the better direction to err in.
  • Kind. Self explanatory.

I'm sure Siebe makes their comment in good faith, and I agree some parts of the comment are worthwhile (e.g. I agree it is important that folks in EA can be criticised). But not overall.

Comment by gregory_lewis on Crohn's disease · 2018-11-16T14:15:46.358Z · score: 9 (2 votes) · EA · GW

In hope but little expectation:

You could cast about for various relevant base-rates ("What is the chance of any given proposed conjecture in medical science being true?" "What is the chance of a given medical trial giving a positive result?"). Crisp data on these questions are hard to find, but the proportion for either is comfortably less than even. (Maybe ~5% for the first, ~20% for the second).

From something like this one can make further adjustments based on the particular circumstances, which are generally in the adverse direction:

  • Typical trials have more than n=6 non-consecutive case series behind them, and so this should be less likely to replicate than the typical member of this class.
  • (Particularly, heterodox theories of pathogenesis tend to do worse, and on a cursory search I can find alternative theories of Crohn's which seem about as facially plausible as this one).
  • The wild theory also imposes a penalty: even if the minimal prediction doesn't demand the wider 'Malassezia causes it' etc., that the hypothesis was generated through these means is a further cost.
  • There's also information I have from medical training which speaks against this (i.e. if antifungals had such dramatic effects as proposed, it probably would have risen to attention somewhat sooner).
  • All the second order things I noted in my first comment.

As Ryan has explained, standard significance testing puts a floor of 2.5% on the chance of a (false) positive result in any trial, even if the true effect is zero. There is some chance the ground truth really is that itraconazole cures Crohn's (given some evidence of TNF-a downstream effects, background knowledge of fungal microbiota dysregulation, and the very slender case series), which gives it a small boost above this floor - although this in turn is somewhat discounted by the limited power of the proposed study (i.e. even if itraconazole works, the study might miss it).
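To make the arithmetic behind this explicit, here is a minimal sketch - the prior and power figures are illustrative assumptions of mine, not numbers from the comment:

```python
# Chance of a positive (one-sided p < 0.025) trial result, combining a prior
# on the treatment truly working with the trial's power. Illustrative numbers:
prior_works = 0.01        # assumed prior that itraconazole genuinely helps
power = 0.50              # assumed power of a small (n=40) trial to detect it
false_positive = 0.025    # floor: false positive favouring treatment under the null

p_positive = prior_works * power + (1 - prior_works) * false_positive
print(f"P(positive result) ~ {p_positive:.3f}")  # close to the 2.5% floor
```

On these assumptions almost all of the probability of a 'positive' result is carried by the false-positive floor, which is the sense in which the ~3% estimate below sits just above 2.5%.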

Comment by gregory_lewis on Crohn's disease · 2018-11-15T22:54:38.027Z · score: 10 (3 votes) · EA · GW

~3% (Standard significance testing means there's a 2.5% chance of a false positive result favouring the treatment group under the null).

Comment by gregory_lewis on Crohn's disease · 2018-11-15T19:59:49.140Z · score: 9 (2 votes) · EA · GW

The idea of doing an intermediate piece of work is so one can abandon the project if it is negative whilst having spent less than 500k. Even independent of the adverse indicators I note above, the prior on case series findings replicating in an RCT is very low.

Another cheap option would be talking to the original investigators. They may have reasons why they haven't followed this finding up themselves.

Comment by gregory_lewis on Crohn's disease · 2018-11-15T15:45:05.007Z · score: 7 (4 votes) · EA · GW

A cheaper alternative (also by about an order of magnitude) is to do a hospital record study where you look at subsequent Crohn's admissions or similar proxies of disease activity in those recently prescribed antifungals versus those who aren't.

I also imagine it would get better data than a poorly powered RCT.

Comment by gregory_lewis on Crohn's disease · 2018-11-14T08:27:43.642Z · score: 29 (11 votes) · EA · GW

I strong-downvoted this post. I had hoped the reasons why would be obvious. Alas not.

Scientific (in)credibility

The comments so far have mainly focused on the cost-effectiveness calculation. Yet it is the science itself that is replete with red flags: from grandiose free-wheeling, to misreporting cited results, to gross medical and scientific misunderstanding. [As background: I am a doctor who has published on the genetics of inflammatory bowel disease]

Several examples before I succumbed:

  • Samuel et al. 2010 is a retrospective database review of 6 patients treated with itraconazole for histoplasmosis in Crohn's Disease (CD) (N.B. Observational, not controlled, and as a letter, editor- rather than peer-reviewed). It did not "report it cured patients with CD by clearing fungus from the gut": the authors' own (appropriately tentative - unlike the OP) conjecture was any therapeutic effect was mediated by immunomodulatory effects of azole drugs downstream of TNF-a. It emphatically didn't "suggest oral itraconazole may be effective against Malassezia in the gut" (as claimed in the linked website's FAQ) as the presence or subsequent elimination of Malassezia was never assessed - nor was Malassezia mentioned.
  • Crohn's disease is not a spondyloarthritis! (and neither is psoriasis, ulcerative colitis, or acute anterior uveitis). As the name suggests, spondyloarthritides are arthritides (i.e. diseases principally of joints - the 'spondylo' prefix points to joints between vertebrae); Crohn's is a disease of the GI tract. Crohn's can be associated with a spondyloarthritis (enteropathic spondyloarthritis). As the word 'associated' suggests, these are not one and the same: only a minority of those with Crohn's develop joint sequelae. (cf. standard lists of spondyloarthritides - note Crohn's simpliciter isn't on them).
  • Chronic inflammation isn't a symptom ('spondyloarthritide' or otherwise), and symptoms (rather than diseases) are only cured in the colloquial use of the term.
  • However one parses "[P]roving beyond all doubt that Crohn's disease is caused by this fungus will very likely lead to a cure for all spondyloarthritide symptoms using antifungal drugs." ('Merely' relieving all back pain from spondyloarthritides? Relieving all symptoms that arise from the set of (correctly defined) spondyloarthritides? Curing all spondyloarthritides? Curing (and/or relieving all symptoms) from the author's grab bag of symptoms/diseases which include CD, ulcerative colitis, ankylosing spondylitis, psoriasis and chronic back pain?) The antecedent (one n=40 therapeutic study won't prove Malassezia causes Crohn's, especially with a competing immunomodulatory mechanism already proposed); the consequent (anti-fungal drugs as some autoimmune disease panacea of uncertain scope); and the implication (even if Malassezia is proven to cause Crohn's, the likelihood of this result (and therapy) generalising is basically nil) are all absurd.
  • The 'I love details!' page notes at the end "These findings satisfy Koch’s postulates for disease causation, albeit scattered across several related diseases." This demonstrates the author doesn't understand Koch's postulates: you can't 'mix and match' across diseases, and the postulates need to be satisfied in sequence (i.e. you find the microorganism only present in cases of the disease (1), culture it (2), induce the disease in a healthy individual with such a culture (3), and extract the organism again from such individuals (4)).
  • The work reported on that page, here, and elsewhere also directly contradicts Koch's first postulate: Malassezia is not found in abundance in cases of disease (pick any of them) and absent in healthy individuals. The author himself states Malassezia is ubiquitous across individuals, diseased or not (and this ubiquity is cited as why this genus is being proposed in the first place).


I'd also rather not know how much has been spent on this so far. Whatever it is, investing another half a million dollars is profoundly ill-advised (putting the money in a pile and burning it is mildly preferable, even when one factors in climate change impacts). At least an order of magnitude cheaper is buying the time of someone who works in Crohn's to offer their assessment. I doubt it would be less scathing than mine.

Meta moaning

Most EAs had the good judgement to avoid the terrible mistake of a medical degree. One of the few downsides of so doing is (usually) not possessing the background knowledge to appraise something like this. As a community, we might worry about our collective understanding being led astray without the happy accident of someone with specialised knowledge (yet atrocious time-management and prioritisation skills among manifold other relevant personal failings) happening onto the right threads.

Have no fear: I have some handy advice/despairing pleas:

  • Medical science isn't completely civilizationally inadequate, and thus projects that resort to being pitched directly to inexpert funders have a pretty poor base rate (cf. DRACO)
  • Although these are imperfect indicators, if the person behind the project doesn't have credentials in a relevant field (bioinformatics rather than gastroenterology, say), and/or has a fairly slender relevant publication record, and/or scant/no interest from recognised experts, these are also adverse signs. (Remember the Nobel-prize winner who endorsed Vit C megadosing?)
  • It can be hard to set the right incredulity prior: we all want to take advantage of our risk neutrality to chase hits, but not all upsides that vindicate a low likelihood are credible. A rule-of-thumb I commend is 10^-(3+n(miracles)). So when someone suggests they have discovered the key mechanism of action (and consequent fix) for Crohn's disease, and ulcerative colitis, and ankylosing spondylitis, and reactive arthritis, and psoriasis, and psoriatic arthritis, and acute anterior uveitis, and oligoarthritis, and multiple sclerosis, and rheumatoid arthritis, and systemic lupus erythematosus, and prostate cancer, and benign prostatic hyperplasia, and chronic back pain (n~14), there may be some cause for concern.
  • Spot-checking bits of the write-up can be a great 'sniff test', especially in key areas where one isn't sure of one's ground ("Well, the fermi seems reasonable, but I wonder what this extra-sensory perception thing is all about").
  • Post value tends to be multiplicative (e.g. the antecedent of "If we have a cure for Crohn's, how good would it be?" may be the crucial consideration), and so it's key to develop an understanding across the key topics. Otherwise one risks conversational bikeshedding. Worse, there could be Sokal-hoax-esque effects where nonsense ends up well-received (say, moderately upvoted) provided it sends the right signals on non-substantive metrics like style, approach, sentiment, etc.
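The incredulity rule-of-thumb in the third bullet can be sketched in a couple of lines (the function name and the single-miracle comparison case are mine, for illustration):

```python
def incredulity_prior(n_miracles: int) -> float:
    """Rule-of-thumb prior for a claim requiring n 'miracles': 10^-(3 + n)."""
    return 10.0 ** -(3 + n_miracles)

# One breakthrough claimed, versus a mechanism-and-cure for ~14 distinct diseases:
print(incredulity_prior(1))   # a single-miracle claim
print(incredulity_prior(14))  # the grab bag above
```

The point of the rule is the exponent: each additional independent 'miracle' a claim requires should multiply the prior down by another order of magnitude, which is why a fourteen-disease panacea starts from such a deep hole.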

I see these aspects of epistemic culture as an important team sport, with 'amateur' participation encouraged (for my part, implored). I had hoped, when I clicked the 'downvote' arrow for a few seconds, that I could leave this to fade into obscurity thereafter. When instead I find it being both upvoted and discussed like it has been, I worry it might actually attract resources from other EAs who mistakenly take the conversation thus far to represent the balance of reason, and detract from EA's reputation among those who recognise it does not (cf. "The scientific revolution for altruism" aspiration). So I felt I had to write something more comprehensive. This took a lot longer than a few seconds, although fortunately my time is essentially worthless. Next time we may not be so lucky.

Comment by gregory_lewis on Even non-theists should act as if theism is true · 2018-11-09T22:07:37.681Z · score: 5 (3 votes) · EA · GW

The meat of this post seems to be a version of Plantinga's EAAN.

Comment by gregory_lewis on Mind Ease: a promising new mental health intervention · 2018-10-23T22:17:21.312Z · score: 23 (16 votes) · EA · GW

[based on an internally run study of 250 uses] Mind Ease reduces anxiety by 51% on average, and helps people feel better 80% of the time.

Extraordinary claims like this (and it's not the only one - e.g. "very likely" to help myself or people I know who suffer from anxiety elsewhere in the post, and "And for anxiety [discovering which interventions work best] is what we've done", with '45% reduction in negative feelings' in the app itself) demand much fuller and more rigorous description and justification, e.g. (cf. PICO):

  • (Population): How are you recruiting the users? Mturk? Positly? Convenience sample from sharing the link? Are they paid for participation? Are they 'people validated (somehow) as having an anxiety disorder' or (as I guess) 'people interested in reducing their anxiety/having something to help when they are particularly anxious?'
  • (Population): Are the "250 uses" 250 individuals each using Mindease once? If not, what's the distribution of duplicates?
  • (Intervention): Does "250 uses" include everyone who fired up the app, or only those who 'finished' the exercise (and presumably filled out the post-exposure assessment)?
  • (Comparator): Is this a pre-post result? Or is this vs. the sham control mentioned later? (If so, what is the effect size on the sham control?)
  • (Outcome): If pre-post, is the post-exposure assessment immediately subsequent to the intervention?
  • (Outcome): "reduces anxiety by 51%" on what metric? (Playing with the app suggests 5-level Likert scales?)
  • (Outcome): Ditto 'feels better' (measured how?)
  • (Outcome): Effect size (51% from what to what?) Inferential stats on the same (SE/CI, etc.)

There are also natural external validity worries. If (as I think it is) the objective is 'immediate symptomatic relief', results are inevitably confounded by anxiety being a symptom that is often transient (or at least fluctuating in intensity), and one with high rates of placebo response. An app which does literally nothing but wait a couple of days before assessing (symptomatic) anxiety again will probably show great reductions in self-reported anxiety pre-post, as people will be preferentially selected to use the app when feeling particularly anxious, and severity will tend to regress. This effect could apply over much shorter intervals too (e.g. those required to perform a recommended exercise).
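As a toy illustration of this selection-plus-regression effect (every parameter below is assumed for illustration; nothing here models Mind Ease itself), an 'app' that does literally nothing still shows a sizeable pre-post reduction:

```python
import random

# Toy simulation: anxiety fluctuates around a personal baseline; people open
# the app only when anxiety is high; the "post" measure is just a fresh draw,
# i.e. the app does literally nothing.
random.seed(0)

def anxiety(baseline):
    return baseline + random.gauss(0, 1.5)  # transient day-to-day fluctuation

pre, post = [], []
for _ in range(10_000):
    baseline = random.uniform(2, 6)      # trait anxiety on a notional 1-10 scale
    score = anxiety(baseline)
    if score > 6:                        # only notably anxious people open the app
        pre.append(score)
        post.append(anxiety(baseline))   # later re-measure: regression to the mean

reduction = 1 - sum(post) / sum(pre)
print(f"apparent anxiety reduction: {reduction:.0%}")  # substantial, despite no effect
```

Any real intervention assessed pre-post under these selection conditions inherits this apparent 'effect' on top of whatever it genuinely does, which is why a sham-control comparison matters.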

(Aside: An interesting validity test would be using GAD-7 for pre-post assessment. As all the items on GAD-7 are 'how often do you get X over the last 2 weeks', significant reduction in this metric immediately after the intervention should raise alarm).

In candour (and with regret), this write-up raises a lot of red flags for me. There is a large relevant literature which this post does not demonstrate command of. For example, there's a small hill of descriptive epidemiology papers on the prevalence of anxiety as a symptom or of anxiety disorders - including large population samples for GAD-7 - which would be better routes to prevalence estimates than conducting a 300-person survey (and if you do run this survey, finding 73% of your sample scoring >5 on GAD-7, when population studies (e.g.) give means and medians of ~2-3 and proportions >5 of ~25%, prompts obvious questions).

Likewise there are well-understood pitfalls in conducting research (some of them particularly acute for intervention studies, and even more so for intervention studies on mental health), which the 'marketing copy' style of presentation (heavy on exuberant confidence, light on how this is substantiated) gives little reassurance were in fact avoided. I appreciate "writing for an interested lay audience" (i.e. this one) demands a different style than writing to cater to academic scepticism. Yet the latter should be satisfied (either here or in a linked write-up), especially when attempting pioneering work in this area and claiming "extraordinarily good" results. We'd be cautious in accepting this from outside sources - we should mete out similar measure to projects developed 'in house'.

I hope subsequent work proves my worries unfounded.

Comment by gregory_lewis on Many EA orgs say they place a lot of financial value on their previous hire. What does that mean, if anything? And why aren't they hiring faster? · 2018-10-14T08:47:21.969Z · score: 29 (27 votes) · EA · GW

My hunch is (as implied elsewhere) that 'talent-constraint', with 'talent' not further specified, is apt to mislead. My impression for longtermist orgs (I understand from Peter and others this may apply less to orgs without this as the predominant focus) is that there are two broad classes, which imperfectly line up with 'senior' versus 'junior'.

The 'senior' class probably does fit 'talent-constraint' (commonsensically understood), in that orgs or the wider ecosystem want to take everyone who clears a given bar. Yet these bars are high even when conditioned on the already able cohort of (longtermist/)EAs. It might be things like 'ready to run a research group', 'can manage operations for an org' (cf. Tara's and Tanya's podcasts), or 'subject matter expertise/ability/track record'.

One common feature is that these people add little further load on current (limited) management capacity, either because they are managing others or are already 'up to speed' and can contribute without extensive training or supervision. (Aside: I suspect this is an under-emphasised bonus of 'value-aligned operations staff' - their tacit knowledge of the community/mission/wider ecosystem may permit looser management than bringing on able professionals 'from outside'.) From the perspective of the archetypal 'pluripotent EA' a few years out from undergrad, these are skills which are hard to develop and harder to demonstrate.

More 'junior' roles are those where the criteria are broader (at least in terms of legible ones: 'what it takes' to be a good generalist researcher may be similarly rare to 'what it takes' to be a good technical AI safety researcher, but more can easily 'rule themselves out' of the latter than the former), where 'upskilling' is a major objective, or where there's expectation of extensive 'hands-on' management.

There might be similarly convex returns to getting a slightly better top candidate (e.g. 'excellent versus very good' might be 3x rather than 1.3x). Regardless, there will not be enough positions for all the talented candidates available: even if someone at an org decided to spend their time only managing and training junior staff (and haste considerations might lead them to spending more of their time doing work themselves than investing in the 'next generation'), they can't manage dozens at a time.

I think confusing these two broad classes is an easy way of burning a lot of good people (cf. Denise's remarks). Alice, a 23-year-old management consultant, might reason on current messaging: "EA jobs are much better for the world than management consultancy, and they're after good people - I seem to fit the bill, so I should switch career into this". She might then forsake her promising early career for an unedifying and unsuccessful period as an 'EA perennial applicant', ending up worse off than she was at the start. EA has a vocational quality to it - it is key it does not become a siren song.

There seem a few ways to do this better, as alluded to in prior discussions here and elsewhere:

0) If I'm right, it'd be worth communicating the 'person spec' for cases where (common-sense) talent constraint applies, and where we really would absorb basically as many as we could get (e.g. "We want philosophers to contribute to GPR, and we're after people who either already have a publication record in this area, or have signals of 'superstar' ability even conditioned on philosophy academia. If this is you, please get in touch.").

1) Concurrently, it'd be worth publicising typical applicant:place ratios or similar measures of competition for hiring rounds in more junior roles, to allow applicants to be better calibrated and to emphasise the importance of a plan B. (e.g. "We have early-career roles for people thinking of working as GPR researchers, which serve the purpose of talent identification and development. We generally look for XYZ. Applications for these are extremely competitive (~12:1). Other good first steps for people who want to work in this field are these"). {MIRI's research fellows page does a lot of this well}.

2) It would be good for there to be further work addressed to avoiding 'EA underemployment', as I would guess growth in strong candidates for EA roles will outstrip intra-EA opportunities. Some possibilities:

2.1) There are some areas I'd want to add to the longtermist portfolio which might be broadened into useful niches for people with comparative advantage in them (macrohistory, productivity coaching and nearby versions, EA-relevant bits of psychology, etc.) I don't think these are 'easier' than the existing 'hot' areas, but they are hard in different ways, and so broaden opportunities.

2.2) Another option would be 'pre-caching' human capital in areas which are plausible candidates for becoming important as time goes on. I imagine something like international relations may turn out to be crucial (or, contrariwise, relatively unimportant), but rather than waiting for this to be figured out, it seems better for people to coordinate and invest themselves across the portfolio of plausible candidates now. (Easier said than done from the first-person perspective, as such a strategy potentially involves making an uncertain bet with many years of one's career, and if it turns out to be a bust ex post, the good ex ante EV may not be complete consolation).

2.3) There seem to be a lot of stakeholders it would be good for EAs to enter due to the second-order benefits, even if their work is of limited direct relevance (e.g. having more EAs in tech companies looks promising to me, even if they aren't doing AI safety). (Again, not easy from the first-person perspective).

2.4) A lot of skills for more 'senior' roles can and have been attained outside of the EA community. Grad school is often a good idea for researchers, and professional/management aptitude is often a transferable skill. So some of the options above can be seen as a holding-pattern/bet hedging approach: they hopefully make one a stronger applicant for such roles, but in the meanwhile one is doing useful things (and also potentially earning to give, although I think this should be a minor consideration for longtermist EAs given the field is increasingly flush with cash).

If the framing is changed to something like: "These positions are very valuable but very competitive - it is definitely worth you applying (as you in expectation increase the quality of the appointed candidate, and the returns on a slightly better candidate are very high), but don't bet the farm (or quit the day job) on your application - and if you don't get in, here are things you could do to slant your career to have a bigger impact", I'd hope the burn risk falls dramatically: in many fields there are lots of competitive, oversubscribed positions which don't impose huge costs on unsuccessful applicants.

Comment by gregory_lewis on 2018 list of half-baked volunteer research ideas · 2018-09-20T08:55:13.801Z · score: 4 (3 votes) · EA · GW

Something similar perhaps worth exploring is putting up awards/bounties for doing particular research projects. A central clearing-house of this could be interesting (I know myself and a couple of others have done this on an ad-hoc basis - that said, efforts to produce central repositories for self-contained projects etc. in EA have not been wildly successful).

A couple of related questions/topics I'd be excited for someone to have a look at:

1. Is rationality a skill, or a trait? Stanovich's RQ correlates fairly strongly with IQ, but I imagine going through the literature could uncover how much of a positive manifold there is between 'features' of rationality which is orthogonal to intelligence, followed by investigation of how/whether this can be trained (with sub-questions around transfer, what seems particularly promising, etc.).

2. I think a lot of people have looked into the superforecasting literature for themselves, but a general write-up for public consumption (e.g. How 'traity' is superforecasting? What exactly does GJP do to get a reported 10% boost from pre-selected superforecasters? Are there useful heuristics people can borrow to improve their own performance beyond practice/logging predictions? (And what is the returns curve to practice, anyway?)) could spare lots of private duplication.

3. More generally, I imagine lots of relevant books (e.g. Deep Work, Superforecasting, Better Angels) could be concisely summarised. That said, there are already services that do this, so it is less clear whether it is worth EA time repeating the exercise 'in house'.

Comment by gregory_lewis on Current Estimates for Likelihood of X-Risk? · 2018-08-06T20:46:04.825Z · score: 6 (10 votes) · EA · GW

Thanks for posting this.

I don't think there are any other sources you're missing - at least, if you're missing them, I'm missing them too (and I work at FHI). I guess my overall feeling is these estimates are hard to make and necessarily imprecise: long-run, large-scale estimates (e.g. what was the likelihood of a nuclear exchange between the US and Russia between 1960 and 1970?) are still very hard to make ex post, let alone ex ante.

One question might be how important further VoI is for particular questions. I guess the overall 'x-risk chance' may have surprisingly small action relevance. The considerations about the relative importance of x-risk reduction seem fairly insensitive to whether the risk is 10^-1 or 10^-5 (at more extreme values you might start having Pascalian worries); instead the discussion hinges on issues like tractability, pop ethics, etc.

Risk share seems more important (e.g. how much more worrying is AI than nuclear war?), yet these comparative judgements can be generally made in relative terms, without having to cash out the absolute values.

Comment by gregory_lewis on Leverage Research: reviewing the basic facts · 2018-08-05T17:47:05.220Z · score: 36 (30 votes) · EA · GW

[My views only]

Although few materials remain from the early days of Leverage (I am confident they acted to remove themselves from wayback, as other sites link to wayback versions of their old documents which now 404), there are some interesting remnants:

  • A (non-wayback) website snapshot from 2013
  • A version of Leverage's plan
  • An early Connection Theory paper

I think this material (and the surprising absence of material since) speaks for itself - although I might write more later anyway.

Per other comments, I'm also excited by the plan of greater transparency from Leverage. I'm particularly eager to find out whether they still work on Connection Theory (and what the current theory is), whether they addressed any of the criticism (e.g. 1, 2) levelled at CT years ago, whether the further evidence and argument mentioned as forthcoming in early documents and comment threads will materialise, and generally what research (on CT or anything else) have they done in the last several years, and when this will be made public.

The person-affecting value of existential risk reduction

2018-04-13T01:44:54.244Z · score: 41 (31 votes)

How fragile was history?

2018-02-02T06:23:54.282Z · score: 11 (13 votes)

In defence of epistemic modesty

2017-10-29T19:15:10.455Z · score: 56 (45 votes)

Beware surprising and suspicious convergence

2016-01-24T19:11:12.437Z · score: 35 (41 votes)

At what cost, carnivory?

2015-10-29T23:37:13.619Z · score: 5 (5 votes)

Don't sweat diet?

2015-10-22T20:15:20.773Z · score: 11 (13 votes)

Log-normal lamentations

2015-05-19T21:07:28.986Z · score: 11 (13 votes)

How best to aggregate judgements about donations?

2015-04-12T04:19:33.582Z · score: 4 (4 votes)

Saving the World, and Healing the Sick

2015-02-12T19:03:05.269Z · score: 12 (12 votes)

Expected value estimates you can take (somewhat) literally

2014-11-24T15:55:29.144Z · score: 4 (4 votes)