Use resilience, instead of imprecision, to communicate uncertainty 2020-07-18T12:09:36.901Z · score: 73 (39 votes)
Reality is often underpowered 2019-10-10T13:14:08.605Z · score: 157 (80 votes)
Risk Communication Strategies for the Very Worst of Cases 2019-03-09T06:56:12.480Z · score: 26 (7 votes)
The person-affecting value of existential risk reduction 2018-04-13T01:44:54.244Z · score: 47 (32 votes)
How fragile was history? 2018-02-02T06:23:54.282Z · score: 18 (15 votes)
In defence of epistemic modesty 2017-10-29T19:15:10.455Z · score: 81 (55 votes)
Beware surprising and suspicious convergence 2016-01-24T19:11:12.437Z · score: 51 (46 votes)
At what cost, carnivory? 2015-10-29T23:37:13.619Z · score: 6 (9 votes)
Don't sweat diet? 2015-10-22T20:15:20.773Z · score: 15 (16 votes)
Log-normal lamentations 2015-05-19T21:07:28.986Z · score: 12 (14 votes)
How best to aggregate judgements about donations? 2015-04-12T04:19:33.582Z · score: 5 (5 votes)
Saving the World, and Healing the Sick 2015-02-12T19:03:05.269Z · score: 12 (12 votes)
Expected value estimates you can take (somewhat) literally 2014-11-24T15:55:29.144Z · score: 14 (7 votes)


Comment by gregory_lewis on What is the increase in expected value of effective altruist Wayne Hsiung being mayor of Berkeley instead of its current incumbent? · 2020-08-07T14:02:26.822Z · score: 46 (14 votes) · EA · GW

I recall Hsiung being in favour of conducting disruptive protests against EAG 2015:

I honestly think this is an opportunity. "EAs get into fight with Elon Musk over eating animals" is a great story line that would travel well on both social and possibly mainstream media.

Organize a group. Come forward with an initially private demand (and threaten to escalate, maybe even with a press release). Then start a big fight if they don't comply.

Even if you lose, you still win because you'll generate massive dialogue!

It is unclear whether the motivation was more 'blackmail threats to stop them serving meat' or 'as Elon Musk will be there we can co-opt this to raise our profile'. Whether Hsiung calls himself an EA or not, he evidently missed the memo on 'eschew narrow minded obnoxious defection against others in the EA community'.

For similar reasons, it seems generally wiser for a community not to help people who previously wanted to throw it under the bus.

Comment by gregory_lewis on Use resilience, instead of imprecision, to communicate uncertainty · 2020-07-23T23:19:03.253Z · score: 4 (2 votes) · EA · GW

My reply is a mix of the considerations you anticipate. With apologies for brevity:

  • It's not clear to me whether avoiding anchoring favours (e.g.) round numbers or not. If my listener, in virtue of being human, is going to anchor on whatever number I provide them, I might as well anchor them on a number I believe to be more accurate.
  • I expect there are better forms of words for my examples which can better avoid the downsides you note (e.g. maybe saying 'roughly 12%' instead of '12%' still helps, even if you give a later articulation).
  • I'm less fussed about precision re. resilience (e.g. 'I'd typically expect drift of several percent from this with a few more hours to think about it' doesn't seem much worse than 'the standard error of this forecast is 6% versus me with 5 hours more thinking time' or similar). I'd still insist something at least pseudo-quantitative is important, as verbal riders may not put the listener in the right ballpark (e.g. does 'roughly' 10% pretty much rule out it being 30%?)
  • Similar to the 'trip to the shops' example in the OP, there's plenty of cases where precision isn't a good way to spend time and words (e.g. I could have counter-productively littered many of the sentences above with precise yet non-resilient forecasts). I'd guess there's also cases where it is better to sacrifice precision to better communicate with your listener (e.g. despite the rider on resilience you offer, they will still think '12%' is claimed to be accurate to the nearest percent, but if you say 'roughly 10%' they will better approximate what you have in mind). I still think when the stakes are sufficiently high, it is worth taking pains on this.
Comment by gregory_lewis on Use resilience, instead of imprecision, to communicate uncertainty · 2020-07-23T22:50:31.876Z · score: 3 (2 votes) · EA · GW

I had in mind the information-theoretic sense (per Nix). I agree the 'first half' is more valuable than the second half, but I think this is better parsed as diminishing marginal returns to information.

Very minor, re. child thread: You don't need to calculate numerically, as: $\log_b(x^2) = 2\log_b(x)$, and $\log_b(1000) = \log_b(100^{1.5}) = 1.5\log_b(100)$. Admittedly the numbers (or maybe the remark in the OP generally) weren't chosen well, given 'number of decimal places' seems the more salient difference than the squaring (e.g. per-thousandths does not have double the information of per-cents, but 50% more).
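The 'double versus 50% more' arithmetic can be checked directly. This is a minimal sketch, assuming (as a simplification not spelled out in the comment) that a per-cent figure picks one of 100 equally spaced levels and a per-thousandth figure one of 1000; the function name `bits_to_specify` is illustrative.

```python
import math

def bits_to_specify(levels):
    """Bits needed to pick one value out of `levels` equally likely options."""
    return math.log2(levels)

percent_bits = bits_to_specify(100)          # forecasts quoted to the nearest 1%
per_thousand_bits = bits_to_specify(1000)    # forecasts quoted to the nearest 0.1%
squared_bits = bits_to_specify(100 ** 2)     # squaring the number of levels

# Ratio of information content relative to per-cents.
ratio_thousandths = per_thousand_bits / percent_bits
ratio_squared = squared_bits / percent_bits

print(f"per-thousandths vs per-cents: {ratio_thousandths:.2f}x the bits")
print(f"squared levels vs per-cents:  {ratio_squared:.2f}x the bits")
```

Squaring the number of levels doubles the bits, whereas adding one decimal place to a two-decimal figure only multiplies them by 3/2.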

Comment by gregory_lewis on Use resilience, instead of imprecision, to communicate uncertainty · 2020-07-23T22:27:36.206Z · score: 2 (1 votes) · EA · GW

It's fairly context dependent, but I generally remain a fan.

There's a mix of ancillary issues:

  • There could be a 'why should we care what you think?' if EA estimates diverge from consensus estimates, although I imagine folks tend to gravitate to neglected topics etc.
  • There might be less value in 'relative to self-ish' accounts of resilience: major estimates in a front facing report I'd expect to be fairly resilient, and so less "might shift significantly if we spent another hour on it".
  • Relative to some quasi-ideal seems valuable though: E.g. "Our view re. X is resilient, but we have a lot of knightian uncertainty, so we're only 60% sure we'd be within an order of magnitude of X estimated by a hypothetical expert panel/liquid prediction market/etc."
  • There might be better or worse ways to package this given people are often sceptical of any quantitative assessment of uncertainty (at least in some domains). Perhaps something like 'subjective confidence intervals' (cf.), although these aren't perfect.

But ultimately, if you want to tell someone an important number you aren't sure about, it seems worth taking pains to be precise, both on it and its uncertainty.

Comment by gregory_lewis on Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post · 2020-07-15T17:30:31.706Z · score: 4 (2 votes) · EA · GW

It is true that given the primary source (presumably this), the implication is that rounding supers to 0.1 hurt them, but 0.05 didn't:

To explore this relationship, we rounded forecasts to the nearest 0.05, 0.10, or 0.33 to see whether Brier scores became less accurate on the basis of rounded forecasts rather than unrounded forecasts. [...]
For superforecasters, rounding to the nearest 0.10 produced significantly worse Brier scores [by implication, rounding to the nearest 0.05 did not]. However, for the other two groups, rounding to the nearest 0.10 had no influence. It was not until rounding was done to the nearest 0.33 that accuracy declined.

Prolonged aside:

That said, despite the absent evidence I'm confident accuracy with superforecasters (and ~anyone else - more later, and elsewhere) does numerically drop with rounding to 0.05 (or anything else), even if this has not been demonstrated to be statistically significant:

From first principles, if the estimate has signal, shaving bits of information from it by rounding should make it less accurate (and it obviously shouldn't make it more accurate, pretty reliably setting the upper bound of our uncertainty to 0).
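The first-principles point can be illustrated under a strong simplifying assumption that is not from the primary source: a perfectly calibrated forecaster whose unrounded forecast equals the true probability p. For a single binary event the expected Brier score of forecast q is p(1-q)^2 + (1-p)q^2 = p(1-p) + (p-q)^2, so rounding can only add the non-negative penalty (p-q)^2. The grid of true probabilities and the function names below are illustrative, and the one-sided form of the Brier score is used for simplicity.

```python
def expected_brier(p, q):
    """Expected (one-sided) Brier score of forecast q when the event has probability p."""
    return p * (1 - q) ** 2 + (1 - p) * q ** 2

def round_to(q, step):
    """Round q to the nearest multiple of step, clipped to [0, 1]."""
    return min(1.0, max(0.0, round(q / step) * step))

# Illustrative assumption: true probabilities spread evenly over 1%..99%.
true_probs = [i / 100 for i in range(1, 100)]

def mean_score(step=None):
    """Average expected Brier score, optionally after rounding each forecast."""
    total = 0.0
    for p in true_probs:
        q = p if step is None else round_to(p, step)
        total += expected_brier(p, q)
    return total / len(true_probs)

baseline = mean_score()
penalties = {step: mean_score(step) - baseline for step in (0.05, 0.10, 0.33)}

for step, penalty in penalties.items():
    print(f"rounding to {step:.2f}: expected Brier score worsens by {penalty:.5f}")
```

In this idealised setting the loss is never negative and grows with coarser rounding; with a noisy forecaster the picture is messier, which is where the statistical power concerns come in.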

Further, there seems very little motivation for the idea we have n discrete 'bins' of probability across the number line (often equidistant!) inside our heads, and as we become better forecasters n increases. That we have some standard error to our guesses (which ~smoothly falls with increasing skill) seems significantly more plausible. As such the 'rounding' tests should be taken as loose proxies to assess this error.

Yet if the error process is this, rather than 'n real values + jitter no more than 0.025', undersampling and aliasing should introduce a further distortion. Even if you think there really are n bins someone can 'really' discriminate between, intermediate values are best seen as a form of anti-aliasing ("Think it is more likely 0.1 than 0.15, but not sure, maybe it's 60/40 between them so I'll say 0.12") which rounding ablates. In other words 'accurate to the nearest 0.1' does not mean the second decimal place carries no information.
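The '60/40 between 0.1 and 0.15' example can be worked through as a small sketch. It assumes the forecaster's credence really is split 60/40 between the two candidate values; the variable names are illustrative.

```python
import math

# Credence over the two 'bins' the forecaster cannot fully decide between.
candidates = {0.10: 0.6, 0.15: 0.4}

# The blended forecast is the credence-weighted average of the candidates.
blended = sum(p * weight for p, weight in candidates.items())

def expected_brier(event_prob, forecast):
    """Expected (one-sided) Brier score of `forecast` if the event has probability `event_prob`."""
    return event_prob * (1 - forecast) ** 2 + (1 - event_prob) * forecast ** 2

# By the forecaster's own lights the event probability is the blended value,
# so compare reporting the blend with reporting either 'bin' outright.
scores = {q: expected_brier(blended, q) for q in (0.10, 0.12, 0.15)}

print(f"blended forecast: {blended:.2f}")
for q, score in scores.items():
    print(f"report {q:.2f}: expected Brier {score:.4f}")
```

Reporting the intermediate value scores best by the forecaster's own lights, which is the sense in which rounding back to a 'bin' throws information away.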

Also, if you are forecasting distributions rather than point estimates (cf. Metaculus), said forecast distributions typically imply many intermediate value forecasts.

Empirically, there's much to suggest a T2 error explanation of the lack of a 'significant' drop. As you'd expect, the size of the accuracy loss grows with both how coarsely things are rounded, and the performance of the forecaster. Even if relatively finer coarsening makes things slightly worse, we may expect to miss it. This looks better to me on priors than these trends 'hitting a wall' at a given level of granularity (so I'd guess untrained forecasters are numerically worse if rounded to 0.1, even if the worse performance means there is less signal to be lost, and in turn makes this hard to 'statistically significantly' detect).

I'd adduce other facts against this too. One is simply that superforecasters are prone to not give forecasts on a 5% scale, using intermediate values instead: given their good calibration, you'd expect them to iron out this Brier-score-costly jitter (also, this would be one of the few things they are doing worse than regular forecasters). You'd also expect discretization in things like their calibration curve (e.g. events they say happen 12% of the time in fact happen 10% of the time, whilst events that they say happen 13% of the time in fact happen 15% of the time), or other derived figures like ROC.

This is ironically foxy, so I wouldn't be shocked for this to be slain by the numerical data. But I'd bet at good odds (north of 3:1) on things like "Typically, for 'superforecasts' of X%, these events happened more frequently than those forecast at (X-1)%, (X-2)%, etc."
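A sketch of how such a bet might be checked against resolved forecasts, assuming records of the form (forecast in whole per-cent, whether the event occurred). The function names and the toy records are hypothetical and are not Good Judgment Project data.

```python
from collections import defaultdict

def observed_frequency_by_forecast(records):
    """Map each forecast percentage to the share of those events that occurred."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for percent, occurred in records:
        counts[percent] += 1
        hits[percent] += int(occurred)
    return {pct: hits[pct] / counts[pct] for pct in sorted(counts)}

def share_of_ordered_neighbours(freqs):
    """Fraction of adjacent pairs (X-1)%, X% where the X% events happened more often."""
    pairs = [(pct, pct + 1) for pct in freqs if pct + 1 in freqs]
    if not pairs:
        return None
    ordered = sum(freqs[high] > freqs[low] for low, high in pairs)
    return ordered / len(pairs)

# Hypothetical toy records: ten forecasts at 12% and ten at 13%.
toy_records = (
    [(12, True)] * 1 + [(12, False)] * 9
    + [(13, True)] * 2 + [(13, False)] * 8
)

freqs = observed_frequency_by_forecast(toy_records)
share = share_of_ordered_neighbours(freqs)
print(freqs, share)
```

With real data each percentage bucket would be small and noisy, so the share of ordered neighbours would need to be compared against what chance alone would produce.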

Comment by gregory_lewis on EA Forum feature suggestion thread · 2020-06-20T07:53:23.878Z · score: 16 (8 votes) · EA · GW

On-site image hosting for posts/comments? This is mostly a minor QoL benefit, and maybe there would be challenges with storage. Another benefit would be that images would not vanish if their original source does.

Comment by gregory_lewis on EA Forum feature suggestion thread · 2020-06-20T07:49:06.302Z · score: 10 (6 votes) · EA · GW

Import from HTML/gdoc/word/whatever: One feature I miss from the old forum was the ability to submit HTML directly. This allowed one to write the post in google docs or similar (with tables, footnotes, sub/superscript, special characters, etc.), export it as HTML, paste into the old editor, and it was (with some tweaks) good to go.

This is how I posted my epistemic modesty piece (which has a table which survived the migration, although the footnote links no longer work). In contrast, when cross-posting it to LW2, I needed the kind help of a moderator - and even they needed to make some adjustments (e.g. 'writing out' the table).

Given such a feature was available before, hopefully it can be done again. It would be particularly valuable for the EA forum as:

  • A fair proportion of posts here are longer documents which benefit from the features available in things like word or gdocs. (But typically less mathematics than LW, so the nifty LATEX editor finds less value here than there).
  • The current editor has much less functionality than word/gdocs, and catching up 'most of the way' seems very labour intensive and could take a while.
  • Most users are more familiar with gdocs/word than editor/markdown/latex (i.e. although I can add special characters with the Latex editor and some googling, I'm more familiar with doing this in gdocs - and I guess folks who have less experience with Latex or using a command line would find this difference greater).
  • Most users are probably drafting longer posts on google docs anyway.
  • Clunkily re-typesetting long documents in the forum editor manually (e.g. tables as image files) poses a barrier to entry, and so encourages linking rather than posting, with (I guess?) less engagement.

A direct 'import from gdoc/word/etc.' would be even better, but an HTML import function alone (given software which has both wordprocessing and HTML export 'sorted' are prevalent) would solve a lot of these problems at a stroke.

Comment by gregory_lewis on EA Forum feature suggestion thread · 2020-06-20T06:54:33.518Z · score: 7 (2 votes) · EA · GW

Footnote support in the 'standard' editor: For folks who aren't fluent in markdown (like me), the current process is switching the editor back and forth to 'markdown mode' to add these footnotes, which I find pretty cumbersome.[1]

[1] So much so I lazily default to doing it with plain text.

Comment by gregory_lewis on Examples of people who didn't get into EA in the past but made it after a few years · 2020-05-30T18:54:55.082Z · score: 22 (8 votes) · EA · GW

I applied for a research role at GWWC a few years ago (?2015 or so), and wasn't selected. I now do research at FHI.

In the interim I worked as a public health doctor. Although I think this helped me 'improve' in a variety of respects, 'levelling up for an EA research role' wasn't the purpose in mind: I was expecting to continue as a PH doctor rather than 'switching across' to EA research in the future; if I was offered the role at GWWC, I'm not sure whether I would have taken it.

There's a couple of points I'd want to emphasise.

1. Per Khorton, I think most of the most valuable roles (certainly in my 'field' but I suspect in many others, especially the more applied/concrete) will not be at 'avowedly EA organisations'. Thus, depending on what contributions you want to make, 'EA employment' may not be the best thing to aim for.

2. Pragmatically, 'avowedly EA organisation roles' (especially in research) tend to be oversubscribed and highly competitive. Thus, even if (notwithstanding the above) this is one's primary target, it seems wise to have a career plan which does not rely on securing such a role (or at least have a backup).

3. Although there are ways one can build 'EA street cred' (or whatever), it's not clear these forms of 'EA career capital' are best even for employment at avowedly EA organisations. I'd guess my current role owes more to (e.g.) my medical and public health background than it does to my forum oeuvre (such as it is).

Comment by gregory_lewis on Why not give 90%? · 2020-03-26T11:42:23.593Z · score: 6 (4 votes) · EA · GW

Part of the story, on a consequentialising-virtue account, is that desire for luxury is typically amenable to being changed in general, if not in Agape's case in particular. Thus her attitude of regret rather than shrugging her shoulders typically makes things go better, if not for her then for third parties who have a shot at improving this aspect of themselves.

I think most non-consequentialist views (including ones I'm personally sympathetic to) would fuzzily circumscribe character traits where moral blameworthiness can apply even if they are incorrigible. To pick two extremes: if Agape was born blind, and this substantially impeded her from doing as much good as she would like, the commonsense view could sympathise with her regret, but insist she really has 'nothing to be sorry about'; yet if Agape couldn't help being a vicious racist, and this substantially impeded her from helping others (say, because the beneficiaries are members of racial groups she despises), this is a character-staining fault Agape should at least feel bad about even if being otherwise is beyond her - plausibly, it would recommend her make strenuous efforts to change even if both she and others knew for sure all such attempts are futile.

Comment by gregory_lewis on Why not give 90%? · 2020-03-25T12:15:34.912Z · score: 11 (4 votes) · EA · GW

Nice one. Apologies for once again offering my 'c-minor mood' key variation: Although I agree with the policy upshot, 'obligatory, demanding effective altruism' does have some disquieting consequences for agents following this policy in terms of their moral self-evaluation.

As you say, Agape does the right thing if she realises (similar to prof procrastinate) that although, in theory, she could give 90% (or whatever) of her income/effort to help others, in practice she knows this isn't going to work out, and so given she wants to do the most good, she should opt for doing somewhat less (10% or whatever), as she foresees being able to sustain this.

Yet the underlying reason for this is a feature of her character which should be the subject of great moral regret. Bluntly: she likes her luxuries so much that she can't abide being without them, despite being aware (inter alia) that a) many people have no choice but to go without the luxuries she licenses herself to enjoy; b) said self-provision implies grave costs to those in great need if (per impossibile) she could give more; c) her competing 'need' doesn't have great non-consequentialist defences (cf. if she was giving 10% rather than 90% due to looking after members of her family); d) there's probably not a reasonable story of desert for why she is in this fortunate position in the first place; e) she is aware of other people, similarly situated to her, who nonetheless do manage to do without similar luxuries and give more of themselves to help others.

This seems distinct from other prudential limitations a wise person should attend to. Agape, when making sure she gets enough sleep, may in some sense 'regret' she has to sleep for several hours each day. Yet it is wise for Agape to sleep enough, and needing to sleep (even if she needs to sleep more than others) is not a blameworthy trait. It is also wise for Agape to give less in the OP given her disposition of, essentially, "I know I won't keep giving to charity unless I also have a sports car". But even if Agape can't help this any more than needing to sleep, this trait is blameworthy.

Agape is not alone in having blameworthy features of her character - I, for one, have many; moral saintliness is rare, and most readers probably could do more to make the world better were they better people. 'Obligatory, demanding effective altruism' would also make recommendations against responses to this fact which are counterproductive (e.g. excessive self-flagellation, scrupulosity). I'd agree, but want to say slightly more about the appropriate attitude as well as the right action - something along the lines of non-destructive and non-aggrandising regret.[1] I often feel EAs tend to err in the direction of being estranged from their own virtue; but they should also try to avoid being too complaisant to their own vice.

[1] Cf. Kierkegaard, Sickness unto Death

Either in confused obscurity about oneself and one’s significance, or with a trace of hypocrisy, or by the help of cunning and sophistry which is present in all despair, despair over sin is not indisposed to bestow upon itself the appearance of something good. So it is supposed to be an expression for a deep nature which thus takes its sin so much to heart. I will adduce an example. When a man who has been addicted to one sin or another, but then for a long while has withstood temptation and conquered -- if he has a relapse and again succumbs to temptation, the dejection which ensues is by no means always sorrow over sin. It may be something else, for the matter of that it may be exasperation against providence, as if it were providence which had allowed him to fall into temptation, as if it ought not to have been so hard on him, since for a long while he had victoriously withstood temptation. But at any rate it is womanish [recte maudlin] without more ado to regard this sorrow as good, not to be in the least observant of the duplicity there is in all passionateness, which in turn has this ominous consequence that at times the passionate man understands afterwards, almost to the point of frenzy, that he has said exactly the opposite of that which he meant to say. Such a man asseverated with stronger and stronger expressions how much this relapse tortures and torments him, how it brings him to despair, "I can never forgive myself for it"; he says. And all this is supposed to be the expression for how much good there dwells within him, what a deep nature he is.

Comment by gregory_lewis on Thoughts on The Weapon of Openness · 2020-02-17T05:15:52.123Z · score: 9 (4 votes) · EA · GW
All else equal, I would expect a secret organisation to have worse epistemics and be more prone to corruption than an open one, both of which would impair its ability to pursue its goals. Do you disagree?

No, I agree with these pro tanto costs of secrecy (and the others you mentioned before). But key to the argument is whether these problems inexorably get worse as time goes on. If so, then the benefits of secrecy inevitably have a sell-by date, and once the corrosive effects spread far enough one is better off 'cutting one's losses' - or never going down this path in the first place. If not, however, then secrecy could be a strategy worth persisting with if the (~static) costs of this are outweighed by the benefits on an ongoing basis.

The proposed trend of 'getting steadily worse' isn't apparent to me. One can find many organisations which typically do secret technical work and have been around for decades (the NSA is one, most defence contractors another, (D)ARPA, etc.). A skim of what they were doing in (say) the 80s versus the 50s doesn't give an impression they got dramatically worse despite the 30 years of secrecy's supposed corrosive impact. Naturally, the attribution is very murky (e.g. even if their performance remained okay, maybe secrecy had gotten much more corrosive but this was outweighed by countervailing factors like much larger investment; maybe they would have fared better under a 'more open' counterfactual) but the challenge of dissecting out the 'being secret * time' interaction term and showing it is negative is a challenge that should be borne by the affirmative case.

Comment by gregory_lewis on EA Survey 2019 Series: Donation Data · 2020-02-14T18:24:26.187Z · score: 8 (3 votes) · EA · GW


Like last year, we ran a full model with all interactions, and used backwards selection to select predictors.

Presuming backwards selection is stepwise elimination, this is not a great approach to model generation. See e.g. this from Frank Harrell: in essence, stepwise tends to be a recipe for overfitting, and thus the models it generates tend to have inflated goodness of fit measures (e.g. R²), overestimated coefficients, and very hard to interpret p values (given the implicit multiple testing in the prior 'steps'). These problems are compounded by generating a large number of new variables (all interaction terms) for stepwise to play with.

Some improvements would be:

1. Select the variables by your judgement, and report that model. If you do any post-hoc additions (e.g. suspecting an interaction term), report these with the rider it is a post-hoc assessment.

2. Have a hold-out dataset to test your model (however you choose to generate it) against. (Cross-validation is an imperfect substitute).

3. Ridge, Lasso, elastic net or other approaches to variable selection.
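The overfitting worry, and why a hold-out set exposes it, can be illustrated with a toy simulation. This is only a caricature of stepwise selection (it keeps the single best-looking predictor out of many pure-noise candidates) and is not the survey's actual procedure; the sample sizes, predictor count and function names are all illustrative assumptions.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def one_trial(rng, n_train=30, n_test=30, n_predictors=40):
    """Select the best of many noise predictors on training rows, then score it on held-out rows."""
    n = n_train + n_test
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_predictors)]
    # 'Selection' step: keep whichever predictor has the highest training R^2.
    best = max(predictors, key=lambda x: pearson_r(x[:n_train], outcome[:n_train]) ** 2)
    r2_train = pearson_r(best[:n_train], outcome[:n_train]) ** 2
    r2_test = pearson_r(best[n_train:], outcome[n_train:]) ** 2
    return r2_train, r2_test

rng = random.Random(0)
trials = [one_trial(rng) for _ in range(200)]
mean_train = sum(t[0] for t in trials) / len(trials)
mean_test = sum(t[1] for t in trials) / len(trials)

print(f"mean R^2 on the data used for selection: {mean_train:.3f}")
print(f"mean R^2 on held-out data:               {mean_test:.3f}")
```

Every predictor here is noise, so any apparent fit on the selection data is an artefact of searching over many candidates, and the held-out figure is the more honest one.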

Comment by gregory_lewis on Thoughts on The Weapon of Openness · 2020-02-13T14:28:35.051Z · score: 15 (8 votes) · EA · GW

Thanks for this, both the original work and your commentary were an edifying read.

I'm not persuaded, although this is mainly owed to the common challenge that noting considerations 'for' or 'against' in principle does not give a lot of evidence of what balance to strike in practice. Consider something like psychiatric detention: folks are generally in favour of (e.g.) personal freedom, and we do not need to think very hard to see how overruling this norm 'for their own good' could go terribly wrong (nor look very far to see examples of just this). Yet these considerations do not tell us what the optimal policy should be relative to the status quo, still less how it should be applied to a particular case.

Although the relevant evidence can neither be fully observed nor fairly sampled, there's a fairly good prima facie case for some degree of secrecy not leading to disaster, and sometimes being beneficial. There's some wisdom of the crowd account that secrecy is the default for some 'adversarial' research; it would be surprising if technological facts proved exceptions to the utility of strategic deception. Bodies that conduct 'secret by default' work have often been around decades (and the states that house them centuries), and although there's much to suggest this secrecy can be costly and counterproductive, the case for their inexorable decay attributable to their secrecy is much less clear cut.

Moreover technological secrecy has had some eye-catching successes: the NSA likely discovered differential cryptanalysis years before it was in the open literature; discretion by early nuclear scientists (championed particularly by Szilard) on what to publish credibly gave the Manhattan project a decisive lead over rival programs. Openness can also have some downsides - the one that springs to mind from my 'field' is that Al-Qaeda started exploring bioterrorism after learning of the United States expressing concern about the same.

Given what I said above, citing some favourable examples doesn't say much (although the nuclear weapon one may have proved hugely consequential). One account I am sympathetic to would be talking about differential (or optimal) disclosure: provide information in the manner which maximally advantages good actors over bad ones. This will recommend open broadcast in many cases: e.g. where there aren't really any bad actors, where the bad actors cannot take advantage of the information (or they know it already, so letting the good actors 'catch up'), where there aren't more selective channels, and so forth. But not always: there seem instances where, if possible, it would be better to preferentially disclose to good actors versus bad ones - and this requires some degree of something like secrecy.

Judging the overall first-order calculus, let alone weighing this against second order concerns (such as noted above), is fraught: although, for what it's worth, I think 'security service' norms tend closer to the mark than 'academic' ones. I understand cybersecurity faces similar challenges around vulnerability disclosure, as 'don't publish the bug until the vendor can push a fix' may not perform as well as one might naively hope: for example, 'white hats' postponing their discoveries hinders collective technological progress, and risks falling behind a 'black hat' community avidly trading tips and tricks. This consideration can also point the other way: if the 'white hats' are much more able than their typically fragmented and incompetent adversaries, the greater the danger of their work 'giving bad people good ideas'. The FBI or whoever may prove much more adept at finding vulnerabilities terrorists could exploit than terrorists themselves. They would be unwise to blog their red-teaming exercises.

Comment by gregory_lewis on Concerning the Recent 2019-Novel Coronavirus Outbreak · 2020-02-03T03:46:27.270Z · score: 22 (14 votes) · EA · GW

All of your examples seem much better than the index case I am arguing against. Commonsense morality attaches much less distaste to cases where those 'in peril' are not crisply identified (e.g. "how many will die in some pandemic in the future" is better than "how many will die in this particular outbreak", which is better than "will Alice, currently ill, live or die?"). It should also find bets on historical events are (essentially) fine, as whatever good or ill implicit in these has already occurred.

Of course, I agree your examples would be construed as to some degree morbid. But my recommendation wasn't "refrain from betting in any question where we can show the topic is to some degree morbid" (after all, betting on GDP of a given country could be construed this way, given its large downstream impacts on welfare). It was to refrain in those cases where it appears very distasteful and for which there's no sufficient justification. As it seems I'm not expressing this balancing consideration well, I'll belabour it.


Say, God forbid, one of my friend's children has a life-limiting disease. On its face, it seems tasteless for me to compose predictions at all on questions like, "will they still be alive by Christmas?" Carefully scrutinising whether they will live or die seems to run counter to the service I should be providing as a supporter of my friend's family and someone with the child's best interests at heart. It goes without saying opening a book on a question like this seems deplorable, and offering (and confirming) bets where I take the pessimistic side despicable.

Yet other people do have good reason for trying to compose an accurate prediction on survival or prognosis. The child's doctor may find themselves in the invidious position where they recognise their duty to give my friend's family the best estimate they can runs at cross purposes to other moral imperatives that apply too. The commonsense/virtue-ethicsy hope would be the doctor can strike the balance that best satisfies these cross-purposes, thus otherwise callous thoughts and deeds are justified by their connection to providing important information to the family.

Yet any incremental information benefit isn't enough to justify anything of any degree of distastefulness. If the doctor opened a prediction market on a local children's hospice, I think (even if they were solely and sincerely motivated for good purposes, such as to provide families with in-expectation better prognostication now and the future) they have gravely missed the mark.

Of the options available, 'bringing money' into it generally looks more ghoulish the closer the connection is between 'something horrible happening' and 'payday!'. A mere prediction platform is better (although still probably the wrong side of the line unless we have specific evidence it will give a large benefit); paying people to make predictions on said platform (but paying for activity and aggregate accuracy rather than direct 'bet results') is also slightly better. "This family's loss (of their child) will be my gain (of some money)" is the sort of grotesque counterfactual good people would strenuously avoid being party to save for exceptionally good reason.


To repeat: it is the balance of these factors - which come in degrees - which determines the final evaluation. So, for example, I'm not against people forecasting the 'nCoV' question (indeed, I do as well), but the addition of money takes it the wrong side of the line (notwithstanding the money being ridden on this for laudable motivation). Likewise I'm happy for people to prop bet on some of your questions pretty freely, as those questions are somewhat less ghoulish, but not on the 'nCoV' one (or some even more extreme versions), etc. etc. etc.

I confess some irritation. Because I think whilst you and Oli are pressing arguments (sorry - "noticing confusion") re. there not being a crisp quality that obtains to the objectionable ones yet not the less objectionable ones (e.g. 'You say this question is 'morbid' - but look here! here are some other questions which are qualitatively morbid too, and we shouldn't rule them all out') you are in fact committed to some sort of balancing account.

I presume (hopefully?) you don't think 'child hospice sweepstakes' would be a good idea for someone to try (even if it may improve our calibration! and it would give useful information re. paediatric prognostication which could be of value to the wider world! and capitalism is built on accurate price signals! etc. etc.) As you're not biting the bullet on these reductios (nor bmg's, nor others) you implicitly accept all the considerations about why betting is a good thing are pro tanto and can be overcome at some extreme limit of ghoulishness etc.

How to weigh these considerations is up for grabs. Yet picking each individual feature of ghoulishness in turn and showing it, alone, is not enough to warrant refraining from highly ghoulish bets (where the true case against would be composed of other factors alongside the one being shown to be individually insufficient) seems an exercise in the fallacy of division.


I also note that all the (few) prop bets I recall in EA up until now (including one I made with you) weren't morbid. Which suggests you wouldn't appreciably reduce the track record of prop bets which show (as Oli sees it) admirable EA virtues of skin in the game.

Comment by gregory_lewis on Concerning the Recent 2019-Novel Coronavirus Outbreak · 2020-02-02T00:09:24.330Z · score: 9 (14 votes) · EA · GW
Both of these are environments in which people participate in something very similar to betting. In the first case they are competing pretty directly for internet points, and in the second they are competing for monetary prices.
Those two institutions strike me as great examples of the benefit of having a culture of betting like this, and also strike me as similarly likely to create offense in others.

I'm extremely confident a lot more opprobrium attaches to bets where the payoff is in money versus those where the payoff is in internet points etc. As you note, I agree certain forecasting questions (even without cash) provoke distaste: if those same questions were on a prediction market the reaction would be worse. (There's also likely an issue of the money leading to a question of one's motivation - if epi types are trying to predict a death toll and not getting money for their efforts, it seems their efforts have a laudable purpose in mind, less so if they are riding money on it).

I agree with you that were there only the occasional one-off bet on the forum that was being critiqued here, the epistemic cost would be minor. But I am confident that a community that had a relationship to betting that was more analogous to how Chi's relationship to betting appears to be, we would have never actually built the Metaculus prediction platform.

This looks like a stretch to me. Chi can speak for themselves, but their remarks don't seem to entail a 'relationship to betting' writ large, but an uneasy relationship to morbid topics in particular. Thus the policy I take them to be recommending (which I also endorse) of refraining from making 'morbid' or 'tasteless' bets (but feel free to prop bet to heart's desire on other topics) seems to have very minor epistemic costs, rather than threatening some transformation of epistemic culture which would mean people stop caring about predictions.

For similar reasons, this also seems relatively costless in terms of other perceptions: refraining from 'morbid' topics for betting only excludes a small minority of questions one can bet upon, leaving plenty of opportunities to signal its virtuous characteristics re. taking ideas seriously whilst avoiding those which reflect poorly upon it.

Comment by gregory_lewis on Concerning the Recent 2019-Novel Coronavirus Outbreak · 2020-02-01T21:53:48.141Z · score: 35 (26 votes) · EA · GW

I emphatically object to this position (and agree with Chi's). As best as I can tell, Chi's comment is more accurate and better argued than this critique, and so the relative karma between the two dismays me.

I think it is fairly obvious that 'betting on how many people are going to die' looks ghoulish to commonsense morality. I think the articulation of why this would be objectionable is only slightly less obvious: the party on the 'worse side' of the bet seems to be deliberately situating themselves to be rewarded as a consequence of the misery others suffer; there would also be suspicion about whether the person might try and contribute to the bad situation seeking a pay-off; and perhaps a sense one belittles the moral gravity of the situation by using it for prop betting.

Thus I'm confident if we ran some survey on confronting the 'person on the street' with the idea of people making this sort of bet, they would not think "wow, isn't it great they're willing to put their own money behind their convictions", but something much more adverse around "holding a sweepstake on how many die".

(I can't find an easy instrument for this beyond asking people/anecdata: the couple of non-EA people I've run this by have reacted either negatively or very negatively, and I know comments on forecasting questions which boil down to "will public figure X die before date Y" register their distaste. If there is a more objective assessment accessible, I'd offer odds at around 4:1 on the ratio of positive:negative sentiment being <1).

Of course, I think such an initial 'commonsense' impression would be very unfair to Sean or Justin: I'm confident they engaged in this exercise only out of a sincere (and laudable) desire to try and better understand an important topic. Nonetheless (and to hold them to much higher standards than my own behaviour) one may suggest it is a lapse of practical wisdom if, whilst acting to fulfil one laudable motivation, one does not temper this with other moral concerns one should also be mindful of.

One needs to weigh the 'epistemic' benefits of betting (including higher order terms) against the 'tasteless' complaint (both in the moral-pluralism case of it possibly being bad, but also the more prudential case of it looking bad to third parties). If the epistemic benefits were great enough, we should reconcile ourselves to the costs of sometimes acting tastelessly (triage is distasteful too) or third parties (reasonably, if mistakenly) thinking less of us.

Yet the epistemic benefits on the table here (especially on the margin of 'feel free to bet, save on commonsense ghoulish topics') are extremely slim. The rate of betting in EA/rationalist land on any question is very low, so the signal you get from small-n bets is trivial. There are other options, especially for this question, which give you much more signal per unit activity - given, unlike the stock market, people are interested in the answer for other-than-pecuniary motivations: both Metaculus and the Johns Hopkins prediction platform have relevant questions which are much more active, and where people are offering more information.

Given the marginal benefits are so slim, they are easily outweighed by the costs Chi notes. And they are.

Comment by gregory_lewis on The Labour leadership election: a high leverage, time-limited opportunity for impact (*1 week left to register for a vote*) · 2020-01-15T07:13:15.527Z · score: 19 (10 votes) · EA · GW

Thanks. I think it would be better, given you are recommending joining and remaining in the party, the 'price' isn't quoted as a single month of membership.

One estimate could be the rate of leadership transitions. There have been ~17 in the last century of the Labour party (ignoring acting leaders). Rounding up, this gives an expected vote for every 5 years of membership, and so a price of ~£4.38*60 = ~£250 per leadership contest vote. This looks a much less attractive value proposition to me.
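The arithmetic behind that estimate can be sketched as follows. This is only a rough illustration using the figures quoted in the comment (£4.38 per month, ~17 transitions per century, and the comment's own rounding to one contest per 5 years); the function and variable names are made up for the example.

```python
# Figures taken from the comment above; everything else is illustrative.
MONTHLY_FEE_GBP = 4.38
TRANSITIONS_PER_CENTURY = 17
ROUNDED_YEARS_PER_CONTEST = 5  # the comment's own rounding


def cost_per_vote(monthly_fee: float, years_between_contests: float) -> float:
    """Membership fees paid, on average, per leadership contest vote."""
    return monthly_fee * 12 * years_between_contests


# Unrounded interval between contests implied by ~17 transitions in 100 years.
unrounded_years_per_contest = 100 / TRANSITIONS_PER_CENTURY

rounded_cost = cost_per_vote(MONTHLY_FEE_GBP, ROUNDED_YEARS_PER_CONTEST)
unrounded_cost = cost_per_vote(MONTHLY_FEE_GBP, unrounded_years_per_contest)

print(f"Cost per vote at one contest per 5 years: £{rounded_cost:.2f}")
print(f"Cost per vote at one contest per {unrounded_years_per_contest:.1f} years: £{unrounded_cost:.2f}")
```

Either way the figure lands in the low hundreds of pounds, the same ballpark as the ~£250 quoted, rather than the single month's fee.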

Comment by gregory_lewis on The Labour leadership election: a high leverage, time-limited opportunity for impact (*1 week left to register for a vote*) · 2020-01-13T21:29:56.998Z · score: 28 (12 votes) · EA · GW

Forgive me, but your post didn't exactly avoid any doubt, given:

1) The recommendation in the second paragraph is addressed to everyone regardless of political sympathy:

We believe that, if you're a UK citizen or have lived in the UK for the last year, you should pay £4.38 to register to vote in the current Labour leadership, so you can help decide 1 of the 2 next candidates for Prime Minister. (My emphasis)

2) Your OP itself gives a few reasons for why those "indifferent or hostile to Labour Party politics" would want to be part of the selection. As you say:

For £4.38, you have a reasonable chance of determining the next candidate PM, and therefore having an impact in the order of billions of pounds. (Your emphasis)

Even a committed conservative should have preferences on "conditional on Labour winning in the next GE, which Labour MP would I prefer as PM?" (/plus the more Machiavellian "who is the candidate I'd most want leading Labour, given I want them to lose to the Conservatives?").

3) Although the post doesn't advocate joining just to cancel after voting, noting that one can 'cancel any time', alongside the main motivation being offered taking advantage of a time-limited opportunity for impact (and alongside the quoted cost being a single month of membership) makes this strategy not a dazzling feat of implicature (indeed, it would be the EV-maximising option taking the OP's argument at face value).


Had the post merely used the oncoming selection in Labour to note there is an argument for political party participation similar to voting (i.e. getting a say in the handful of leading political figures); clearly stressed this applied across the political spectrum (and so was more a recommendation that EAs consider this reason to join the party they are politically sympathetic to, in expectation of voting in future leadership contests, rather than the one which happens to have a leadership contest on now); and strenuously disclaimed any suggestion of hit and run entryism (noting defection for various norms with existing members of the party, membership mechanisms being somewhat based on trust that folks aren't going to 'game them', etc.), I would have no complaints. But it didn't (although I hope it will), so here we are.

Comment by gregory_lewis on The Labour leadership election: a high leverage, time-limited opportunity for impact (*1 week left to register for a vote*) · 2020-01-13T16:44:05.746Z · score: 35 (21 votes) · EA · GW

I'm not a huge fan of schemes like this, as it seems the path to impact relies upon strategic defection of various implicit norms.

Whether or not political party membership asks one to make some sort of political declaration, the spirit of membership is surely meant for those who sincerely intend to support the politics of the party in question.

I don't think Labour members (perhaps authors of this post excluded) or leadership would want to sell a vote for their future leader at £4.38 each to anyone willing to fill out an application form - especially to those indifferent or hostile to Labour Party politics. That we can buy one anyway (i.e. sign up then leave a month later) suggests we do so by taking advantage of their good faith: that folks signing up aren't just doing it to get a vote on the leadership election, that they intend to stick around for a while, that they'll generally vote for and support Labour, etc.

If this 'hit and run entryism' became a common tactic (e.g. suppose 'tactical tories' pretended to defect from the Conservatives this month to vote for the Labour candidate the Conservatives wanted to face in the next election) we would see parties act to close this vulnerability (I think the Conservatives did something like this in terms of restricting eligible members to those joining before a certain date for their most recent leadership contest).

I'd also guess that ongoing attempts to 'game' this sort of thing are bad for the broader political climate, as (as best as I can tell) a lot of it runs on trust rather than being carefully proofed against canny selectoral tactics (e.g. although all parties state you shouldn't be a member of more than one at a time, I'm guessing it isn't that hard to 'get away with it'). Perhaps leader selection is too important to justly leave to only party members (perhaps there should be 'open primaries'), but 'hit and run entryism' seems very unlikely to drive one towards this, but merely greater barriers to entry for party political participation, and lingering suspicion and mistrust.

Comment by gregory_lewis on Has pledging 10% made meeting other financial goals substantially more difficult? · 2020-01-09T14:56:32.313Z · score: 8 (5 votes) · EA · GW

FWIW I have found it more costly - I think this almost has to be true, as $X given to charity is $X I cannot put towards savings, mortgages, etc. - but, owing to fortunate circumstances, not very burdensome to deal with. I expect others will have better insight to offer.

Given your worries, an alternative to the GWWC pledge which might be worth contemplating is the one at The Life You Can Save. Their recommended proportion varies by income (i.e. a higher % with larger incomes), and is typically smaller than GWWC across most income bands (on their calculator, you only give 10% at ~$500 000 USD, and <2% up to ~$100 000).

Another suggestion I would make is it might be worth waiting for a while longer than "Once I have a job and I'm financially secure" before making a decision like this. It sounds like some of your uncertainties may become clearer with time (e.g. once you enter your career you may get a clearer sense of what your earning trajectory is going to look like, developments in your personal circumstances may steer you towards or away from buying a house). Further, 'experimenting' with giving different proportions may also give useful information.

How long to wait figuring things out doesn't have an easy answer: most decisions can be improved by waiting to gather more information, but most also shouldn't be 'put off' indefinitely. That said, commonsense advice would be to give oneself plenty of time when weighing up whether to make important lifelong commitments. Personally speaking, I'm glad I joined GWWC (when I was still a student), and I think doing so was the right decision, but - although I didn't rush in on a whim - I think a wiser version of me would have taken greater care than I in fact did.

Comment by gregory_lewis on In praise of unhistoric heroism · 2020-01-08T11:03:15.042Z · score: 41 (20 votes) · EA · GW


Forgive me playing to type and offering a minor-key variation on the OP's theme. Any EA predisposition for vainglorious grasping after heroism is not only an unedifying shape to draw one's life, but also implies attitudes that are themselves morally ugly.

There are some (mercifully few) healthcare professionals who are in prison: so addicted to the thrill of 'saving lives' they deliberately inflicted medical emergencies on their patients so they had the opportunity to 'rescue' them.

The error in 'EA-land' is of a similar kind (but a much lower degree): it is much better from the point of view of the universe that no one needs your help. To wish instead they are arranged in jeopardy as some potemkin vale of soul-making to demonstrate one's virtue (rightly, ego) upon is perverse.

(I dislike 'opportunity' accounts of EA for similar reasons: that (for example) millions of children are likely to die before their fifth birthday is a grotesque outrage to the human condition. Excitement that this also means one has the opportunity to make this number smaller is inapt.)

Likewise, 'total lifetime impact (in expectation)' is the wrong unit of account to judge oneself. Not only because moral luck intervenes in who you happen to be (more intelligent counterparts of mine could 'do more good' than I - but this can't be helped), but also in what world one happens to inhabit.

I think most people I met in medical school (among other comparison classes) are better people than I am: across the set of relevant possible circumstances we could find ourselves, I'd typically 'do less good' than the cohort average. If it transpires I end up doing much more good than them, it will be due to the accident where particular features of mine - mainly those I cannot take moral credit for, and some of which are blameworthy - happen to match usefully to particular features of the world which themselves should only be the subject of deep regret. Said accident is scant cause for celebration.

Comment by gregory_lewis on EA Survey 2019 Series: Cause Prioritization · 2020-01-07T15:18:16.919Z · score: 5 (3 votes) · EA · GW

It was commendable to seek advice, but I fear in this case the recommendation you got doesn't hit the mark.

I don't see the use of 'act (as if)' as helping much. Firstly, it is not clear what it means to be 'wrong about' 'acting as if the null hypothesis is false', but I don't think however one cashes this out it avoids the problem of the absent prior. Even if we say "We will follow the policy of rejecting the null whenever p < alpha", knowing the error rate of this policy overall still demands a 'grand prior' of something like "how likely is a given (/randomly selected?) null hypothesis we are considering to be true?"

Perhaps what Lakens has in mind is as we expand the set of null hypotheses we are testing to some very large set the prior becomes maximally uninformative (and so alpha converges to the significance threshold), but this is deeply uncertain to me - and, besides, we want to know (and a reader might reasonably interpret the rider as being about) the likelihood of this policy getting the wrong result for the particular null hypothesis under discussion.


As I fear this thread demonstrates, p values are a subject which tends to get more opaque the more one tries to make them clear (only typically rivalled by 'confidence interval'). They're also generally much lower yield than most other bits of statistical information (i.e. we generally care a lot more about narrowing down the universe of possible hypotheses by effect size etc. rather than simply excluding one). The write-up should be welcomed for providing higher yield bits of information (e.g. effect sizes with CIs, regression coefficients, etc.) where it can.

Most statistical work never bothers to crisply explain exactly what it means by 'significantly different (P = 0.03)' or similar, and I think it is defensible to leave it at that rather than wading into the treacherous territory of trying to give a clear explanation (notwithstanding the fact the typical reader will misunderstand what this means). My attempt would be not to provide an 'in-line explanation', but offer an explanatory footnote (maybe after the first p value), something like this:

Our data suggests a trend/association between X and Y. Yet we could also explain this as a matter of luck: even though in reality X and Y are not correlated [or whatever], it may be we just happened to sample people where those high in X also tended to be high in Y, in the same way a fair coin might happen to give more heads than tails when we flip it a number of times.
A p-value tells us how surprising our results would be if they really were just a matter of luck: strictly, it is the probability of our study giving results as or more unusual than our data if the 'null hypothesis' (in this case, there is no correlation between X and Y) was true. So a p-value of 0.01 means our data is in the top 1% of unusual results, a p-value of 0.5 means our data is in the top half of unusual results, and so on.
A p-value doesn't say all that much by itself - crucially, it doesn't tell us the probability of the null hypothesis itself being true. For example, a p-value of 0.01 doesn't mean there's a 99% probability the null hypothesis is false. A coin being flipped 10 times and landing heads on all of them is in the top percentile (indeed, roughly the top 0.1%) of unusual results presuming the coin is fair (the 'null hypothesis'), but we might have reasons, even after seeing only heads after flipping it 10 times, to believe it is probably fair anyway (maybe we made it ourselves with fastidious care, maybe it's being simulated on a computer and we've audited the code, or whatever). At the other extreme, a p-value of 1.0 doesn't mean we know for sure the null hypothesis is true: although seeing 5 heads and 5 tails from 10 flips is the least unusual result given the null hypothesis (and so all possible results are 'as or more unusual' than what we've seen), it could be the coin is unfair and we just didn't see it.
What we can use a p-value for is as a rule of thumb for which apparent trends are worth considering further. If the p-value is high, the 'just a matter of luck' explanation for the trend between X and Y is credible enough we shouldn't over-interpret it; on the other hand, a low p-value makes the apparent trend between X and Y an unusual result if it really were just a matter of luck, and so we might consider alternative explanations (e.g. our data wouldn't be such an unusual finding if there really was some factor that causes those high in X to also be high in Y).
'High' and 'low' are matters of degree, but one usually sets a 'significance threshold' to make the rule of thumb concrete: when a p-value is higher than this threshold, we dismiss an apparent trend as just a matter of luck - if it is lower, we deem it significant. The standard convention is for this threshold to be p=0.05.
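
The coin-flip figures in the footnote above can be checked with a short exact calculation. This is a minimal sketch, assuming a fair coin as the null hypothesis; the function names are illustrative, and the 'two-sided' version counts results at least as far from an even split as the one observed.

```python
from math import comb


def one_sided_p_value(heads: int, flips: int) -> float:
    """P(at least `heads` heads in `flips` flips) if the coin is fair."""
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2**flips


def two_sided_p_value(heads: int, flips: int) -> float:
    """P(a result at least as far from an even split as observed) if the coin is fair."""
    observed_distance = abs(heads - flips / 2)
    as_or_more_unusual = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_distance
    )
    return as_or_more_unusual / 2**flips


# Ten heads from ten flips: about 1 in 1024, i.e. roughly the top 0.1% of results.
print(one_sided_p_value(10, 10))

# Five heads from ten flips: every possible result is 'as or more unusual'.
print(two_sided_p_value(5, 10))
```

Neither number says how likely the coin is to be fair; that still depends on what one believed about the coin beforehand.
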
Comment by gregory_lewis on EA Survey 2019 Series: Cause Prioritization · 2020-01-03T11:14:24.032Z · score: 6 (3 votes) · EA · GW

Good work. A minor point:

I think the riders when discussing significant results along the lines of "being wrong 5% of the time in the long run" sometimes don't make sense. Compare

How substantial are these (likely overestimated) associations? We highlight here only the largest detected effects in our data (odds ratio close to or above 2 times greater) that would be surprising to see, if there were no associations in reality and we accepted being wrong 5% of the time in the long run.


Welch t-tests of gender against these scaled cause ratings have p-values of 0.003 or lower, so we can act as if the null hypothesis of no difference between genders is false, and we would not be wrong more than 5% of the time in the long run.

Although commonly the significance threshold is equated with the 'type 1 error rate' which in turn is equated with 'the chance of falsely rejecting the null hypothesis', this is mistaken (1). P values are not estimates of the likelihood of the null hypothesis, but of the observation (as or more extreme) conditioned on that hypothesis. P(Null|significant result) needs one to specify the prior. Likewise, T1 errors are best thought of as the 'risk' of the test giving the wrong indication, rather than the risk of you making the wrong judgement. (There are also some remarks on family-wise versus false discovery rates which can be neglected.)

So the first quote is sort-of right (although assuming the null then talking about the probability of being wrong may confuse rather than clarify), but the second one isn't: you may (following standard statistical practice) reject the null hypothesis given P < 0.05, but this doesn't tell you there is a 5% chance of the null being true when you do so.
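
A toy Bayes calculation may make the dependence on the prior concrete. This is only a sketch: it treats 'significant' as a yes/no event at threshold alpha, and the power of 0.8 and the priors below are assumptions picked for illustration, not figures from the survey.

```python
def prob_null_given_significant(
    prior_null: float, alpha: float = 0.05, power: float = 0.8
) -> float:
    """P(null is true | test came out significant), via Bayes' rule.

    prior_null: assumed prior probability that the null hypothesis is true.
    alpha:      significance threshold, i.e. P(significant | null true).
    power:      assumed P(significant | null false); 0.8 is a placeholder.
    """
    false_positives = alpha * prior_null
    true_positives = power * (1 - prior_null)
    return false_positives / (false_positives + true_positives)


# Same test and same threshold, but different priors give very different answers.
for prior in (0.1, 0.5, 0.9):
    posterior = prob_null_given_significant(prior)
    print(f"prior P(null) = {prior:.1f} -> P(null | significant) = {posterior:.3f}")
```

Under these assumptions the chance of the null being true after a significant result only equals 5% for one particular prior; in general it can be well above or below it.
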

Comment by gregory_lewis on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-30T19:00:25.931Z · score: 10 (6 votes) · EA · GW

I think it would be worthwhile to separate these out from the text, and (especially) to generate predictions that are crisp, distinctive, and can be resolved in the near term. The QRI questions on metaculus are admirably crisp (and fairly near term), but not distinctive (they are about whether certain drugs will be licensed for certain conditions - or whether evidence will emerge supporting drug X for condition Y, which offer very limited evidence for QRI's wider account 'either way').

This is somewhat more promising from your most recent post:

I’d expect to see substantially less energy in low-frequency CSHWs [Connectome-Specific Harmonic Waves] after trauma, and substantially more energy in low-frequency CSHWs during both therapeutic psychedelic use (e.g. MDMA therapy) and during psychological integration work.

This is crisp, plausibly distinctive, yet resolving this requires a lot of neuroimaging work which (presumably) won't be conducted anytime soon. In the interim, there isn't much to persuade a sceptical prior.

Comment by gregory_lewis on EA Hotel Fundraiser 5: Out of runway! · 2019-11-25T09:49:33.692Z · score: 11 (6 votes) · EA · GW

I agree it would be surprising if EA happened upon the optimal cohabitation level (although perhaps not that surprising, given individuals can act by the lights of their best interest which may reasonably approximate the global optimum), yet I maintain the charitable intervention hypothetical is a poor intuition pump as most people would be dissuaded from 'intervening' to push towards the 'optimal cohabitation level' for 'in practice' reasons - e.g. much larger potential side-effects of trying to twiddle this dial, preserving the norm of leaving people to manage their personal lives as they see best, etc.

I'd probably want to suggest the optimal cohabitation level is below what we currently observe (e.g. besides the issue Khorton mentions, cohabitation with your employees/bosses/colleagues or funder/fundee seems to run predictable risks), yet be reluctant to 'intervene' any further up the coercion hierarchy than expressing my reasons for caution.

Comment by gregory_lewis on Are comment "disclaimers" necessary? · 2019-11-25T09:16:09.848Z · score: 20 (10 votes) · EA · GW

I sometimes disclaim (versus trying to always disclose relevant CoI), with a rule-of-thumb along the lines of the expected disvalue of being misconstrued as presenting a corporate view of my org.

This is a mix of likelihood (e.g. I probably wouldn't bother disclaiming an opinion on - say - SCI versus AMF, as a reasonable person is unlikely to think there's going to be an 'FHI view' on global health interventions) and impact (e.g. in those - astronomically rare - cases I write an asperous criticism of something-or-other, even if it's pretty obvious I'm not speaking on behalf of my colleagues, I might want to make extra-sure).

I agree it isn't ideal (cf. Twitter, where it seems a lot of people need to expressly disclaim retweets are not endorsements, despite this norm being widely acknowledged and understood). Alas, some 'defensive' writing may be necessary if there are uncharitable or malicious members of one's audience, and on the internet this can be virtually guaranteed.

Also, boilerplate disclaimers don't magically prevent what you say reflecting upon your affiliates. I doubt EA org X, who has some association with Org Y, would be happy with a staffer saying something like, "Figuratively speaking, I hope we burn the awful edifice of Org Y - wrought out of its crooked and rotten timber from which nothing good and straight was ever made - to the ground, extirpate every wheedling tendril of its fell influence in our community, and salt the sewage-suffused earth from whence it came [speaking for myself, not my employer]". I get the impression I bite my tongue less than the typical 'EA org employee': it may be they are wiser, rather than I braver.

Comment by gregory_lewis on EA Hotel Fundraiser 5: Out of runway! · 2019-11-25T07:48:09.788Z · score: 14 (7 votes) · EA · GW

The reversal test doesn't mean 'if you don't think a charity for X is promising, you should be in favour of more ¬X'. I may not find homeless shelters, education, or climate change charities promising, yet not want to move in the direction of greater homelessness, illiteracy, or pollution.

If (like me) you'd prefer EA to move in the direction of 'professional association' rather than 'social movement', this attitude's general recommendation to move away from communal living (generally not a feature of the former, given the emphasis on distinguishing between personal and professional lives) does pass the reversal test, as I'd forecast having the same view even if the status quo was everyone already living in group houses (or vice versa).

Comment by gregory_lewis on Updates from Leverage Research: history, mistakes and new focus · 2019-11-23T18:40:13.144Z · score: 40 (15 votes) · EA · GW

Scrubbing yourself from seems to be an action taken not from desire to save time communicating, but from a desire to avoid others learning. It seems like that's a pretty big factor that's going on here and would be worth mentioning.

This is especially jarring alongside the subsequent recommendation (3.2.5) that one should withhold judgement on whether 'Leverage 1.0' was a good use of resources given (inter alia) the difficulties of assessing unpublished research.

Happily, given the laudable intention of Leverage to present their work going forward (including, one presumes, developments of efforts under 'Leverage 1.0'), folks who weren't around at the start of the decade will be able to do this - and those of us who were should find the ~2012 evidence base superseded.

Comment by gregory_lewis on Leverage Research: reviewing the basic facts · 2019-11-21T16:54:27.908Z · score: 18 (5 votes) · EA · GW

I also don't know much about it, but I think Reserve includes a couple of coins. 'Reserve Rights' is not intended to be a stablecoin (I think it is meant to perform some function for the stablecoin system, but I'm ignorant of what it is), whilst 'Reserve', yet to be released, is meant to be stable.

Comment by gregory_lewis on EA Hotel Fundraiser 6: Concrete outputs after 17 months · 2019-11-06T16:22:28.185Z · score: 14 (5 votes) · EA · GW

In fairness, a lot of the larger grants/projects are not seeking funding from smaller donors, so discussing (e.g.) OpenPhil's latest grants may not be hugely action relevant.

I'd also guess that some critics may not be saying much not because they're put off by sounding mean, but rather their critical view arises from their impression of the existing evidence/considerations rather than from something novel to the existing discussion. If (e.g.) one believes the hotel has performed poorly in terms of outputs given inputs it seems unnecessary to offer that as commentary: folks (and potential donors) can read the OP and related documents themselves and come to their own conclusion.

Comment by gregory_lewis on Deliberation May Improve Decision-Making · 2019-11-05T15:23:45.908Z · score: 2 (1 votes) · EA · GW

I'm not sure how google scholar judges relevance (e.g. I can imagine eye-catching negative results also being boosted up the rankings) but I agree it is a source of distortion - I'd definitely offer it as 'better than nothing' rather than good. (Perhaps one tweak would be to sample by a manageable date range rather than relevance, although one could worry about time trends).

A better option (although it has some learning curve and onerousness) is to query a relevant repository, export all the results, and take a random sample from these.
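
As a rough sketch of that last step, assuming the exported results have already been loaded into a list (the record names, sample size, and seed here are hypothetical placeholders):

```python
import random


def draw_random_sample(records: list, k: int, seed: int) -> list:
    """Draw k records uniformly at random, without replacement.

    Fixing the seed in advance makes the draw reproducible, so the sample
    cannot be quietly re-drawn until it looks favourable.
    """
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


# Stand-in for the exported search results; real records would come from the export file.
exported = [f"result-{i}" for i in range(1, 201)]

sample = draw_random_sample(exported, k=20, seed=2019)
print(sample)
```

Recording the search query, the export date, and the seed alongside the sample is what lets a reader check the selection was not cherry-picked.
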

Comment by gregory_lewis on Deliberation May Improve Decision-Making · 2019-11-05T11:16:03.437Z · score: 16 (8 votes) · EA · GW

It seems like this review contains a relative paucity of research supporting the null hypothesis that deliberation does not improve decision making (or, for that matter, the alternative hypothesis that it actually worsens decision making). Were you unable to find studies taking this position? If not, how worried are you about the file-drawer effect here?

One approach worth considering when engaging with topics like these is to adopt a systematic review rather than a narrative one. The part which would be particularly helpful re. selection worries is having a pre-defined search strategy, which can guard against inadvertently gathering a skewed body of evidence. (If there are lots of quantitative results, there are statistical methods to assess publication bias, but that typically won't be relevant here - that said, there are critical appraisal tools you can use to score study quality which offer an indirect indication, as well as value generally in getting a sense of how trustworthy the consensus of the literature is).

These are a lot more work, but may be worth it all the same. With narrative reviews on topics where I expect there to be a lot of relevant (primary) literature, I always have the worry one could tell a similarly persuasive story for many different conclusions depending on whether the author happened upon (or was more favourably disposed to) one or other region of the literature.

'Quick and dirty' systematisation can also help: e.g. 'I used search term [x] in google scholar and took the first 20 results - of the relevant ones, 8 were favourable, and 1 was neutral'.

Comment by gregory_lewis on EA Hotel Fundraiser 5: Out of runway! · 2019-10-26T07:02:09.470Z · score: 31 (16 votes) · EA · GW

We were going to wait until “Giving Season” starts in December to start a fresh fundraiser, but now can’t afford to do that. We have several things in the pipeline to bolster our case (charity registration, compiling more outputs, more fundraising posts detailing the case for the hotel, hiring round for the Community & Projects Manager, refining internal systems), but they may not reach fruition in time unfortunately.

I'd expect the EA hotel to have fairly constant operating costs (and thus reliable forecasts of runway remaining). So I'd be keen to know what happened to leave the EA hotel in a position where its planned fundraising efforts would occur after it had already run out of money.

More directly, I'm concerned that the already-linked fb group discussion suggests the EA hotel bought the derelict building next door. I hesitate to question generosity, and it is unclear when this happened - or how much it cost - but CapEx (especially in this case where the CapEx doesn't secure more capacity but an option on more capacity, as further investment is needed to bring it online) when one has scarce reserves and uncertain funding looks inopportune.

Comment by gregory_lewis on Probability estimate for wild animal welfare prioritization · 2019-10-24T11:39:27.434Z · score: 2 (1 votes) · EA · GW

Yet in these cases, it is the lexicality, not the suffering focus, which is doing the work to avoid the counter-example. A total utilitarian could adopt lexicality in a similar way to avoid the (very/) repugnant conclusion (e.g., lives in a 'repugnant region' between zero and some 'barely worth living' should be 'counted as zero' when weighing things up, save as a tie-breaker between equally-good worlds). [I'm not recommending this approach - lexicality also has formidable costs across the scales from its potential to escape population ethics counter-examples].

It seems to miss the mark to say it is an advantage for suffering-focused views to avoid the (v/) repugnant conclusion, if the 'suffering focus' factor, taken alone, merely exchanges the (v/) repugnant conclusion for something which looks even worse by the lights of common intuition; and where the resources that can be called upon to avoid either counter-example are shared between SF and ¬SF views.

Comment by gregory_lewis on Conditional interests, asymmetries and EA priorities · 2019-10-22T19:03:55.262Z · score: 2 (1 votes) · EA · GW

I claim we can do better than simply noting 'all theories have intuitive costs, so which poison you pick is a judgement call'. In particular, I'm claiming that the 'only thwarted preferences count' view poses extra intuitive costs: that for any intuitive population ethics counter-example C we can confront a 'symmetric' theory with, we can dissect the underlying engine that drives the intuitive cost, find it is orthogonal to the 'only thwarted preferences count' disagreement, and thus construct a parallel C* to the 'only thwarted preferences count' view which uses the same engine and is similarly counter-intuitive, and often a C** which is even more counter-intuitive as it turns the screws to exploit the facial counter-intuitiveness of the 'only thwarted preferences count' view. I.e.

Alice: Only counting thwarted preferences looks counter-intuitive (e.g. we generally take very happy lives as 'better than nothing', etc.); classical utilitarianism looks better.

Bob: Fair enough, these things look counter-intuitive, but theories are counter-intuitive. Classical utilitarianism leads to the very repugnant conclusion (C) in population ethics, after all, whilst mine does not.

Alice: Not so fast. Your view avoids the very repugnant conclusion, but if you share the same commitments re. continuity etc., these lead your view to imply the similarly repugnant conclusion (and motivated by factors shared between our views) that any n lives tormented are preferable to some much larger m of lives which suffer some mild dissatisfaction (C*).

Furthermore, your view is indifferent to how (commonsensically) happy the m people are, so (for example) 10^100 tormented lives are better than TREE(9) lives which are perfectly blissful, but for a 1 in TREE(3) chance [to emphasise, this chance is much smaller than P(0.0 ...[write a zero on every Planck length in the observable universe]...1)] of suffering an hour of boredom once in their life. (C**)

Bob can adapt his account to avoid this conclusion (e.g. dropping continuity), but Alice can adapt her account in a parallel fashion to avoid the very repugnant conclusion too. Similarly, 'value receptacle'-style critiques seem a red herring, as even if they are decisive for preference views over hedonic ones in general, they do not rule between 'only thwarted preferences count' and 'satisfied preferences count too' in particular.

Comment by gregory_lewis on Conditional interests, asymmetries and EA priorities · 2019-10-22T17:09:33.619Z · score: 2 (1 votes) · EA · GW

I think the part that's the most unacceptable about the repugnant conclusion is that you go from an initial paradise where all the people who exist are perfectly satisfied (in terms of both life goals and hedonics) to a state where there's suffering and preference dissatisfaction.

I hesitate to exegete intuitions, but I'm not convinced this is the story for most. Parfit's initial statement of the RP didn't stipulate the initial population were 'perfectly satisfied' but that they 'merely' had a "very high quality of life" (cf.). Moreover, I don't think most people find the RP much less unacceptable if the initial population merely enjoys very high quality of life versus perfect satisfaction.

I agree there's some sort of intuition that 'very good' should be qualitatively better than 'barely better than nothing', so one wants to resist being nickel-and-dimed into the latter (cf. critical level util, etc.). I also agree there are person-affecting intuitions (although there are natural moves like making the addition of A+ also increase the welfare of those originally in A, etc.)

Comment by gregory_lewis on Conditional interests, asymmetries and EA priorities · 2019-10-22T16:58:37.667Z · score: 2 (1 votes) · EA · GW

For what it's worth, that example is a special case of the Sadistic Conclusion

It isn't (at least not as Arrhenius defines it). Further, the view you are proposing (and which my example was addressed to) can never endorse a sadistic conclusion in any case. If lives can only range between more or less bad (i.e. fewer or more unsatisfied preferences, but the amount/proportion of satisfied preferences has no moral bearing), the theory is never in a position to recommend adding 'negative welfare' lives over 'positive welfare' ones, as it denies one can ever add 'positive welfare' lives.

Although we might commonsensically say people in A, or A+ in the repugnant conclusion (or 'A' in my example) have positive welfare, your view urges us that this is mistaken, and we should take them to be '-something relatively small' versus tormented lives which are '- a lot': it would still be better for those in any of the 'A cases' had they not come into existence at all.

Where we put the 'zero level' doesn't affect the engine of the repugnant conclusion I identify: if we can 'add up' lots of small positive increments (whether we are above or below the zero level), this can outweigh a smaller number of much larger negative shifts. In the (very/) repugnant conclusion, a vast multitude of 'slightly better than nothing' lives can outweigh very large negative shifts to a smaller population (either to slightly better than nothing, or, in the very repugnant case, to something much worse). In mine, avoiding a vast multitude of 'slightly worse than nothing' lives can be worth making a smaller group have 'much worse than nothing' lives.

As you say, you can drop separability, continuity (etc.) to avoid the conclusion of my example, but these are resources available for (say) a classical utilitarian to adopt to avoid the (very/) repugnant conclusion too (naturally, these options also bear substantial costs). In other words, I'm claiming that although this axiology avoids the (v/) repugnant conclusion, if it accepts continuity etc. it makes similarly counter-intuitive recommendations, and if it rejects them it faces parallel challenges to a theory which accepts positive utility lives which does the same.

Why I say it fares 'even worse' is that most intuit 'an hour of boredom and (say) a millennium of a wonderfully happy life' is much better, and not slightly worse, than nothing at all. Thus although it seems costly (for reasons parallel to the repugnant conclusion) to accept any number of tormented lives could be preferable to some vastly larger number of lives that (e.g.) pop into existence to briefly experience mild discomfort/preference dissatisfaction before ceasing to exist again, it seems even worse for the theory to be indifferent to each of these lives now being long ones which, apart from this moment of brief preference dissatisfaction, experience unalloyed joy/preference fulfilment, etc.

Comment by gregory_lewis on Conditional interests, asymmetries and EA priorities · 2019-10-22T10:54:16.939Z · score: 5 (3 votes) · EA · GW

I think it is generally worth seeing population ethics scenarios (like the repugnant conclusion) as being intuition pumps of some principle or another. The core engine of the repugnant conclusion is (roughly) the counter-intuitive implications of how a lot of small things can outweigh a large thing. Thus a huge multitude of 'slightly better than not' lives can outweigh a few very blissful ones (or, turning the screws as Arrhenius does, for any number of blissful lives, there is some - vastly larger - number of 'slightly better than not' lives for which it would be worth making these lives terrible.)

Yet denying lives can ever go better than neutral (counter-intuitive to most - my life isn't maximally good, but I think it is pretty great and better than nothing) may evade the repugnant conclusion, but doesn't avoid this core engine of 'lots of small things can outweigh a big thing'. Among a given (pre-existing, so possessing actual interests, not that this matters much) population, it can be worth torturing a few of these to avert sufficiently many pin-pricks/minor thwarted preferences to the rest.

I also think negative leaning views (especially with stronger 'you can't do better than nothing' ones as suggested here) generally fare worse with population ethics paradoxes, as we can construct examples which not just share the core engine driving things like the repugnant conclusion, but are amplified further by adding counter-intuitive aspects of the negative view in question.

E.g. (and owed to Carl Shulman): suppose A is a vast population (say TREE(9), whatever) of people who are much happier than we are now, and live lives of almost-perfect preference satisfaction, but for a single mild thwarted preference (say they have to wait in a queue bored for an hour before they get into heaven). Now suppose B is a vast (but vastly smaller, say merely 10^100) population living profoundly awful lives. The view outlined in the OP above seems to recommend B over A (as a lot of small thwarted preferences among those in A can trade off each awful life in B), and generally that any number of horrendous lives can be outweighed if you can abolish a slightly imperfect utopia of sufficient size, which seems to go (wildly!) wrong both in the determination and the direction (as A gets larger and larger, B becomes a better and better alternative).

Comment by gregory_lewis on Reality is often underpowered · 2019-10-14T22:04:19.920Z · score: 4 (2 votes) · EA · GW

I think I would stick to my guns on the sample size point, although I think you would agree with it if I had expressed it better in the OP.

I agree with you sample sizes of 200 (or 20, or less) can be good enough depending on the context. My core claim is these contexts do not obtain for lots of EA problems: the units vary a lot, the variance is explained by other factors than the one we're interested in, and the variance explained by the intervention/factor of interest will be much smaller (i.e. high variance across units, small effect sizes).

[My intuition driving the confounders point is that balancing these doesn't look feasible if they are sufficiently heavy-tailed (e.g. take all countries starting with A-C, and randomly assign to two arms, these arms will tend to have very large differences in (say) mean GDP), and the implied premise being lots of EA problems will be ones where factors like these are expected to have greater effect than the upper bound on the intervention. I may be off-base.]
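A rough simulation can illustrate the balancing worry. This is only a sketch under assumed distributions (a normal covariate versus a log-normal one; neither is real GDP data, and `median_arm_gap` is a made-up helper): it randomly splits 30 units into two arms many times and records the typical relative gap between arm means.

```python
import random
import statistics

def median_arm_gap(draw, n_units=30, n_trials=2000, seed=0):
    """Median relative gap in arm means after randomly splitting units in two."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        units = [draw(rng) for _ in range(n_units)]
        rng.shuffle(units)  # random assignment to the two arms
        half = n_units // 2
        mean_a = statistics.fmean(units[:half])
        mean_b = statistics.fmean(units[half:])
        gaps.append(abs(mean_a - mean_b) / ((mean_a + mean_b) / 2))
    return statistics.median(gaps)

# Illustrative covariates (both distributions are assumptions, not real data).
light_tailed = median_arm_gap(lambda rng: rng.gauss(100, 10))
heavy_tailed = median_arm_gap(lambda rng: rng.lognormvariate(0, 2))
```

Under these assumptions the heavy-tailed covariate leaves the arms far less balanced than the light-tailed one at the same sample size, which is the sense in which randomisation alone may not rescue small samples.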

Comment by gregory_lewis on Reality is often underpowered · 2019-10-14T21:42:02.711Z · score: 11 (5 votes) · EA · GW

Thanks, Will!

I definitely agree we can look at qualitative data for hypothesis generation (after all, n=1 is still an existence proof). But I'd generally recommend breadth-first rather than depth-first if we're trying to adduce considerations.

For many/most sorts of policy decisions although we may find a case of X (some factor) --> Y (some desirable outcome), we can probably also find cases of ¬X --> Y and X --> ¬Y. E.g., contrasting with what happened with prospect theory, there are also cases where someone happened on an important breakthrough with much less time/effort, or where people over-committed to an intellectual dead-end (naturally, partisans of X or ¬X tend to be good at cultivating sets of case-studies which facially support the claim it leads to Y.)

I generally see getting a steer of the correlation of X and Y (so the relative abundance of (¬/)X --> (¬/)Y across a broad reference class) as more valuable than determining whether in a given case (even one which seems nearby to the problem we're interested in) X really was playing a causal role in driving Y. Problems of selection are formidable, but I take the problems of external validity to tend to be even worse (and worse enough to make the former have a better ratio of insight:resources).

Thus I'd be much more interested to see (e.g.) a wide survey of cases which suggests movements prone to in-fighting tend to be less successful than an in-depth look at how in-fighting caused the destruction of a nearby analogue to the EA community. Ditto the 'macro' in macrohistory being at least partly about trying to adduce takeaways across history, as well as trying to divine its big contours.

And although I think work like this is worthwhile to attempt, I think in some instances we may come to learn that reality is so underpowered that there's essentially no point doing research (e.g. maybe large bits of history are just ultra-chaotic, so all we can ever see is noise).

Comment by gregory_lewis on Shapley values: Better than counterfactuals · 2019-10-11T13:05:46.650Z · score: 9 (7 votes) · EA · GW

Thanks for this post. I'm also pretty enthusiastic about Shapley values, and it is overdue for a clear presentation like this.

The main worry I have is related to the first one GeorgeBridgwater notes: the values seem very sensitive to who one includes as a co-operative counterparty (and how finely we individuate them). As your example with vaccine reminders shows, different (but fairly plausible) accounts of this can change the 'raw' CE estimate by a factor of five.

We may preserve ordering among contributors if we twiddle this dial, but the more typical 'EA problem' is considering different interventions (and thus disjoint sets of counter-parties). Although typical 'EA style' CE estimates likely have expected errors in their exponent rather than their leading digit, a factor of 5 (or maybe more) which can hinge on relatively arbitrary decisions on how finely to individuate who we are working with looks pretty challenging to me.
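To make the individuation worry concrete, here is a minimal sketch (a toy game of my own, not the vaccine-reminder example from the post) in which every party is necessary for a payoff of 100. The helper names `shapley_values` and `all_needed` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}

def all_needed(players, payoff):
    """Toy game: the payoff only materialises if every player takes part."""
    full = frozenset(players)
    return lambda coalition: payoff if coalition == full else 0

# Same project, two ways of individuating the counterparty (both hypothetical).
coarse = ["us", "partner"]
fine = ["us"] + [f"partner_{i}" for i in range(1, 6)]

coarse_share = shapley_values(coarse, all_needed(coarse, 100))["us"]
fine_share = shapley_values(fine, all_needed(fine, 100))["us"]
```

In this toy game 'our' share falls from a half to a sixth of the payoff purely because the partner is counted as five necessary parties rather than one, without anything about the world changing.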

Comment by gregory_lewis on [Link] What opinions do you hold that you would be reluctant to express in front of a group of effective altruists? Anonymous form. · 2019-10-03T14:43:37.148Z · score: 39 (15 votes) · EA · GW
I think there’s issues of adversarial bias with it being fully public (e.g. people writing inaccurate/false-flag entries out of spite) and it could be better in future to do a version with Forum users with >100 karma.

Indeed. Anon open forms are maximally vulnerable to this: not only can detractors write stuff (for example, this poll did show up on reddits that are archly critical of EA etc.), but you can signal-boost your own renegade opinion if you're willing to make the trivial effort to repeatedly submit it (e.g. "I think Alice sucks and people should stop paying attention to her", "I completely agree with the comment above - Alice is just really toxic to this community", "Absolutely agreed re. Alice, but I feel I can't say anything publicly because she might retaliate against me", etc.)

Comment by gregory_lewis on Bioinfohazards · 2019-09-30T08:28:22.026Z · score: 13 (8 votes) · EA · GW

Hello Spiracular,

Is it your impression that whenever you -or talented friends in this area- come up with a reasonably-implementable good idea, that after searching around, you tend to discover that someone else has already found it and tried it?

I think this is somewhat true, although I don't think this (or the suggestions for bottlenecks in the paragraph below) quite hits the mark. The mix of considerations is something like this:

1) I generally think the existing community covers the area fairly competently (from an EA perspective). I think the main reason for this is because the 'wish list' of what you'd want to see for (say) a disease surveillance system from an EA perspective will have a lot of common elements with what those with more conventional priorities would want. Combined with the billions of dollars and lots of able professionals, even areas which are neglected in relative terms still tend to have well-explored margins.

1.1) So there are a fair few cases where I come across something in the literature that anticipates an idea I had, or of colleagues/collaborators reporting back, "It turns out people are already trying to do all the things I'd want them to do re. X".

1.2) Naturally, given I'm working on this, I don't think there are no more good ideas to have. But it also means I foresee quite a lot of the value is rebalancing/pushing on the envelope of the existing portfolio rather than 'EA biosecurity' striking out on its own.

2) A lot turns on 'reasonably-implementable'. There's a generally treacherous terrain that usually lies between idea and implementation, and propelling the former to the latter through this generally needs a fair amount of capital (of various types). I think this is the typical story for why many fairly obvious improvements haven't happened.

2.1) For policy contributions, perhaps the main challenge is buy-in. Usually one can't 'implement yourself', and rely instead on influencing the relevant stakeholders (e.g. science, industry, government(s)) to have an impact. Bandwidth is generally limited in the best case, and typical cases tend to be fraught with well-worn conflicts arising from differing priorities etc. Hence the delicateness mentioned above.

2.2) For technical contributions, there are 'up-front' challenges common to doing any sort of bio-science research (e.g. wet-labs are very expensive). However, pushing one of these up the technology readiness levels to implementation also runs into similar policy challenges (as, again, you can seldom 'implement yourself').

3) This doesn't mean there are no opportunities to contribute. Even if there's a big bottleneck further down the policy funnel, new ideas upstream still have value (although knowing what the bottleneck looks like can help one target these to have easier passage - and not backfire), and in many cases there will be more incremental work which can lay the foundation for further development. There could be a synergistic relationship where folks who are more heavily enmeshed in the existing community help translate initiatives/ideas from those less so.

Comment by gregory_lewis on Bioinfohazards · 2019-09-22T12:28:57.494Z · score: 46 (18 votes) · EA · GW

Thanks for writing the post. I essentially agree with the steers on which areas are more or less ‘risky’. Another point worth highlighting is that, given these issues tend to be difficult to judge and humans are error-prone, it can be worth running things by someone else. Folks are always welcome to contact me if I can be helpful for this purpose.

But I disagree with the remarks in the post along the lines that ‘There’s lots of valuable discussion that is being missed out on in EA spaces on biosecurity, due to concerns over infohazards’. Often - perhaps usually - the main motivation for discretion isn’t ‘infohazards!’.

Whilst (as I understand it) the ‘EA’ perspective on AI safety covers distinct issues from mainstream discussion on AI ethics (e.g. autonomous weapons, algorithmic bias), the main distinction between ‘EA’ biosecurity and ‘mainstream’ biosecurity is one of scale. Thus similar topics are shared between both, and many possible interventions/policy improvements have dual benefit: things that help mitigate the risk of smaller outbreaks tend to help mitigate the risk of catastrophic ones.

These topics are generally very mature fields of study. To put it in perspective, with ~5 years in medicine and public health and 3 degrees, I am roughly par for credentials and substantially below-par for experience at most expert meetings I attend - I know people who have worked on (say) global health security longer than I have been alive. I’d guess some of this could be put down to unnecessary credentialism and hierarchalism, and it doesn’t mean there’s nothing to do as all the good ideas have already been thought, but it does make low hanging fruit likely to have been plucked already, and it means useful contributions are hard to make without substantial background knowledge.

These are also areas which tend to have powerful stakeholders, entrenched interests, and in many cases (especially security-adjacent issues) great political sensitivity. Thus even areas which are pretty ‘safe’ from an information hazard perspective (e.g. better governance of dual-use research of concern) can be nonetheless delicate to talk about publicly. Missteps are easy to make (especially without the relevant tacit knowledge), and the consequences can be (as you note in the write-up) to inoculate the idea, but also to alienate powerful interests and potentially discredit the wider EA community.

The latter is something I’m particularly sensitive to. This is partly due to my impression that the ‘growing pains’ in other EA cause areas tended to incur unnecessary risk. It is also because the reactions of folks in the pre-existing community when contemplating EA involvement tend not to be unalloyed enthusiasm. They tend to be very impressed with my colleagues who are starting to work in the area, have an appetite for new ideas and ‘fresh eyes’, and are reassured that EAs in this area tend to be cautious and responsible. Yet despite this they tend to remain cautious about the potential to have a lot of inexperienced people bouncing around delicate areas, both in general but also for their exposure to this community in particular, as they are often going somewhat ‘out on a limb’ to support ‘EA biosecurity’ objectives in the first place.

Another feature of this landscape is that the general path to impact of a ‘good biosecurity idea’ is to socialize it in the relevant expert community and build up a coalition of support. (One could argue how efficient this is from the point of view of the universe, but it is the case regardless.) In consequence, my usual advice for people seeking to work in this area is that career capital is particularly valuable, not just for developing knowledge and skills, but also gaining the network and credibility to engage with the relevant groups.

Comment by gregory_lewis on Leverage Research: reviewing the basic facts · 2019-09-15T08:56:02.789Z · score: 55 (21 votes) · EA · GW

Hello Larissa,

I'd be eager to see anything that speaks to Leverage's past or present research activity: what have they been trying to find out, what have they achieved, and what are they aiming for at the moment (cf).

As you know from our previous conversations re. Leverage, I'm fairly indifferent to 'they're shady!' complaints (I think if people have evidence of significant wrongdoing, they should come forward rather than briefing adversely off the record), but much less so to the concern that Leverage has achieved extraordinarily little for an organisation with multiple full-time staff working for the better part of a decade. Showing something like, "Ah, but see! We've done all these things," or, "Yeah, 2012-6 was a bit of a write-off, but here's the progress we've made since", would hopefully reassure, but in any case be informative for people who would like to have a view on Leverage independent of which rumour mill they happen to end up near.

Other things I'd be interested to hear about are what you are planning to work on at Leverage, and what information you investigated which - I assume - leads to a much more positive impression of Leverage than I take the public evidence to suggest.

Comment by gregory_lewis on Leverage Research: reviewing the basic facts · 2019-09-09T17:14:32.698Z · score: 112 (44 votes) · EA · GW
As to other questions relating to Leverage, EA, funding- and attention-worthiness, etc., I’ve addressed some concerns in previous comments and I intend to address a broader range of questions later. I don’t however endorse attack posts as a discussion format, and so intend to keep my responses here brief. The issues you raise are important to a lot of people and should be addressed, so please feel free to contact me or my staff via email if it would be helpful to discuss more.

[Own views]

If an issue is important to a lot of people, private follow-ups seem a poor solution. Even if you wholly satisfy Buck, he may not be able to relay what reassured him to all concerned parties, and thus likely duplication of effort on your part as each reaches out individually.

Of course, this makes more sense as an ill-advised attempt to dodge public scrutiny - better for PR if damning criticism remains in your inbox rather than on the internet-at-large. In this, alas, Leverage has a regrettable track record: You promised 13 months ago to write something within a month to explain Leverage better, only to make a much more recent edit (cf.) that you've "changed your plans" and encourage private follow-ups rather than giving a public explanation. The pattern of 'promised forthcoming explanation that never arrives' has been going on about as long as Leverage itself (1, 2, 3).

The reason this is ill-advised is that silence is a poor salve for suspicion. If credible concerns remain publicly unanswered, people adversely infer they are likely 'on the money', and their target is staying quiet as they are rationally calculating that preserving whatever uncertainty remains still looks better than trying to contest the point. The more facially credible the concerns (e.g. Leverage has had dozens of person years and has seemed to produce extraordinarily little), and the more assiduous the attempts to avoid addressing them and obscure relevant evidence (e.g. not only taking down all old research, but doing your best to scrub any traces of it from the internet), the more persuasive the adverse inference becomes, and the more likely people are to start writing 'attack posts' [recte public criticism].

The public evidence looks damning to me. I hope it transpires that this is an unfortunate case of miscommunication and misunderstanding, and soon we shall see results that vindicate Leverage/Paradigm's efforts so far. I also hope your faith in Bennett is well-placed, that whatever mix of vices led him to write vile antisemitic ridicule on an email list called 'morning hate' in 2016 bears little relevance to the man he was when with Leverage in ~2018, or the man he is now.

But not all hopes are expectations.

Comment by gregory_lewis on Are we living at the most influential time in history? · 2019-09-03T14:56:17.815Z · score: 54 (27 votes) · EA · GW

Excellent work; some less meritorious (and borderline repetitious) remarks:

1) One corollary of this line of argument is that even if one is living at a 'hinge of history', one should not reasonably believe this, given the very adverse prior and the likely weak confirmatory evidence one would have access to.

2) The invest for the future strategy seems to rely on our descendants improving their epistemic access to the point where they can reliably determine whether they're at a 'hinge' or not, and deploying resources appropriately. There are grounds for pessimism about this ability ever being attained. Perhaps history (or the universe as a whole) is underpowered for these inferences.

3) That said, with the benefit of hindsight over previous times we could assess the distribution of hingeyness/influence across these, and so get a steer as to whether we should think there are hingey periods of vastly outsized influence in the first place.

4) If we grant the ground truth is occasional 'crucial moments', but we expect evidence at-the-time for living in one of these is scant, my intuition is the optimal strategy would be to husband resources to spend these disproportionately when the evidence gives some (but not decisive) indication one of these crucial moments is now.

Depending on how common these 'probably false alarms' are (plus things like how reliably can we steward resources for long periods of time), this might amount to monomaniacal work on immediate challenges. E.g., the prior is (say) 1/million this decade, but if the evidence suggests it is 1%, perhaps we should drop everything to work on it, if we won't expect our credence to be this high again for another millennium.

5) Minor: Although partly priced in to considerations about how 'early' we are, there are also issues of conditional dependence. If extinction risk is 1% this century but 10% the next, one should probably spend somewhat disproportionately on the first one (and other cases where getting access to a 'bigger hinge' relies on going the right way on an earlier, smaller, one).
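The prior-versus-evidence arithmetic behind points 1 and 4 can be sketched with the odds form of Bayes' rule. The 1-in-a-million prior and the 1% credence are the figures from point 4; the 10:1 likelihood ratio is an assumed stand-in for 'weak confirmatory evidence', and the function names are hypothetical.

```python
def posterior(prior, bayes_factor):
    """Update a probability by a likelihood ratio, working in odds."""
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * bayes_factor
    return post_odds / (1 + post_odds)

def bayes_factor_needed(prior, target):
    """Likelihood ratio required to move `prior` up to `target`."""
    return (target / (1 - target)) / (prior / (1 - prior))

prior = 1e-6                                # the 1-in-a-million prior from point 4
weak = posterior(prior, 10)                 # assumed 10:1 evidence: still about 1e-5
needed = bayes_factor_needed(prior, 0.01)   # roughly 10,000:1 to reach 1%
```

On these numbers, evidence has to be on the order of ten thousand times likelier under 'this is a hinge' than under 'it is not' before a 1% credence is warranted.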

Comment by gregory_lewis on Movement Collapse Scenarios · 2019-08-29T21:28:15.145Z · score: 3 (2 votes) · EA · GW

I'd correct for attenuation, as we care more about getting the people who in fact will perform the best, rather than those who will seem like they are performing the best by our imperfect measurement.

Also selection procedures can gather other information (e.g. academic history, etc.) which should give incremental validity over work samples. I'd guess this should boost correlation, but there are countervailing factors (e.g., range restriction).
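For reference, the standard (Spearman) correction for attenuation divides the observed correlation by the square root of the product of the two measures' reliabilities. A minimal sketch, with made-up numbers and a hypothetical `disattenuate` helper, correcting only for unreliability in the performance measure:

```python
import math

def disattenuate(r_observed, rel_predictor=1.0, rel_criterion=1.0):
    """Spearman's correction: r_observed / sqrt(rel_predictor * rel_criterion)."""
    if not (0 < rel_predictor <= 1 and 0 < rel_criterion <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_predictor * rel_criterion)

# Hypothetical numbers: observed validity 0.30, criterion reliability 0.36.
corrected = disattenuate(0.30, rel_criterion=0.36)  # 0.30 / 0.6
```

Leaving `rel_predictor` at 1.0 reflects the point above: the selection test is used as it is, noise and all, whereas the noise in measured performance is what we want to see past.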

Comment by gregory_lewis on EA Mental Health Survey: Results and Analysis. · 2019-06-13T08:10:19.087Z · score: 22 (12 votes) · EA · GW

Thanks for this. A statistical note:

As best as I can tell, linear regressions are used throughout. This is not an appropriate method when the dependent variable is binary (e.g. 'mentally ill or not'). Logistic regression should be used instead - and may find some further associations, given linear methods will intuitively be underpowered.
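A small self-contained sketch of one symptom of the mismatch, using simulated data rather than the survey data (the true coefficients, sample size and seed are all assumptions): a straight-line fit to a binary outcome can return 'probabilities' above 1, whereas a logistic fit cannot. It does not by itself demonstrate the power point.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (one predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def fit_logistic(xs, ys, iterations=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1 - p)
            g0 += y - p          # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w             # (negative) Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Simulated data (an assumption for illustration, not the survey data).
rng = random.Random(0)
xs = [rng.uniform(-3, 3) for _ in range(500)]
ys = [1 if rng.random() < sigmoid(0.5 + 1.5 * x) else 0 for x in xs]

lin_a, lin_b = fit_linear(xs, ys)
log_a, log_b = fit_logistic(xs, ys)
linear_at_3 = lin_a + lin_b * 3             # a "probability" from the linear fit
logistic_at_3 = sigmoid(log_a + log_b * 3)  # always strictly between 0 and 1
```

In practice one would use an established statistics package rather than a hand-rolled fit; the hand-rolled version is only there to keep the example dependency-free.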