Posts

David Manheim: A Personal (Interim) COVID-19 Postmortem 2020-07-01T06:05:59.945Z · score: 28 (11 votes)
I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA 2020-06-30T19:35:13.376Z · score: 80 (37 votes)
Are there historical examples of excess panic during pandemics killing a lot of people? 2020-05-27T17:00:29.943Z · score: 28 (14 votes)
[Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? 2020-04-07T01:49:05.770Z · score: 16 (7 votes)
Should recent events make us more or less concerned about biorisk? 2020-03-19T00:00:57.476Z · score: 22 (9 votes)
Are there any public health funding opportunities with COVID-19 that are plausibly competitive with Givewell top charities per dollar? 2020-03-12T21:19:19.565Z · score: 25 (13 votes)
All Bay Area EA events will be postponed until further notice 2020-03-06T03:19:24.587Z · score: 25 (12 votes)
Are there good EA projects for helping with COVID-19? 2020-03-03T23:55:59.259Z · score: 31 (17 votes)
How can EA local groups reduce likelihood of our members getting COVID-19 or other infectious diseases? 2020-02-26T16:16:49.234Z · score: 23 (15 votes)
What types of content creation would be useful for local/university groups, if anything? 2020-02-15T21:52:00.803Z · score: 6 (1 votes)
How much will local/university groups benefit from targeted EA content creation? 2020-02-15T21:46:49.090Z · score: 24 (11 votes)
Should EAs be more welcoming to thoughtful and aligned Republicans? 2020-01-20T02:28:12.943Z · score: 31 (15 votes)
Is learning about EA concepts in detail useful to the typical EA? 2020-01-16T07:37:30.348Z · score: 42 (22 votes)
8 things I believe about climate change 2019-12-28T03:02:33.035Z · score: 58 (36 votes)
Is there a clear writeup summarizing the arguments for why deep ecology is wrong? 2019-10-25T07:53:27.802Z · score: 11 (6 votes)
Linch's Shortform 2019-09-19T00:28:40.280Z · score: 8 (2 votes)
The Possibility of an Ongoing Moral Catastrophe (Summary) 2019-08-02T21:55:57.827Z · score: 44 (23 votes)
Outcome of GWWC Outreach Experiment 2017-02-09T02:44:42.224Z · score: 14 (16 votes)
Proposal for a Pre-registered Experiment in EA Outreach 2017-01-08T10:19:09.644Z · score: 11 (11 votes)
Tentative Summary of the Giving What We Can Pledge Event 2015/2016 2016-01-19T00:50:58.305Z · score: 7 (7 votes)
The Bystander 2016-01-10T20:16:47.673Z · score: 5 (5 votes)

Comments

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-05T08:52:21.822Z · score: 2 (1 votes) · EA · GW
How is your experience acquiring expertise in forecasting similar to or different from acquiring expertise in other domains, e.g. obscure board games? How so?

Just FYI, I do not consider myself an "expert" on forecasting. I haven't put my 10,000 hours in, and my inside view is that there's so much ambiguity and confusion about so many different parameters. I also basically think judgmental amateur forecasting is a nascent field and there are very few experts[1], with the possible exception of the older superforecasters. Nor do I actually think I'm an expert in those games, for similar reasons. I basically think "amateur, but first (or 10th, or 100th, as the case might be) among equals" is a healthier and more honest presentation.

That said, I think the main commonalities for acquiring skill in forecasting and obscure games include:

  • Focus on generalist optimization for a well-specified score in a constrained system
    • I think it's pretty natural for both humans and AI to do better in more limited scenarios.
    • However, I think in practice, I am much more drawn to those types of problems than my peers (eg I have a lower novelty instinct and I enjoy optimization more).
  • Deliberate practice through fast feedback loops
    • Games often have feedback loops on the order of tens of seconds/minutes (Dominion) or hundreds of milliseconds/seconds (Beat Saber)
    • Forecasting has slower feedback loops, but often you can form an opinion in <30 minutes (sometimes <3 if it's a domain you're familiar with), and have it checked in a few days.
    • In contrast, the feedback loops for other things EAs are interested in are often much slower. For example, a research project might take months to do and years to check, while architecture decisions in software engineering might take days to make and weeks to check (and sometimes they are never checked)
  • Focus on easy problems
    • For me personally, it's often easier to get "really good" in less-contested domains than kinda good in very contested domains
      • For example, in Beat Saber, rather than trying hard to beat the harder songs, I spent most of my practice time on getting very high scores on the easier songs
      • In forecasting, this meant that making covid-19 forecasts 2-8 weeks out was more appealing than making geopolitical forecasts on the timescale of years, or technological forecasts on the timescale of decades
    • This allowed me to slowly and comfortably move into harder questions
      • For example, now I have more confidence and internal models for predicting covid-19 questions multiple months out.
      • If I were to get back into Beat Saber, I'd be a lot less scared of the harder songs than I used to be (after some time ramping back up).
    • I do think not being willing to jump into harder problems directly is something of a character flaw. I'd be interested in hearing other people's thoughts on how they do this.

The main difference, to me, is that:

  • Forecasting relies on knowledge of the real world
    • Unlike in games (and, for that matter, programming challenges), the "system" you're forecasting on is usually much more unbounded.
    • So knowledge acquisition and value-of-information are much more important per question
    • This is in contrast to games, where knowledge acquisition is important on the "meta-level," but for any specific game:
      • balancing how much knowledge you need to acquire is pretty natural/intuitive, and
      • you probably don't need much new knowledge anyway.

[1] For reasons I might go into later in a different answer

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-05T07:40:58.763Z · score: 20 (5 votes) · EA · GW
What do you think helps make you a better forecaster than the other 989+ people?

I'll instead answer this as:

What helps you have a higher rating than most of the people below you on the leaderboard?
  • I probably answered more questions than many of them.
  • I updated my forecasts more quickly than many of them, particularly in March and April
    • Activity has consistently been shown to be one of (often, the) strongest predictors of overall accuracy in the academic literature.
  • I suspect I have a much stronger intuitive sense of probability/calibration.
    • For example, 17% (1:5) intuitively feels very different to me than 20% (1:4), and my sense is that this isn't too common (see the odds-conversion sketch after this list)
    • However, this could just be arrogance; there isn't enough data for me to actually check this for actual predictions (as opposed to just calibration games)
  • I feel like I actually have lower epistemic humility compared to most forecasters who are top 100 or so on Metaculus. "Epistemic humility" defined narrowly as "willingness to make updates based on arguments I don't find internally plausible just because others believed them."
    • Caveat is that I'm making this comparison solely to top X% (in either activity or accuracy) forecasters.
      • I suspect a fair number of other forecasters are just wildly overconfident (in both senses of the term)
      • Certainly, non-forecasters (TV pundits, say, or just people I see on the internet) frequently seem very overconfident for what seems to me like bad reasons.
  • I'm a pretty competitive person, and I care about scoring well.
    • This might be surprising, but I think a lot of forecasters don't.
    • Some forecasters just want to record their predictions publicly and be held accountable to them, or want to cultivate more epistemic humility by seeing themselves be wrong
      • I think these are perfectly legitimate uses of forecasting, and I actively encourage my friends to use Metaculus and other prediction platforms to do this.
      • However, it should not be surprising that people who want to score well end up on average scoring better.
    • So I do a bunch of things like meditate on my mistakes and try really hard to do better. I think most forecasters, including good ones, do this much less than I do.
  • I know more facts about covid-19.
    • I think the value of this is actually exaggerated, but it probably helps a little.
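As a quick illustration of the odds framing above, here is a minimal sketch (the conversion formula is just odds-against = (1 − p)/p; the example probabilities are the ones mentioned):

```python
def prob_to_odds_against(p: float) -> float:
    """Convert a probability into odds against the event: (1 - p) / p."""
    return (1 - p) / p

for p in (0.17, 0.20, 0.25):
    print(f"p = {p:.0%} -> roughly 1:{prob_to_odds_against(p):.1f} odds")

# p = 17% -> roughly 1:4.9 odds
# p = 20% -> roughly 1:4.0 odds
# p = 25% -> roughly 1:3.0 odds
```

Framed as odds, the gap between 17% and 20% (about 1:5 vs 1:4) is easier to feel than the 3-percentage-point gap suggests.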

_____

What do you think other forecasters do to make them have a higher rating than you? [Paraphrased]

Okay, a major caveat here is that I think there is plenty of heterogeneity among forecasters. Another is that I obviously don't have clear insight into why other forecasters are better than me (otherwise I'd have done better!) However, in general I'm guessing they:

  • Have more experience with forecasting.
    • I started in early March, and I think many of them had already been forecasting for a year or more (some 5+ years!).
    • I think experience probably helps a lot in building intuition and avoiding a lot of subtle (and not-so-subtle!) mistakes.
  • They usually forecast more questions.
    • It takes me some effort to forecast on new questions, particularly if the template is different from questions I've forecasted on before and they aren't something I've thought about in a non-forecasting context.
    • I know some people in the Top 10 literally forecast all questions on Metaculus, which seems like a large time commitment to me.
  • They update forecasts more quickly than me, particularly in May and June.
    • Back in March and April, I was *super* "on top of my game." But right now I have a backlog of old predictions; I'm >30 days behind on the earliest one (as in, the last time I updated that prediction was 30+ days ago).
    • This is partially due to doing more covid forecasting at my day job, partially due to having some other hobbies, and partially due to general fatigue/loss of interest (akin to others' lockdown fatigue)
  • On average, they're more inclined to do simple mathematical modeling (Guesstimate, Excel, Google Sheets, Foretold, etc), whereas personally I'm often (not always) satisfied with a few jotted notes in a Google Doc plus a simple arithmetic calculator (see the toy sketch below for the kind of modeling I mean).
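To make that last bullet concrete, here is a minimal sketch of the kind of simple Monte Carlo model a spreadsheet/Guesstimate user might build. Every input range below is invented for illustration; it is not a model from the AMA:

```python
import math
import random

random.seed(0)

def one_sample() -> float:
    """One draw from a toy 'deaths over the next month' model (all inputs invented)."""
    current_infections = random.lognormvariate(math.log(2e6), 0.5)  # uncertain level
    weekly_growth = random.uniform(0.9, 1.3)                        # multiplicative per week
    ifr = random.uniform(0.005, 0.015)                              # infection fatality rate
    infections_next_month = current_infections * sum(weekly_growth ** w for w in range(1, 5))
    return infections_next_month * ifr

samples = sorted(one_sample() for _ in range(10_000))
print(f"median ~{samples[5_000]:,.0f} deaths, "
      f"80% interval ~({samples[1_000]:,.0f} to {samples[9_000]:,.0f})")
```

The point is not the specific numbers but the habit: propagate a few uncertain inputs through simple arithmetic instead of anchoring on a single point estimate.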

There are also more specific reasons some other forecasters are better than me, though I don't think they apply to all or even most of the forecasters ranked above me:

  • JGalt seems to read the news both more and more efficiently than I do, and probably knows much more factual information than me.
    • In particular, I recall many times where I see interesting news on Twitter or other places, want to bring it to Metaculus, and bam, JGalt has already linked it ahead of me.
      • It's practically a running meme among Metaculus users that JGalt has read all the news.
  • Lukas Gloor and Pablo Stafforini plausibly have stronger internal causal models of various covid-19 related issues.
  • datscilly often decomposes questions more cleanly than me, and (unlike me and several other forecasters), appears to aggressively prioritize not updating on irrelevant information.
    • He also cares about scores more than I do.
  • I think Pablo, datscilly, and some others started predicting on covid-19 questions almost as soon as the pandemic started, so they built up more experience than me not only in general forecasting, but also in forecasting covid-19 related questions specifically.

At least, this is what I can gather from their public comments and (in some cases) private conversations. It's much harder for me to tell how the forecasters who are higher than me on the leaderboard but otherwise mostly silent think.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-04T00:52:02.674Z · score: 5 (2 votes) · EA · GW

I mostly just forecasted the covid-19 questions on Metaculus directly. I do think predicting covid early on (before May?) was a near-ideal epistemic environment for this, because of various factors: it was

a) important,
b) in a weird social epistemic state where lots of disparate, individually easy-to-understand, true information was out there,
c) also full of false information,
d) subject to very fast feedback loops, and
e) an area where predicting things/truth-seeking was shockingly uncompetitive.

The feedback cycles (maybe several times a week for some individual questions) are still slower than what the deliberate practice research focused on (specific techniques in arts and sports with sub-minute feedback). But it's much, much better than for other plausibly important things.

I probably also benefited from practice through the South Bay EA meetups[1] and the Open Phil calibration game[2].

[1] If going through all the worksheets is intimidating, I recommend just trying this one (start with "Intro to forecasting" and then do the "Intro to forecasting" worksheet). EDIT 2020/07/04: Fixed worksheet.

[2] https://www.openphilanthropy.org/blog/new-web-app-calibration-training

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T07:23:56.423Z · score: 14 (5 votes) · EA · GW
If you look at your forecasting mistakes, do they have a common thread?

A botched Tolstoy quote comes to mind:

Good forecasts are all alike; every mistaken forecast is wrong in its own way

Of course that's not literally true. But when I reflect on my various mistakes, it's hard to find a true pattern. To the extent there is one, I'm guessing that the highest-order bit is that many of my mistakes are emotional rather than technical. For example,

  • doubling down on something in the face of contrary evidence,
  • or at least not updating enough because I was arrogant,
  • getting burned that way and then updating too much from minor factors
  • "updating" from a conversation because it was socially polite to not ignore people rather than their points actually being persuasive, etc.

If the emotion hypothesis is true, then to get better at forecasting, the most important thing might well be looking inwards, rather than, say, a) learning more statistics or b) acquiring more facts about the "real world."

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T07:15:41.326Z · score: 2 (1 votes) · EA · GW

Okay, I answered some questions! All the questions are great, keep them coming!

If you have a highly upvoted question that I have yet to answer, then it's because I thought answering it was hard and I need to think more before answering! But I intend to get around to answering as many questions as I can eventually (especially highly upvoted ones!)

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T07:13:30.326Z · score: 3 (2 votes) · EA · GW

Yeah, I think Tara Kirk Sell mentioned this on the 80k podcast. I mostly agree, with the minor technical caveat that if you're asking people to forecast numerical questions, getting the ranges exactly right matters more when you have buckets (like in the JHU Disease Prediction Project that Tara ran, and Good Judgement 2.0), whereas asking people to forecast a distribution (like on Metaculus) lets the question asker be more agnostic about ranges. The specific thing I would agree with is something like:

at current margins, getting useful forecasts out is more bottlenecked by skill in question operationalization than by judgmental forecasting skill.

I think other elements of the forecasting pipeline plausibly matter even more, which I talked about in my answer to JP's question.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T06:47:42.610Z · score: 2 (1 votes) · EA · GW

I actually answered this before, on the meme page:

The gut instinct answer is 100 duck-sized horses, because ducks are much scarier than horses. But one of the things that being a page admin of counterintuitive philosophical problems has taught us is that sometimes we can’t always go with our gut. Here, for example, a horse-sized duck, while very intimidating and scary looking, is probably not structurally sound enough to stand upright for long, and we can probably escape it and let it collapse under its own weight. In contrast, 100 duck-sized horses wouldn’t be much weaker than 100 normal-sized horses (https://supersonicman.wordpress.com/2011/11/13/the-square-cube-law-all-animals-jump-the-same-height/), and they’ll definitely have scary kicks.

I still stand by this. Maybe 85% that I can win against the duck, and 20% against the horses? Depends a lot on the initial starting position, of course.
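For readers who want the square-cube law spelled out, here is a rough back-of-the-envelope sketch (the animal heights are my approximations, not figures from the linked post):

```python
# Square-cube law: scale linear size by k, and mass grows roughly as k^3
# while muscle/bone cross-section (strength) grows only as k^2.

duck_height_m = 0.4    # rough standing height of a mallard (approximation)
horse_height_m = 1.6   # rough height of a horse at the shoulder (approximation)
k = horse_height_m / duck_height_m

print(f"scale factor k        = {k:.0f}")
print(f"mass multiplier  k^3  = {k**3:.0f}x")
print(f"strength mult.   k^2  = {k**2:.0f}x")
print(f"strength-to-weight    = {1/k:.2f}x the original duck's")
```

A horse-sized duck would be ~64x heavier but only ~16x stronger, i.e. a quarter of the duck's strength-to-weight ratio, which is the structural-soundness point in the quoted answer.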

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T06:44:48.133Z · score: 9 (3 votes) · EA · GW

No. At a high level, I don't think I'm that good at forecasting, and beating the Efficient Market Hypothesis at day-to-day investing is a really high bar. Also, finding financial market inefficiencies is very much not neglected, so even if by some miracle I discovered some small inefficiency, I doubt the payoff would be worth it relative to finding more neglected things to forecast on.

At a lower level, the few times I actually attempted to forecast economic indicators, I did much worse than even I expected. For example, I didn't predict the May jobs rally, and I'm also still pretty confused about why the S&P 500 is so high now.

I don't think it's impossible for EAs to sometimes predictably beat the stock market without intense effort. However, the way to do this isn't by doing the typical forecaster thing of having a strong intuitive sense of probabilities and doing the homework (because that's the bare minimum that I assume everybody in finance has).

Rather, I think the thing to maybe focus on is that EAs and adjacent communities in a very real sense "live in the future." For example, I think covid and the rise of Bitcoin were both moderately predictable way earlier than the stock market caught on (in Bitcoin's case, not that it would definitely take off, but it would have been reasonable to assign >1% chance of it taking off), and in fact both were predicted by people in the community. So we're maybe really good, in relative terms, at having an interdisciplinary understanding of discontinuities/black swans that only touch finance indirectly.

The financial world will be watching for the next pandemic, but maybe the next time we see the glimmers of something real and big on the horizon (localized nuclear war, AI advances, some large technological shift, something else entirely?), we might be able to act fast and make a lot of (in expectation) money. Or at least lose less money by encouraging our friends and EAs with lots of financial assets to de-risk at important moments.

Anyway, the main thing I believe is something like

you can maybe spot a potential EMH violation once a decade, so you gotta be ready to pull the trigger when that happens (but also have enough in reserve to weather being wrong)

This looks very different from normal "investing" since almost all of the time your money just sits in normal financial assets until you need to pull it out.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T06:18:00.149Z · score: 18 (10 votes) · EA · GW

Yes, I think feeling like you don't belong is just a pretty normal part of being human! So from the outside view, this should very much be expected.

But specifically:

  • I think of myself as very culturally Americanized (or perhaps more accurately EA + general "Western internet culture"), so I don't really feel like I belong among Chinese people anymore. However, I also have a heavy (Chinese) accent, so I think I'm not usually seen as "one of them" among Americans, or perhaps Westerners in general.
    • I mitigate this a lot by hanging out in a largely international group. I also strongly prefer written communication to talking, especially with strangers, likely in large part for this reason (though it's usually not conscious).
    • I also keep meaning to train my accent, but get lazy about it.
  • I think most EAs are natives to, for want of a better word, "elite culture":
    • eg they went to elite high schools and universities,
    • they may have done regular travel in the summers,
    • most but not all of them had rich or upper-class parents even by Western standards
    • Some of them go to Burning Man and take recreational drugs
    • Some of them are much more naturally comfortable with navigating certain ambiguities of elite culture
      • This is modulated somewhat by the fairly high rates of neurodiversity in our community.
  • In contrast, I think of myself as very much an immigrant to elite culture.
  • I felt the same way at Google, because I didn't have an elite college background and also didn't major in CS.
  • I think elite culture in general, and very much EA specifically, portrays itself as aggressively meritocratic. This isn't entirely true, but the portrayal largely reflects the underlying reality. So I feel pretty bad that I don't perceive myself as doing things that are "actually impactful", and also (to a lesser extent) that I'm not working as hard as the EAs I look up to, and producing much less.
    • There are exceptions and I work hard for things I get very obsessed by, eg. forecasting.
  • I used to culturally be much more of a gamer, but it's hard for me to identify that way anymore except to people who don't game.
    • I still play games, but (this is hard to describe well) I don't feel like I'm culturally a gamer anymore.
    • I'm not sure how to describe it, but I feel like it's something about gamers usually either
      • tie a lot of their identity to gaming and game a lot, or
      • tie a lot of their identity to gaming and game when they can, but have large family or work commitments that make it infeasible to game a lot.
    • And I guess I feel like neither really applies to me, exactly?
    • Even though I think this is for the best, I'm still somewhat sad about losing this part of my identity. Board games with EAs help scratch the urge to play games, but not the culture.
  • I feel like there's a sense in which, even though I haven't done much original research in it, I've contributed enough secondary ideas/communication to philosophy, and I understand enough academic philosophy, that I weakly should be on the edges of that community. However, I don't think the philosophers feel this way! 😅
    • Which is not to say I don't have friends in philosophy, or that they don't respect me, to be clear!
  • Surprisingly, I do not recall feeling like an imposter in the forecasting community.

But at a higher level, I definitely feel like all the communities I think of myself as fully part of (internet culture, college, EA, workplaces, amateur forecasting, etc.) have largely accepted me with open arms, and I'm grateful for that. I also think my emotions are reasonably calibrated here, and I don't have an undue level of imposter syndrome.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T05:35:43.468Z · score: 10 (3 votes) · EA · GW

My guess is to just forecast a lot! The most important part is probably just practicing a lot and evaluating how well you did.

Beyond that, my instinct is that the closer you can get to deliberate practice, the more you can improve. My guess is that there are multiple desiderata that are hard to satisfy all at once, so you do have to make some tradeoffs between them.

  • As close to the target domain of what you actually care about as possible. For example, if you care about having accurate forecasts on which psychological results are true, covid-19 tournaments or geo-political forecasting are less helpful than replication markets.
  • Can answer lots of questions and have fast feedback loops. For example, if the question you really care about is "will humans be extinct by 3000 AD?" you probably want to answer a bunch of other short term questions first to build up your forecasting muscles to actually have a better sense of these harder questions.
  • Can initially be easy to evaluate well. For example, if you want to answer "will AI turn out well?" it might be helpful to answer a bunch of easy-to-evaluate questions first and grade them.

In case you're not aware of this, I think there's also some evidence that calibration games, like Open Phil's app, are pretty helpful.

Being meta-cognitive and reflective of your mistakes likely helps too.

In particular, beyond just calibration, you want a strong internal sense of when and how much to update your forecasts based on new information. If you update too much, that's probably evidence your beliefs should be closer to the mean (if you went from 20% to 80% to 20% to 75% to 40% in one day, you probably didn't really believe it was 20% to start with). If you update too little, then maybe your bar of evidence for changing your mind is too high.
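A minimal sketch of the consistency property behind that paragraph: for a rational Bayesian, the current probability equals the expected value of the next probability (beliefs form a martingale), so sustained large swings back and forth suggest over-updating. The coin-flip model below is a toy example, not anything from the AMA:

```python
import random

random.seed(1)

def bayes_update(p: float, saw_heads: bool, lik: float = 0.6) -> float:
    """Posterior P(coin is heads-biased) after one flip.
    lik = P(heads | biased); the fair alternative has P(heads) = 0.5."""
    if saw_heads:
        return p * lik / (p * lik + (1 - p) * 0.5)
    return p * (1 - lik) / (p * (1 - lik) + (1 - p) * 0.5)

drifts = []
for _ in range(20_000):
    p = 0.20                              # stated starting belief
    biased = random.random() < p          # world drawn from that prior
    heads = random.random() < (0.6 if biased else 0.5)
    drifts.append(bayes_update(p, heads) - p)

print(f"average one-step belief change: {sum(drifts) / len(drifts):+.4f}")  # ~ +0.0000
```

Individual updates move you up or down, but they average out to zero; a path like 20% → 80% → 20% → 75% → 40% in one day has far too much mean-reversion for the 20% to have been a real belief.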

What did you do to hone your skill?

Before I started forecasting seriously, I attended several forecasting meetups that my co-organizer of South Bay Effective Altruism ran. Maybe going through the worksheets will be helpful here?

One thing I did that was pretty extreme was that I very closely followed a lot of forecasting-relevant details of covid-19. I didn't learn a lot of theoretical epidemiology, but when I was most "on top of things" (I think around late April to early May), I was basically closely following the disease trajectory, policies, and data ambiguities of ~20 different countries. I also read pretty much every halfway decent paper on covid-19 fatality rates that I could find, and skimmed the rest.

I think this is really extreme and I suspect very few forecasters do it to that level. Even I stopped trying to keep up because it was becoming too much (and I started forecasting narrower questions professionally, plus had more of a social life). However, I think it is generally the case that forecasters know quite a lot of specific details of the thing they're forecasting: nowhere near as much as subject matter experts, but with a lot more focus on the forecasting-relevant details, as opposed to grand theories or interesting frontiers of research.

That being said, I think it's plausible a lot of this knowledge is a spandrel and not actually that helpful for making forecasts. This answer is already too long, but I might go into more detail about why I believe factual knowledge is a little overrated in other answers.

I also think that by the time I started forecasting seriously, I probably started with a large leg up because (as many of you know) I spend a lot of my time arguing online. I highly doubt it's the most effective way to train forecasting skills (see the first bullet point), and I'm dubious it's a good use of time in general. However, if we ignore efficiency, I definitely think the way I argued/communicated was a decent way to train above-average general epistemology and understanding of the world.

Other forecasters often have backgrounds (whether serious hobbies or professional expertise) in things that require or strongly benefit from having a strong intuitive understanding of probability. Examples include semi-professional poker, working in finance, data science, some academic subfields (eg AI, psychology) and sometimes domain expertise (eg epidemiology).

It is unclear to me how much of this is selection effects vs training, but I suspect that at this stage, a lot of the difference in forecasting success (>60%?) is explainable by practice and training, or just literally forecasting a lot.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T04:57:29.852Z · score: 2 (1 votes) · EA · GW

There's a bunch going on here, but roughly speaking I think there are at least two things:

  • When people think of "success from judgmental forecasting", they usually think of a narrow thing that looks like the end product of the most open part of what Metaculus and Good Judgement .* does: coming up with good answers to specific, well-defined and useful questions. But a lot of the value of forecasting comes before and after that.
  • Even in the near-ideal situation of specific, well-defined forecasts, there are often metrics other than pure accuracy (beyond a certain baseline) that matter more.

For the first point, Ozzie Gooen (and I'm sure many other people) has thought a lot more about this. But my sense is that there's a long pipeline of things that make a forecast actually useful for people:

  • Noticing quantifiable uncertainty. I think a lot of the value of forecasting comes from the pre-question-operationalization stage. This is being able to recognize that something relevant to a practical decision that you (or a client, or the world) rely on is a) uncertain and b) reasonably quantifiable. We don't recognize a lot of our assumptions as assumptions, or the uncertainty is not crisp enough for us to see it as a question we can ask others.
  • Data collection. Not sure where this fits in the pipeline, but precise forecasts of the future are often contextualized by the relevant data you already have.
  • Question operationalization. This is what William Kiely's question is referring to, which I'll answer more in detail there. But roughly, it's making your quantifiable uncertainty into a precise, well-defined question that can be evaluated and scored later.
  • Actual judgmental forecasting. This is mostly what I did, and what the leaderboards are ranked on, and what people think about when they think about "forecasting."
  • Making those forecasts useful. If this is for yourself, it's usually easier in some sense. If it's for the "world" or the "public," making forecasts useful often entails clear communication and marketing/advertising the forecasts so they can be taken up by relevant decision-makers (even if it's just individuals). If it's for a client, then this involves working closely with the client to make sure they understand both your forecasts and their relevant implications, as well as possibly "going back to the drawing board" if the questions you thought were operationalized well turn out not to be useful for the client.
  • Evaluation. Usually, if the earlier steps are done well, this is easy because the question is set up to be easy to evaluate. That said, there are tradeoffs here. For example, if people trust you to evaluate forecasts well, you can afford to cut corners and thus expand the range of what is "quantifiable", or start with worse question operationalizations and still deliver value.

For the second point, accuracy often trades off against other things. For example, cost-effectiveness and interpretability may matter more for clients.

If you spend a lot of time drilling down into a few questions, your forecasts are more "expensive" (both literally and figuratively) per question, and you will not be able to provide as much value in total. For interpretability, often a bare number is not that helpful for clients, in the sense of both the literal clients you directly work with and the world.

One thing that drives this point home to me is the existing "oracles" we have, like the stock market. There's a sense in which the stock market is extremely accurate (for example, options are mostly "correctly" priced for future prices), but for many of our non-financial decisions it takes a LOT of effort to interpret which signals the market sends are relevant to our future decisions, like how scared we should be of a future recession or large-scale famine.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T03:23:52.562Z · score: 9 (2 votes) · EA · GW

Tl;dr: In the short run (a few weeks), seroprevalence; in the medium run (months), behavior. In the long run, likely behavior as well, but other factors like wealth and technological access might start to dominate in hard-to-predict ways.

Thanks for the question! When I made this AMA, I was worried that all the questions would be about covid. Since there’s only one, I might as well devote a bunch of time to it.

There are of course factors other than those three, unless you stretch "behavior" to be maximally inclusive. For example, having a large family in a small house means it's a lot harder to control disease spread within the home (in-house physical distancing is basically impossible if 7 people live in the same room). Density (population-weighted) more generally probably means it's harder to control disease spread. One large factor is state capacity, which I operationalize roughly as "to the extent your gov't can be said to be a single entity, how much can it carry out the actions it wants to carry out." Poverty and sanitation norms more generally likely matter a lot, though I haven't seen enough data to be sure. Among high-income countries, I also will not be surprised if within-country inequality is a large factor, though I am unsure what the causal mechanism would be.

On the timescale you need to think about for prioritizing hospital resources and other emergency measures, aka "the short run" of, say, a few weeks, seroprevalence of the virus (how many people are infected and infectious) dominates by a very large margin. There's so much we still don't know about how the disease spreads, so I think (~90%) by far the most predictive factors for how many cases there will be in a few weeks are high-level questions like how many people are currently infected and what the current growth rate is, with a few important caveats, like noting that confirmed infections definitely do NOT equal active infections.
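A toy illustration of why current prevalence plus growth rate dominate at this timescale; all numbers below are invented:

```python
# A few weeks of compounding swamps most other short-run factors.
current_active = 50_000
for weekly_growth in (0.8, 1.0, 1.4):   # shrinking / flat / growing
    in_3_weeks = current_active * weekly_growth ** 3
    print(f"weekly x{weekly_growth}: ~{in_3_weeks:,.0f} active cases in 3 weeks")

# weekly x0.8: ~25,600 | x1.0: ~50,000 | x1.4: ~137,200
```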

In the medium run (2+ months), I think (~85%) this is almost entirely driven by behavior, at least if I had to choose between {current prevalence, behavior, seasonality}: both governmental actions (test-and-trace policies, school closures, shutting large events) and individual responses (compliance, general paranoia, voluntary social distancing, personal mask usage). This is especially clear to me when I compare the trajectories of countries in Latin America to ones in (especially East) Asia. In March and April, there was not a very large seasonality difference, wealth levels were similar, household sizes weren't that different, and East Asia started with much higher seroprevalence, but through a combination of governmental interventions and individual behaviors, the end of April looked very different for Latin American countries and Asian ones.

Seasonality probably matters. I tried studying how much it matters and got pretty confused. My best guess is ~25% reduction in Rt (with high uncertainty), so maybe it matters a lot in relative terms compared to specific interventions (like I wouldn’t be surprised if it’s a bigger deal than a single intervention like going from 20% to 70% cloth mask coverage, or university closures, or 50% increase in handwashing, or banning public events larger than N people), but I’d be very surprised if it’s bigger than the set of all behaviors. In the short run seasonality will be a lot smaller than the orders of magnitude differences in current prevalence, and in the long run seasonality is significantly smaller than behavioral change.
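To make the "~25% reduction in Rt" guess concrete, here is the arithmetic (the illustrative Rt values are mine, not from any source):

```python
# The same relative reduction matters enormously near the Rt = 1 threshold
# and much less far from it.
for rt in (2.5, 1.3, 1.1):
    rt_reduced = rt * 0.75  # the ~25% seasonal reduction guessed above
    regime = "still growing" if rt_reduced > 1 else "now shrinking"
    print(f"Rt {rt} -> {rt_reduced:.2f} ({regime})")
```

So a 25% seasonal cut can flip an epidemic hovering near Rt = 1.3 into decline, while barely denting one at Rt = 2.5, which is consistent with seasonality being outweighed by the full set of behaviors.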

One thing to note is that some of the effects of seasonality are likely mediated through behavior or the lack thereof. For example, schools opening in fall are plausibly a large part of disease spread for flu and thus maybe covid; this channel is irrelevant in places that have school closures anyway. Likewise, summer vs winter (in many countries) changes where and how people interact with each other. There are also countervailing factors I don't know enough about, like maybe hotter weather making it less palatable to wear masks, or especially hot/cold weather interfacing poorly with existing ventilation setups.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-03T03:12:15.141Z · score: 9 (5 votes) · EA · GW

Meta: Wow, thanks a lot for these questions. They're very insightful and have made me think a lot, please keep the questions (and voting on them) coming! <3

It turns out I had some prior social commitments on Sunday that I forgot about, so I'm going to start answering these questions tonight plus Saturday, and maybe Friday evening too.

But *please* don't feel discouraged from continuing to ask questions; reading them has been a load of fun, and I might keep answering things for a while.

Comment by linch on A quick and crude comparison of epidemiological expert forecasts versus Metaculus forecasts for COVID-19 · 2020-07-02T11:32:06.948Z · score: 7 (4 votes) · EA · GW

UPDATE: With more data, Metaculus users have now done better again.

Comment by linch on EA Forum feature suggestion thread · 2020-07-01T08:38:46.705Z · score: 2 (1 votes) · EA · GW

Triple sounds approximately right to me in terms of relative weighting.

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-01T04:45:11.256Z · score: 6 (4 votes) · EA · GW

I understood your intent! :) I actually plan to answer the spirit of your question on Sunday, just decided to break the general plan to "not answer questions until my official AMA time" because I thought the caveat was sufficiently important to have in the open!

Comment by linch on I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA · 2020-07-01T02:12:59.200Z · score: 29 (11 votes) · EA · GW

Hey, I want to give a more directly informative answer later, but since this might color other people's questions too: I just want to flag that I don't think I'm a better forecaster than all the 989+ people below me on the leaderboards, and I also would not be surprised if I'm better than some of the people above me on the leaderboard. There are several reasons for this:

  • Reality is often underpowered. While medium-term covid-19 forecasting is less prone to this issue than many other EA questions, you still have a bunch of fundamental uncertainty about how good you actually are. Being correct on one question often relies on a "bet" that's loosely correlated with being correct on another question. At or near the top, there are not enough questions for you to tell whether you just got lucky in a bunch of correlated ways that others slightly below you in the ranks got unlucky on, or whether you are actually more skilled. The differences are things like whether you "called" it correctly at 90% when others put 80%, or conversely whether you were sufficiently calibrated at 70% when others were overconfident (or just unlucky) at 90%. (The simulation sketch after this list illustrates the point.)
  • Metaculus rankings are a composite measurement of both activity and accuracy (all forecasting leaderboards have to be this way; otherwise the top 10 would be dominated by people who are overconfident and right on a few questions). For all I know, people who answer <10 covid-19 questions on Metaculus are actually amazing forecasters who just chose a different platform after a brief trial, or who almost always answer non-covid questions on Metaculus.
  • Garden of forking paths in what selection criteria you choose. For example, I'm below the top 50 on Metaculus overall (my excuse is I only joined 4 months ago) and below the top 20 on some specific covid-19 subtournaments (though I'm also higher than 10 on others; my excuse is that I didn't participate in a lot of those questions). But you only get so many excuses, and it's hard to pick a fully neutral prior.
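Here is the simulation sketch promised above: a minimal toy model (all parameters invented) showing that forecasters with identical skill still spread out on a leaderboard when scored on a limited number of questions:

```python
import random

random.seed(0)

N_FORECASTERS, N_QUESTIONS = 1_000, 50

true_probs = [random.uniform(0.1, 0.9) for _ in range(N_QUESTIONS)]
outcomes = [random.random() < p for p in true_probs]  # each question resolves once

def clip(x: float) -> float:
    return min(0.99, max(0.01, x))

scores = []
for _ in range(N_FORECASTERS):
    # Identical expected skill: everyone forecasts the truth plus small independent noise.
    forecasts = [clip(p + random.gauss(0, 0.05)) for p in true_probs]
    scores.append(sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / N_QUESTIONS)

scores.sort()
print(f"Brier: best {scores[0]:.3f}, median {scores[500]:.3f}, worst {scores[-1]:.3f}")
# Identical skill, yet the 'leaderboard' still has clear winners and losers.
```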

At worst, I can only be moderately confident that I'm somewhat above average at medium-term predictions of a novel pandemic, though I also don't want to be falsely humble. My best guess is that a) my track record reflects more underlying skill than that lower bound suggests, and b) there's some significant generalizability.

A lifetime ago, when I interviewed Liv Boeree about poker and EA, one thing she said really stuck with me. (I was dumb and didn't include it in the final interview, so hopefully I didn't butcher this rephrase completely.) Roughly speaking: in poker, among professionals, the true measure of poker skill isn't how much money you make (because poker is both a high-skill and high-luck game, and there's so much randomness); rather, a better measure is the approval of your peers who are professional poker players.

This was really memorable to me because I'd always had the impression of poker as an extremely objective game with a very clear winning criterion (I guess, from the outside, so is forecasting). If you can't even have a clear and externally legible metric for poker, what hope does anything significantly fuzzier have?

That said, I do think this is a question of degree rather than kind. I think the rankings are an okay proxy for minimal competence. You probably shouldn't trust forecasters too much (at least in their capacity as forecasters) who are below the 50th percentile in the domain they are asked to forecast on, and maybe not those below the 90th percentile either, unless there are strong countervailing reasons.

Comment by linch on Max_Daniel's Shortform · 2020-06-29T08:23:54.890Z · score: 4 (2 votes) · EA · GW

You're probably aware of this, but Anders Sandberg has done some thinking about this. Also presumably David Roodman, based on his public writings (though I have not contacted him myself).

More broadly, I'm guessing that anybody you've referenced above, or whom I've linked in my doc, would be helpful, though of course many of them are very busy.

Comment by linch on Max_Daniel's Shortform · 2020-06-26T23:37:38.959Z · score: 14 (3 votes) · EA · GW

I did some initial research/thinking on this before the pandemic came and distracted me completely. Here's a very broad outline that might be helpful.

Comment by linch on What coronavirus policy failures are you worried about? · 2020-06-22T01:12:05.986Z · score: 11 (5 votes) · EA · GW

(This is well outside my area of expertise. To the extent I have a comparative advantage in covid-y matters, it is almost entirely in medium-term predictions <6 months out. I am also entirely thinking of "biosecurity" as preventing pandemics as bad as covid-19, or up to 1-2 orders of magnitude worse. I don't have a strong sense of what preventing existential GCBRs will look like, though I would love to learn, and I imagine there are probably many subtle considerations that I have missed.)

The framing of this question suggests to me that you are looking for policy failures from overreacting to covid-19. I think this is a very serious concern.

However, one thing I want to flag is that it's not obvious to me that failures from overreacting are necessarily worse than failures from underreacting. I'm personally roughly 50-50 on this proposition.

Several considerations in favor of thinking underreaction is the default:

1. Early on in the pandemic (say before mid-late March), I'm aware of many groups underreacting (basically all of the geographical West, including Latin America). There are some governments and institutions that reacted well (shout-outs: Mongolia, Seattle Flu Study), but I'm not aware of any institutions severely overreacting early on in the pandemic, other than a) maybe tiny groups like the Bay Area rationalist community, and b) debatably a few national governments like India (even there it's not clear to me that it's obviously worse than underreacting).

2. At a high level, I think what happens is something like regularization: governments and other institutions by default have something like a "target reaction level." This causes them to overreact to small events but underreact to large events. Since small events happen much more often than large events, institutions will appear to usually overreact, and the received wisdom becomes that governments overreact. However, when reactions really matter, they will usually be underreacting.

(Note that this is subtly different from claiming that this is an unavoidable error, in that in the absence of clairvoyance you always either underreact or overreact; instead, I claim that even in situations where something is predictably much worse than anything you've seen before, you, institutionally, will still be biased towards underreaction.)

3. The history of pandemics (at least for events large enough to enter a casual reading of the "history of pandemics") is mostly one where governments institutionally underreact, eg by focusing on assuaging fears rather than a) communicating honestly or b) focusing entirely on solving the object-level crisis.

4. While it is not exclusively the case, a lot of the academic and political Twitter that I follow appears to either a) be counterfactually fatalistic and think we couldn't have done much better, or b) be playing a "blame game" where the bad reactions are blamed on specific politicians or officials. If a) is the case, then we did the best we could, and we should be worried about overupdating on this pandemic and thus making worse decisions in the next one. If b) is the case, we likewise should not change much except which leaders we elect or otherwise select to be in charge of public health issues.

However, I find this dubious because I think most of the West (geographically, not culturally. New Zealand seems to have done fine) has done substantially worse than East Asia. While the decisions of specific politicians and bureaucrats certainly exacerbated this pandemic in predictably poor ways, the fact that most of the West has not handled the pandemic well means I don't think it's fair to pin most of this on individuals. Instead, it might be instructive to (possibly in private) consider institutional and cultural issues.

____

With all that said, why do I think underreaction is not more likely than overreaction?

1. Most of all, this just seems like a really hard question that I feel woefully ignorant on. There are a lot of important considerations I haven't thought through in detail (and many I probably didn't even consider).

2. It intuitively makes sense to me that governments (especially elected officials and political appointees) will overreact/overindex on the particulars of a specific past crisis. A common saying is that "generals are always preparing for the last war."

3. Even if you buy my argument that gov'ts and institutions underreacted in the beginning of a pandemic, it does not necessarily follow that they will underreact at the end. An analogy I like is to consider an inexperienced driver. The driver might be slow to realize that they should stop, but by the time they do, they will rapidly slam on the brakes even when it's unnecessary to do so.

I've studied the end stages of pandemics much less than their beginnings (and even that is quite limited). For future work, it might be helpful to a) look at historical examples of recoveries after pandemics and other major crises, and thus form a well-informed outside view of whether we're more likely to see underreaction or overreaction, and b) carefully consider the details of the specific proposals in question, to think through whether they help or hinder biosecurity and other health efforts.

This answer should be seen as a general case for agnosticism and fighting against potential biases towards conservatism, rather than positively advancing a case for any specific predictable policy failures from underreaction.

Comment by linch on How to make the most impactful donation, in terms of taxes? · 2020-06-15T07:29:15.219Z · score: 4 (2 votes) · EA · GW

Am I correct in assuming that you're not in a state with a large state income tax?

Comment by linch on Will protests lead to thousands of coronavirus deaths? · 2020-06-06T18:33:52.609Z · score: 2 (1 votes) · EA · GW

lol I thought that 10! was a surprise, rather than a factorial...

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-29T15:17:23.981Z · score: 2 (1 votes) · EA · GW

No worries! :)

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-29T02:45:35.949Z · score: 2 (1 votes) · EA · GW

Why was this comment downvoted? :O

Comment by linch on [Job ad] Lead an ambitious COVID-19 forecasting project [Deadline: June 1st] · 2020-05-28T21:09:50.672Z · score: 4 (2 votes) · EA · GW

I'm one of the forecasters/generalists on this project. Feel free to PM me or AMA here.

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-28T19:32:14.638Z · score: 3 (2 votes) · EA · GW

Okay, here's the conclusion of the paper (emphasis mine):

To date, studies on social violence, hate and disease have focused on less than a handful of pandemics – drawing parallels at times between the Black Death and cholera, in other places between syphilis and A.I.D.S., and on occasion two or three other diseases. No one has gone beyond these few pandemics to chart comparatively the patterns of disease and hate. No one has compared the levels of violence or intensity of hate with different pandemics in different places and periods; instead, epidemics’ potential for hate has been levelled, so that blaming, perhaps but not necessarily implicit in popular names given to diseases, is placed on the same plane as the genocide of Jews across vast regions of Europe during the Black Death and again of Eastern Jews with twentieth-century typhus. Furthermore, no one has factored to what extent certain characteristics of diseases – rates of mortality, rates of fatality, quickness of death, newness of a disease, mysterious causes, degrees of contagion, gruesomeness and horror of signs and symptoms – determine whether a wave of collective hate and violence is likely to ensue. Instead, both in the popular imagination and the scholarly literature, violent hatred and even pogroms are held to have been pandemics’ normal course, supposedly engrained in timeless mental structures – to use René Baehrel’s words again, ‘certaines structures mentales, certaines constantes psychologiques’.123 Further examples of such scholarly opinion can easily be provided,124 but were these the constant consequences of epidemics? According to my survey thus far, they were not: the Black Death, typhus in late nineteenth- and twentieth-century eastern Europe, plague in the sixteenth and seventeenth centuries (although only in some areas), cholera in places in the eighteen-thirties, in Italy as late as 1911, in Peru and Venezuela to the nineteen-nineties, in Haiti to today, sometimes smallpox, sometimes Yersina pestis, and perhaps to some extent A.I.D.S. in our own time were exceptions but hardly the rule. No matter how contentious the underlying social and political circumstances, how high the body counts, how gruesome the signs and symptoms, how fast or slow the spread or course of a disease, pandemics did not inevitably give rise to violence and hatred. In striking cases they in fact did the opposite, as witnessed with epidemics of unknown causes in antiquity, the Great Influenza of 1918–19 and yellow fever across numerous cities and regions in America and Europe. These epidemic crises unified communities, healing wounds cut deep by previous social, political, religious, racial and ethnic tensions and anxieties. On occasion, it is true, pandemics did split societies with accusations and violence. Historians, doctors and psychologists have yet to map when and where they happened, to measure their intensities, or to examine the complex interaction of factors to explain why some diseases were more or less persistently the exceptions. They have yet to raise the questions within a comparative framework of world epidemics.125 It is now time to construct the databases of disease and hate.

The rest of the paper documents many epidemics he looked at (going back to Athens and Rome before the common era) which did not end in the expected violence. I think I now have a more balanced view of whether it's possible for excess panic to cause major problems during a pandemic, and I'm still surprised that there isn't more of it.

I'd like to see

a) a study of how much those incidences of outgroup violence were primarily a result of panic (as opposed to, eg, opportunism, since during the Black Death Christians appeared at least as invested in confiscating the possessions of Jewish people), and

b) a similarly comprehensive study of how many epidemics/pandemics resulted in bad things happening from insufficient worry or officials hiding information.

I also weakly suspect that some cases of a) and b) are tied together, eg excess panic/panic synchronization happening because officials lied about the situation earlier on.

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-28T19:30:42.706Z · score: 2 (1 votes) · EA · GW

Yeah perhaps I should be less credulous?

Comment by linch on Some thoughts on deference and inside-view models · 2020-05-28T19:10:34.146Z · score: 16 (8 votes) · EA · GW

Epistemic status: grappling with something confusing. May not make sense.

One thing that confuses me is whether we should just be willing to "eat that loss" in expectation. I think most EAs agree that individuals should be somewhat risk-seeking in, eg, career choice, since this allows the movement to have a portfolio. But maybe there are correlated risks the movement has (for example, if we're wrong about Bayesian decision theory, say, or meta-philosophy concepts like preferring parsimony) that we basically can't de-risk without cutting a lot into expected value.

An analogy is startups. Startups implicitly have to take on some epistemic (and other) risks about the value of the product, the vision for team organization being good, etc. VCs are fine with funding off-shoot ideas as long as their portfolio is good (lots of startups with relatively uncorrelated risks).

So maybe in some ways we should think of the world as a whole as having a portfolio of potential do-gooder social movements, and we should just try our best to have the best movement we can under our movement's assumptions.

Another analogy is the Hundred Schools of Thought era in China, where at least one school of thought had important similarities to ours. That school (Mohism) did not end up winning, for reasons that are not necessarily the best according to our lights. But maybe it was a good shot anyway, and if they had compromised too much on their values or epistemology, they wouldn't have produced much value.

This is what confuses me when people like Will MacAskill talk about EA being a new ethical revolution. Should we think of an "EA ethical revolution" as the default outcome as long as we work really hard at it, something we can de-risk and still achieve? Or is the implicit assumption that we should think of ourselves as a startup that is one of the world's bets (among many) for achieving an ethical revolution?

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-28T15:42:25.950Z · score: 4 (3 votes) · EA · GW

That said, when I searched for "pogroms during epidemics," this paper claims that after the first Black Death, there wasn't much evidence for plague-based pogroms and other outgroup violence, even during subsequent plagues.

Comment by linch on Are there historical examples of excess panic during pandemics killing a lot of people? · 2020-05-28T14:40:53.337Z · score: 4 (3 votes) · EA · GW

This does seem unusually bad, so it would qualify. Strongly upvoted. This makes me more sympathetic to people who were claiming that anti-Chinese xenophobia was the biggest problem with the novel coronavirus, though I still think they made the wrong call, even ex ante.

I'm fine with examples from relatively early historical pandemics, because the current situation is an unusually large upheaval compared to, say, the Hong Kong flu; to get a historical sense of what could happen, we need more examples of "unusually fast+large upheavals in history." I think early on, people (including myself) were over-indexing a bit on "recent epidemics that are less lethal" (a worse reference class) as well as on the Spanish flu (which is only one data point).

Comment by linch on Climate Change Is Neglected By EA · 2020-05-27T15:29:49.708Z · score: 17 (11 votes) · EA · GW

I concur; I disagree with the tone of kbog's comment but broadly agree with the content.

Comment by linch on Climate Change Is Neglected By EA · 2020-05-25T14:56:09.873Z · score: 16 (11 votes) · EA · GW
A year ago Louis Dixon posed the question “Does climate change deserve more attention within EA?”. On May 30th I will be discussing the related question “Is Climate Change Neglected Within EA?” with the Effective Environmentalism group. This post is my attempt to answer that question.

It's definitely possible I'm misunderstanding what you're trying to do here. However, I think it is usually not the case that an impartial assessment of a yes-or-no question finds all the relevant factors pointing in the same direction.

I mean, I don't know this for sure, but imagine you asked me to closely investigate a cause area I haven't thought much about before (wild animal suffering, say, or consciousness research, or Alzheimer's mitigation), and I investigated 10 sub-questions. I don't think all 10 of them would point the same way. My intuition is that it's much more likely I'd find either 1 or 2 overwhelming factors, or many weak arguments pointing one way and some pointing the other.

I feel bad for picking on you here. I think it is likely the case that other EAs (myself included) have historically made this mistake, and I will endeavor to be more careful about this in the future.

Comment by linch on A quick and crude comparison of epidemiological expert forecasts versus Metaculus forecasts for COVID-19 · 2020-05-10T05:10:23.894Z · score: 3 (2 votes) · EA · GW

Metaculus predictions are now featured in those surveys (yay!), so I was able to make a more direct comparison for the first survey where the predictions can be compared head-to-head.

tl;dr: Experts have broadly outperformed the aggregated Metaculus predictions; however, the differences were not exceptionally large.
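For the curious, here's a minimal sketch of the kind of scoring such a head-to-head comparison can use (the numbers below are hypothetical, not the actual survey data): mean Brier score over resolved binary questions, where lower is better.

```python
# Mean Brier score: mean squared error between stated probabilities and
# 0/1 outcomes. Lower is better; 0.25 is what always guessing 50% earns.

def brier(forecasts, outcomes):
    """Mean squared error between probabilities and binary outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

outcomes  = [1, 0, 0, 1]           # hypothetical question resolutions
experts   = [0.7, 0.2, 0.3, 0.6]   # hypothetical expert probabilities
metaculus = [0.6, 0.35, 0.3, 0.5]  # hypothetical Metaculus medians

print(brier(experts, outcomes))    # 0.095
print(brier(metaculus, outcomes))  # ~0.156 (experts do slightly better here)
```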

Comment by linch on COVID-19 in developing countries · 2020-05-01T11:46:46.799Z · score: 3 (2 votes) · EA · GW

I think ~1.1% (with fairly wide uncertainty) is a fairly realistic guess for the global IFR (including all age ranges). I basically don't buy that the balance of factors necessarily favors poorer and younger countries over richer/healthier/older ones, though it certainly is possible.

Here's a preliminary document listing why I believe this. The usual caveats of being a non-professional apply, and the tone is a bit sharper than I'd use on the EA Forum (the intended audience was other amateur forecasters, so there are certain stylistic differences, especially around caveats).

~0.1%, or even slightly lower, seems believable for the under-60s in some rich countries, but I don't think you should extrapolate age-structure arguments too strongly to novel situations (in essence, I think age is a biased estimator, whereas something like the crude death rate may not be). If you want to look at specific countries, you'd want to look at a range of known comorbidities; e.g., per capita, Nigerians die of heart disease at roughly a tenth the rate of Indians.

One thing I didn't mention in the document above: even if 0.1%-0.2% is a realistic IFR for young people in developing countries, and developing countries skew young, the full IFR in developing countries will likely still be much higher.

For example, Guayas province in Ecuador had ~11,561 excess deaths from the beginning of March to mid-April (the base rate is ~3,000 deaths in that period). My understanding is that close to all of this is directly due to covid-19 (I've talked to people from Ecuador, and if there were mass starvation or a different epidemic accounting for even 2x all-cause mortality, I'd have heard by now). The population of Guayas is ~3 million, so this is already a lower bound of ~0.39% of the entire population(!), and I really don't buy that anywhere near 100% of Guayas had been infected as of mid-April (or, more accurately, late March, to account for the lag between infection and death).
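To spell out that arithmetic (the excess-death and population figures are from above; the attack rates are assumptions of mine, purely for illustration):

```python
# Back-of-the-envelope check of the Guayas lower bound. Deaths as a share
# of the whole population are a floor on the IFR; assuming fewer people
# were infected pushes the implied IFR correspondingly higher.

excess_deaths = 11_561
population = 3_000_000

deaths_share = excess_deaths / population
print(f"{deaths_share:.2%} of the whole population")  # ~0.39%

# Implied IFR under different hypothetical attack rates (fraction infected):
for attack_rate in (1.0, 0.5, 0.25):
    infected = population * attack_rate
    print(f"attack rate {attack_rate:.0%}: implied IFR ~{excess_deaths / infected:.2%}")
# 100% -> ~0.39%; 50% -> ~0.77%; 25% -> ~1.54%
```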

Ecuador has a median age of 27.9, a life expectancy of 76.6, and a GDP per capita of $6400, so definitely not unusually old or unhealthy by middle-income country standards.

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-27T19:44:50.142Z · score: 7 (4 votes) · EA · GW

Intro to Forecasting (EA San Francisco + Stanford EA discussion event)

cross-posted from Facebook. (I will check both this thread and the FB event for comments)

Date: 2020/04/29

Time: 19:00-21:00 (PDT)

Location: Zoom videoconferencing (recommend web portal)

You might have heard of forecasting by now. Many of the cool kids* are doing it, using fancy terms like "Brier score," "Metaculus," "log-odds," "calibration," and "modeling." You might have heard of superforecasters: savvy amateurs who make robustly better forecasts on geopolitical events than trained analysts at the CIA. What you might not have learned is that these skills are eminently trainable: in the original Good Judgment Project, researchers found that a short one-hour training course robustly improved accuracy over the course of a year!
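If you're curious what a couple of those fancy terms cash out to, here's a tiny illustrative sketch (all numbers hypothetical):

```python
# Two of the terms above, in miniature: log-odds, and a crude calibration
# check comparing stated probabilities against observed frequencies.

import math

def log_odds(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

print(log_odds(0.8))  # ~1.39, i.e. 4-to-1 odds in favor

# Calibration: among all the times you said "70%", did ~70% happen?
forecasts = [0.7, 0.7, 0.7, 0.7, 0.7]  # hypothetical "70%" forecasts
outcomes  = [1, 1, 1, 0, 1]            # hypothetical resolutions
print(sum(outcomes) / len(outcomes))   # 0.8 -> slightly underconfident here
```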

Next Wednesday, EA SF is collaborating with Stanford Effective Altruism to host an introductory event on forecasting. Together, we will practice (super)forecasting techniques: the skills and ways of thinking that allowed savvy amateurs to make better forecasts on geopolitical events than trained analysts at the CIA.

Background
https://en.wikipedia.org/wiki/The_Good_Judgment_Project
https://en.wikipedia.org/wiki/Superforecasting
http://www.academia.edu/download/37711009/2015_-_superforecasters.pdf
https://aiimpacts.org/evidence-on-good-forecasting-practices-from-the-good-judgment-project-an-accompanying-blog-post/

Here are some of the topics we'll try to cover and practice in small groups, time permitting:

Base Rates: Outside View vs. Inside View
Credence References: Thinking in Credible Intervals
Controlling for Scope: Consider the probability distribution across outcomes other than those posed by the question, such as longer/shorter timeframes
Analytics: Fermi-izing, assessing signal vs. noise, and controlling for biases and fallacies
Comment: Making explicit rationales to prevent hindsight bias and share information
Compare: Explain your reasoning, benefit from viewpoint diversity, and accelerate learning
Update: Revise your forecast as new information comes in or you change your view

Structure: We'll meet together briefly to go over the details and then split into smaller groups of 3-6 members each, including a group leader. Each group will be given a discussion sheet that they can copy, and group leaders will be given an answer key.

We'll be using Zoom as our platform, as it allows the most seamless split into smaller groups. For security reasons, we recommend using the Zoom web portal over the Mac app.

We expect many of the attendees to be new to forecasting, but also several people who are very experienced forecasters and/or otherwise quite plugged into the current state of the art of forecasting. Depending on who shows up, it might also make sense to have a Q&A in addition to the small-group discussions.

As this is a virtual event, all are welcome. However, we have limited group-leader capacity, so in the (very fortunate!) world where many more people show up than we expect, some groups may be asked to nominate their own leaders instead of having an appointed one with prior experience leading EA discussions.

Hope to see you there!

*and some of the uncool kids

Comment by linch on COVID-19 in developing countries · 2020-04-27T19:14:46.836Z · score: 3 (2 votes) · EA · GW

I know the Washington Post opinion column isn't the right place to post numbers, but do you have ballpark estimates for how costly (economically and/or in terms of human toll) lockdowns will be in low-income and middle-income countries?

I do think that some people (not saying you are one!) often underestimate the human harm of covid-19 in developing countries (e.g., they'll quote widely discredited IFR numbers like 0.1%, which is basically impossible).

So it'd be helpful to do ballpark Fermi estimates for the costs of different interventions (or of not doing those interventions) versus the benefits, either for the world as a whole or for a specific country.

I can possibly help provide the modeling on the covid side, but I don't have a good grasp of the "cost" side of lockdowns at the moment.

Comment by linch on COVID-19 in developing countries · 2020-04-23T11:22:22.261Z · score: 16 (7 votes) · EA · GW

Oh man, I have lots of thoughts on this, hope to process it and have a good response in the next few days!

Initial thoughts:

1. Thank you so much for writing this and linking it on the EA Forum! I definitely think these ideas are under-explored, especially the important differences between high-income and other countries (and the heterogeneity within both groups!)

2. I wasn't sure, when you said "lockdowns are not practical in most low-income countries," what you meant by low-income. Are you referring only to "low-income countries" in the technical World Bank sense, or are you including lower-middle-income countries (like Nepal and Bangladesh, which you mention in your article), or even upper-middle-income countries like Brazil? My response to the article will be somewhat different if you are saying "I don't think it's worthwhile to attempt lockdowns in the DRC" versus "I think lockdowns are a bad idea even in places like Mexico and Brazil."

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-09T07:55:26.206Z · score: 2 (1 votes) · EA · GW

I was impressed by how high the turnout was: 34 concurrent viewers on the livestream, 100+ views, and 29 people at the follow-up Q&A Zoom call afterwards.

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-09T04:31:21.714Z · score: 2 (1 votes) · EA · GW

The talk is finished! You can view the video here.

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-09T00:53:34.548Z · score: 2 (1 votes) · EA · GW

This is happening on Facebook in an hour! Please check the FB event for more details.

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-07T02:10:38.449Z · score: 2 (1 votes) · EA · GW

We'd like to try using this forum to collect possible questions to ask Cullen. Please use this comment chain to ask and rank questions about the Windfall Clause!

(We will attempt, but not guarantee, to ask him the highest-upvoted questions from this thread that weren't covered in the talk, in order, as well as some live questions!)

Comment by linch on [Open Thread] What virtual events are you hosting that you'd like to open to the EA Forum-reading public? · 2020-04-07T01:57:55.463Z · score: 3 (2 votes) · EA · GW

What's Up With the Windfall Clause? (online EA SF event)

cross-posted from Facebook. (I will check both this thread and the FB event for comments)

Date: 2020/04/08

Time: 19:00-21:00 (PDT)

How can we ensure that the gains from Transformative AI collectively benefit humanity, rather than just the lucky few? How can we incentivize innovation in AI in a way that's broadly positive for humanity? Other than alignment, what are the distributive issues with the status quo in AI profits? What are the practical issues with different distributive mechanisms?

In this talk, Cullen O'Keefe, Research Scientist at OpenAI and Research Affiliate at FHI, will argue for the "windfall clause": in short, that companies should donate excess windfall profits from Transformative AI for the common good.

You may be interested in reading his paper summarizing the core ideas [1], or his AMA on the EA Forum [2].

This will be EA: San Francisco's inaugural online event (and only our second general event).

We're still looking into different technological options for hosting this talk, but please download the Zoom app and create a Zoom account in advance.

As this will be online, I see little reason to restrict attendance to people living in the physical Bay Area. Feel free to invite friends from all over the world (in a compatible time zone) if they'd like to attend.

Tentative schedule:
Talk: 7:00-7:25
Q&A: 7:25-8:10
Structured Mingling: 8:10-9:00.
(Details of exact schedule TBD)

For the sake of everyone's mental health, we are banning all discussions of The Disease Which Must Not Be Named.

[1] https://arxiv.org/pdf/1912.11595.pdf
[2] https://forum.effectivealtruism.org/posts/9cx8TrLEooaw49cAr/i-m-cullen-o-keefe-a-policy-researcher-at-openai-ama

Comment by linch on New Top EA Causes for 2020? · 2020-04-02T23:44:31.183Z · score: 2 (1 votes) · EA · GW

Is it too late to submit new answers?

Comment by linch on New Top EA Causes for 2020? · 2020-04-02T07:13:32.244Z · score: 5 (4 votes) · EA · GW
During a crisis, people tend to implement the preferred policies of whoever seems to be accurately predicting each phase of the problem

This seems incredibly optimistic.

Comment by linch on What are the best arguments that AGI is on the horizon? · 2020-04-02T07:04:24.450Z · score: 2 (1 votes) · EA · GW

I edited that section, let me know if there are remaining points of confusion!

Comment by linch on What are the best arguments that AGI is on the horizon? · 2020-04-02T07:01:21.614Z · score: 3 (2 votes) · EA · GW
Do you include in "People working specifically on AGI" people working on AI safety, or just capabilities?

Just capabilities (in other words, people working to create AGI), although I think the safety/capabilities distinction is less clear-cut outside of a few dedicated safety orgs like MIRI.

"bullish" in the sense of "thinking transformative AI (TAI) is coming soon"

Yes.

what do you mean by "experts not working on AGI"?

AI people who aren't explicitly thinking of AGI when they do their research (I think this correctly describes well over 90% of ML researchers at Google Brain, for example).

Why say "even"

Because it might be surprising (to people asking or reading this question while imagining long timelines) to see timelines as short as the ones AI experts believe; the second point qualifies this by noting that AGI experts believe timelines are even shorter.

In general, it looks like my language choice was more ambiguous than desirable, so I'll edit my answer to be clearer!

Comment by linch on Finding equilibrium in a difficult time · 2020-04-01T02:14:58.924Z · score: 7 (5 votes) · EA · GW

I also like this quote:

“I wish it need not have happened in my time," said Frodo.
"So do I," said Gandalf, "and so do all who live to see such times. But that is not for them to decide. All we have to decide is what to do with the time that is given us.”

J.R.R. Tolkien, The Fellowship of the Ring

Comment by linch on Ask Me Anything! · 2020-03-21T08:02:13.627Z · score: 3 (2 votes) · EA · GW

I think there's some evidence that Metaculus users, while fairly smart and well-informed, are nowhere near as knowledgeable as a fairly informed EA (perhaps including a typical reader of this forum) on the specific questions around existential and global catastrophic risks.

One example I can point to: for this question on climate change and global catastrophic risk before 2100 (which has been around since October 2018), a single not-very-informative comment from me was enough to move the community median from 24% to 10%. This suggests to me that Metaculus users did not previously have strong evidence or careful reasoning on this question, or perhaps on GCR-related thinking in general.

Now, you might think that actual superforecasters are better, but based on the comments given so far on COVID-19, I'm unimpressed. In particular, the selected comments rely on reference classes that EAs and avid Metaculus users had known to be flawed for over a week before the report came out (e.g., using China's low death toll as evidence that other countries can easily replicate it as the default scenario).

Now, COVID-19 is not an existential risk or a GCR, but it is an "out-of-distribution" problem, with clear and fast exponential growth, which makes it unlike most of the questions superforecasters are known to excel at.

Comment by linch on Are there any public health funding opportunities with COVID-19 that are plausibly competitive with Givewell top charities per dollar? · 2020-03-20T05:37:26.872Z · score: 2 (1 votes) · EA · GW

Currently at 200 million a day, though NPR says they're facing shortages of the materials used to make masks.

Comment by linch on How would crop pollination work without commercial beekeeping? · 2020-03-20T02:33:31.643Z · score: 3 (2 votes) · EA · GW

Hmm, if everybody stopped eating honey and wild bees didn't pick up the slack, then presumably farmers would instead pay for commercial beekeeping services to pollinate their fields?