I'm Linch Zhang, an amateur COVID-19 forecaster and generalist EA. AMA

post by Linch · 2020-06-30T19:35:13.376Z · EA · GW · 80 comments

I am doing an Ask Me Anything. Work and other time constraints permitting, I intend to start answering questions on Sunday, 2020/07/05 12:01PM PDT.


I am currently ranked in the top 20 (at #11) out of 1000+ forecasters on covid-19 questions on the amateur forecasting website Metaculus. I also do fairly well in other prediction tournaments, and my guess is that my thoughts command a fair amount of respect in the nascent amateur forecasting space. Note that I am not a professional epidemiologist and have very little training in epidemiology and adjacent fields, so there are bound to be considerations I miss as an amateur forecaster.

I also do forecasting semi-professionally, though I will not be answering questions related to work. Other than forecasting, my past hobbies and experiences include undergrad in economics and mathematics, a data science internship [EA(p) · GW(p)] in the early days of Impossible Foods (a plant-based meat company), software engineering at Google, running the largest utilitarian memes page on Facebook, various EA meetups and outreach projects, long-form interviews of EAs on Huffington Post, lots of random thoughts [EA · GW] on EA questions, and at one point being near the top of several obscure games.

For this AMA, I am most excited about answering high-level questions/reflections on forecasting (eg, what EAs get wrong about forecasting, my own past mistakes, outside views and/or expert deference, limits of judgmental forecasting, limits of expertise, why log-loss is not always the best metric, calibration, analogies between human forecasting and ML, why pure accuracy is overrated, the future of forecasting...), rather than doing object-level forecasts.

I am also excited to talk about interests unrelated to forecasting or covid-19. In general, you can ask me anything, though I might not be able to answer everything. All opinions are, of course, my own, and do not represent those of past, current or future employers.


Comments sorted by top scores.

comment by Stephen Clare · 2020-06-30T22:36:19.308Z · EA(p) · GW(p)

Most of the forecasting work covered in Expert Political Judgment and Superforecasting related to questions with time horizons of 1-6 months. It doesn't seem like we know much about the feasibility or usefulness of forecasting on longer timescales. Do you think longer-range forecasting, e.g. on timescales relevant to existential risk, is feasible? Do you think it's useful now, or do you think we need to do more research on how to make these forecasts first?

Replies from: MichaelA, Linch
comment by MichaelA · 2020-08-09T04:48:20.084Z · EA(p) · GW(p)

I think this is a very important question. In case you haven't seen it, here's Luke Muehlhauser’s overview of his post How Feasible Is Long-range Forecasting? (I'd also highly recommend reading the whole post):

How accurate do long-range (≥10yr) forecasts tend to be, and how much should we rely on them?

As an initial exploration of this question, I sought to study the track record of long-range forecasting exercises from the past. Unfortunately, my key finding so far is that it is difficult to learn much of value from those exercises, for the following reasons:

  1. Long-range forecasts are often stated too imprecisely to be judged for accuracy. [More]
  2. Even if a forecast is stated precisely, it might be difficult to find the information needed to check the forecast for accuracy. [More]
  3. Degrees of confidence for long-range forecasts are rarely quantified. [More]
  4. In most cases, no comparison to a “baseline method” or “null model” is possible, which makes it difficult to assess how easy or difficult the original forecasts were. [More]
  5. Incentives for forecaster accuracy are usually unclear or weak. [More]
  6. Very few studies have been designed so as to allow confident inference about which factors contributed to forecasting accuracy. [More]
  7. It’s difficult to know how comparable past forecasting exercises are to the forecasting we do for grantmaking purposes, e.g. because the forecasts we make are of a different type, and because the forecasting training and methods we use are different. [More]

We plan to continue to make long-range quantified forecasts about our work so that, in the long run, we might learn something about the feasibility of long-range forecasting, at least for our own case.

(See also the comments on the EA Forum link post [EA · GW].)

Replies from: Stephen Clare
comment by Stephen Clare · 2020-08-10T14:27:25.660Z · EA(p) · GW(p)

Yeah, I don't blame Linch for passing on this question since I think the answer is basically "We don't know and it seems really hard to find out."

That said, forecasting research does seem to have legitimately helped us get better at sussing out nonsense and improving predictions about geopolitical events. Maybe it can improve our epistemics around existential risks too. Given that there don't seem to be many other promising candidates in this space, more work to gauge the feasibility of long-term forecasting and to test different techniques for improving it seems like it would be valuable.

Replies from: Linch, MichaelA
comment by Linch · 2020-08-11T10:57:13.462Z · EA(p) · GW(p)

I agree with what you said at a high-level! Both that it's hard and that I'm bullish on it being plausibly useful.

FWIW, I still intend to answer this question eventually, hopefully before the question becomes moot!

comment by MichaelA · 2020-08-10T23:06:09.050Z · EA(p) · GW(p)

Yeah, I share the view that that sort of research could be very useful and seems worth trying to do, despite the challenges. (Though I hold that view with relatively low confidence, due to having relatively little relevant expertise.)

Some potentially useful links: I discussed the importance and challenges of estimating existential risk in my EAGx lightning talk and Unconference talk, provide some other useful links (including to papers and to a database of all x-risk estimates I know of) in this post [EA · GW], and quote from and comment on a great recent paper here [EA · GW].

I think there are at least two approaches to investigating this topic: solicit new forecasts about the future and then see how calibrated they are, or find past forecasts and see how calibrated they were. The latter is what Muehlhauser did, and he found it very difficult to get useful results. But it still seems possible there’d be room for further work taking that general approach, so in a list of history topics it might be very valuable to investigate [EA · GW], I mention:

6. The history of predictions (especially long-range predictions and predictions of things like extinction), millenarianism, and how often people have been right vs wrong about these and other things.

Hopefully some historically minded EA has a crack at researching that someday! (Though of course that depends on whether it'd be more valuable than other things they could be doing.)

(One could also perhaps solicit new forecasts about what’ll happen in some actual historical scenario, from people who don’t know what ended up happening. I seem to recall Tetlock discussing this idea on 80k, but I’m not sure.)

comment by Linch · 2020-07-15T09:10:22.534Z · EA(p) · GW(p)

Hi smclare! This is a very interesting question and I've been spending quite a bit of time mulling over it! Just want to let you know that me not answering (yet) is a result of me wanting to spend some time giving the question the gravity it deserves, rather than deliberately ignoring you!

comment by Peter Wildeford (Peter_Hurford) · 2020-06-30T23:21:53.347Z · EA(p) · GW(p)

What do you think helps make you a better forecaster than the other 989+ people?

What do you think makes the other ~10 people better forecasters than you?

Replies from: Linch, Linch
comment by Linch · 2020-07-01T02:12:59.200Z · EA(p) · GW(p)

Hey, I want to give a more directly informative answer later, but since this might color other people's questions too, I just want to flag that I don't think I'm a better forecaster than all of the 989+ people below me on the leaderboard, and I also would not be surprised if I'm better than some of the people above me on the leaderboard. There are several reasons for this:

  • Reality is often underpowered [EA · GW]. While medium-term covid-19 forecasting is less prone to those issues in comparison to many other EA questions, you still have a bunch of fundamental uncertainty about how actually good you are. Being correct for one question often relies on a "bet" that's loosely correlated with being correct on another question. At or near the top, there are not enough questions for you to be sure if you just got lucky in a bunch of correlated ways that others slightly below you in the ranks got unlucky on, vs you actually being more skilled. The differences are things like whether you "called" it correctly at 90% when others put 80%, or conversely when you were sufficiently calibrated at 70% when others were overconfident (or just unlucky) at 90%.
  • Metaculus rankings are a composite measure of both activity and accuracy (all forecasting leaderboards have to be this way, otherwise the top 10 will be dominated by people who are overconfident and right on a few questions). For all I know, people who answer <10 covid-19 questions on Metaculus are actually amazing forecasters; they just chose a different platform than Metaculus after a brief trial, or they almost always only answer non-covid questions on Metaculus.
  • Garden of forking paths in what selection criteria you choose. For example, I'm below the top 50 on Metaculus overall (my excuse is that I only joined 4 months ago) and below the top 20 on some specific covid-19 subtournaments (though I'm also higher than #10 on others; my excuse there is that I didn't participate in a lot of those questions). But you only get so many excuses, and it's hard to pick a fully neutral prior.

At worst, I can only be moderately confident that I'm somewhat above average at medium-term predictions of a novel pandemic, though I also don't want to be falsely humble. My best guess is that a) my ranking reflects some underlying skill rather than just luck, and b) there's some significant generalizability.

A lifetime ago, when I interviewed Liv Boeree about poker and EA, one thing she said really stuck with me. (I was dumb and didn't include it in the final interview, so hopefully I didn't butcher this rephrasing completely.) Roughly speaking: in poker, among professionals, the true measure of skill isn't how much money you make (because poker is both a high-skill and high-luck game, and there's so much randomness); rather, a better measure is the approval of your peers who are professional poker players.

This was really memorable to me because I had always thought of poker as an extremely objective game with a very clear winning criterion (I guess from the outside, so is forecasting). If you can't even have a clear and externally legible metric for poker, what hope does anything significantly fuzzier have?

That said, I do think this is a question of degree rather than kind. I think the rankings are an okay proxy for minimal competence. You probably shouldn't trust forecasters too much (at least in their capacity as forecasters) who are below 50th percentile in the domain they are asked to forecast on, and maybe not below 90th percentile either, unless there are strong countervailing reasons.

Replies from: Peter_Hurford
comment by Peter Wildeford (Peter_Hurford) · 2020-07-01T04:21:59.079Z · EA(p) · GW(p)

This was a lot of good discussion of epistemics, and I highly valued that, but I was also hoping for some hot forecasting tips. ;) I'll try asking the question differently. [EA(p) · GW(p)]

Replies from: Linch
comment by Linch · 2020-07-01T04:45:11.256Z · EA(p) · GW(p)

I understood your intent! :) I actually plan to answer the spirit of your question on Sunday, just decided to break the general plan to "not answer questions until my official AMA time" because I thought the caveat was sufficiently important to have in the open!

comment by Linch · 2020-07-05T07:40:58.763Z · EA(p) · GW(p)
What do you think helps make you a better forecaster than the other 989+ people?

I'll instead answer this as:

What helps you have a higher rating than most of the people below you on the leaderboard?
  • I probably answered more questions than most of them.
  • I update my forecasts more quickly than most of them, particularly in March and April.
    • Activity has consistently been shown to be one of (often, the) strongest predictors of overall accuracy in the academic literature.
  • I suspect I have a much stronger intuitive sense of probability/calibration.
    • For example, 17% (1:5) intuitively feels very different to me than 20% (1:4), and my sense is that this isn't too common.
    • This could just be arrogance, however; there isn't enough data for me to actually check this for real predictions (as opposed to just calibration games).
  • I feel like I actually have lower epistemic humility compared to most forecasters who are top 100 or so on Metaculus. "Epistemic humility" defined narrowly as "willingness to make updates based on arguments I don't find internally plausible just because others believed them."
    • Caveat is that I'm making this comparison solely to top X% (in either activity or accuracy) forecasters.
      • I suspect a fair number of other forecasters are just wildly overconfident (in both senses of the term)
      • Certainly, non-forecasters (TV pundits, say, or just people I see on the internet) frequently seem very overconfident for what seems to me like bad reasons.
      • A certain epistemic attitude that I associate with both Silicon Valley and Less Wrong/rationalist culture is "strong opinions, held lightly"
        • This is where you believe concrete, explicit and overly specific models of the world strongly, but you quickly update whenever someone points out a hole in your reasoning.
        • I suspect this attitude is good for things like software design and maybe novel research, but is bad for having good explicit probabilities for Metaculus-style questions.
  • I'm a pretty competitive person, and I care about scoring well.
    • This might be surprising, but I think a lot of forecasters don't.
    • Some forecasters just want to record their predictions publicly and be held accountable to them, or want to cultivate more epistemic humility by seeing themselves be wrong
      • I think these are perfectly legitimate uses of forecasting, and I actively encourage my friends to use Metaculus and other prediction platforms to do this.
      • However, it should not be surprising that people who want to score well end up on average scoring better.
    • So I do a bunch of things like meditate on my mistakes and try really hard to do better. I think most forecasters, including good ones, do this much less than I do.
  • I know more facts about covid-19.
    • I think the value of this is actually exaggerated, but it probably helps a little.
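The odds intuition mentioned above (17% ≈ 1:5 against, 20% = 1:4 against) is easy to check numerically. A minimal sketch, assuming nothing beyond basic arithmetic (the helper name is mine, not from any forecasting tool):

```python
def prob_to_odds_against(p: float) -> float:
    """Convert a probability into odds against (failures per success)."""
    return (1 - p) / p

# 1/6 ≈ 16.7% corresponds to roughly 5 to 1 against (i.e. "1:5")
# 0.20 corresponds to exactly 4 to 1 against (i.e. "1:4")
print(round(prob_to_odds_against(1 / 6), 2))  # 5.0
print(round(prob_to_odds_against(0.20), 2))   # 4.0
```

The odds framing makes the gap more visceral: moving from 17% to 20% is a full unit of odds, not "just three percentage points."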


What do you think other forecasters do to make them have a higher rating than you? [Paraphrased]

Okay, a major caveat here is that I think there is plenty of heterogeneity among forecasters. Another is that I obviously don't have clear insight into why other forecasters are better than me (otherwise I'd have done better!) However, in general I'm guessing they:

  • Have more experience with forecasting.
    • I started in early March, and I think many of them had already been forecasting for a year or more (some 5+ years!).
    • I think experience probably helps a lot in building intuition and avoiding a lot of subtle (and not-so-subtle!) mistakes.
  • They usually forecast more questions.
    • It takes me some effort to forecast on new questions, particularly if the template is different from other questions I've forecasted on before, and they aren't something I've thought about before in a non-forecasting context.
    • I know some people in the Top 10 literally forecast all questions on Metaculus, which seems like a large time commitment to me.
  • They update forecasts more quickly than me, particularly in May and June.
    • Back in March and April, I was *super* "on top of my game." But right now I have a backlog of old predictions, of which I'm >30 days behind on the earliest one (as in, the last time I updated that prediction was 30+ days ago).
    • This is partially due to doing more covid forecasting at my day job, partially due to having some other hobbies, and partially due to general fatigue/loss of interest (akin to others' lockdown fatigue).
  • On average, they're more inclined to do simple mathematical modeling (Guesstimate, Excel, Google Sheets, foretold etc), whereas personally I'm often (not always) satisfied with a few jotted notes on a Google Doc plus a simple arithmetic calculator.

There are also more specific reasons some other forecasters are better than me, but I don't think all or even most of the forecasters better than me have:

  • JGalt seems to read the news both more and more efficiently than I do, and probably knows much more factual information than me.
    • In particular, I recall many times when I saw interesting news on Twitter or other places, wanted to bring it to Metaculus, and bam, JGalt had already linked it ahead of me.
      • This is practically a running meme among Metaculus users that JGalt has read all the news.
  • Lukas Gloor and Pablo Stafforini plausibly have stronger internal causal models of various covid-19 related issues.
  • datscilly often decomposes questions more cleanly than me, and (unlike me and several other forecasters), appears to aggressively prioritize not updating on irrelevant information.
    • He also cares about scores more than I do.
  • I think Pablo, datscilly and some others started predicting on covid-19 questions almost as soon as the pandemic started, so they built up more experience than me not only on general forecasting, but also on forecasting covid-19 related questions specifically.

At least this is what I can gather from their public comments and (in some cases) private conversations. It's much harder for me to tell how the forecasters who are higher than me on the leaderboard but otherwise mostly silent think.

Replies from: Peter_Hurford
comment by Peter Wildeford (Peter_Hurford) · 2020-07-05T15:59:06.931Z · EA(p) · GW(p)

1.) This is amazing, thank you. Strongly upvoted - I learned a lot.

2.) Can we have an AMA with JGalt where he teaches us how to read all the news?

comment by jungofthewon · 2020-07-02T12:16:33.202Z · EA(p) · GW(p)

Non-forecasting question: have you ever felt like an outsider in any of the communities you consider yourself to be a part of?

Replies from: Linch
comment by Linch · 2020-07-03T06:18:00.149Z · EA(p) · GW(p)

Yes, I think feeling like you don't belong is just a pretty normal part of being human! So from the outside, this should very much be expected.

But specifically:

  • I think of myself as very culturally Americanized (or perhaps more accurately EA + general "Western internet culture"), so I don't really feel like I belong among Chinese people anymore. However, I also have a heavy (Chinese) accent, so I think I'm not usually seen as "one of them" among Americans, or perhaps Westerners in general.
    • I mitigate this a lot by hanging out in a largely international group. I also strongly prefer written communication to talking, especially to strangers, likely to a large part because of this reason (but it's usually not conscious).
    • I also keep meaning to train my accent, but get lazy about it.
  • I think most EAs are natives to, for want of a better word, "elite culture":
    • eg they went to elite high schools and universities,
    • they may have done regular travel in the summers,
    • most but not all of them had rich or upper-class parents even by Western standards
    • Some of them go to Burning Man and take recreational drugs
    • Some of them are much more naturally comfortable with navigating certain ambiguities of elite culture
      • This is modulated somewhat by the fairly high rates of neurodiversity in our community.
  • In contrast, I think of myself as very much an immigrant to elite culture.
  • I felt the same way at Google, because I didn't come from an elite college background and didn't major in CS.
  • I think elite culture in general, and very much EA specifically, portrays itself as aggressively meritocratic. This isn't entirely true, but the portrayal does largely reflect the underlying reality. So I feel pretty bad that I don't perceive myself as doing things that are "actually impactful", and also (to a lesser extent) that I'm not working as hard as the EAs I look up to, and producing much less.
    • There are exceptions and I work hard for things I get very obsessed by, eg. forecasting.
  • I used to culturally be much more of a gamer, but it's hard for me to identify that way anymore except to people who don't game.
    • I still play games, but (this is hard to describe well) I don't feel like I'm culturally a gamer anymore.
    • I'm not sure how to describe it, but I feel like it's something about gamers usually either
      • tie a lot of their identity to gaming and game a lot, or
      • tie a lot of their identity to gaming and game when they can, but have large family or work commitments that makes it infeasible to be gaming a lot.
    • And I guess I feel like neither really applies to me, exactly?
    • Even though I think this is for the best, I still am somewhat sad about losing this part of my identity. Board games with EAs help scratch the urge to play games, but not the culture.
  • I feel like there's a sense in which, even though I haven't done much original research in it, I've contributed enough secondary ideas/communication to philosophy, and understand enough academic philosophy, that I weakly belong at the edges of that community. However, I don't think the philosophers feel this way! 😅
    • Which is not to say I don't have friends in philosophy, or that they don't respect me, to be clear!
  • Surprisingly, I do not recall feeling like an imposter in the forecasting community.

But at a higher level, I definitely feel like all the communities I think of myself as fully part of (internet culture, college, EA, workplaces, amateur forecasting, etc.) have largely accepted me with open arms, and I'm grateful for that. I also think my emotions are reasonably calibrated here, and I don't have an undue level of imposter syndrome.

comment by WilliamKiely · 2020-06-30T23:55:31.997Z · EA(p) · GW(p)

Is forecasting plausibly a high-value use of one's time if one is a top-5% or top-1% forecaster?

What are the most important/valuable questions or forecasting tournaments for top forecasters to forecast or participate in? Are they likely questions/tournaments that will happen at a later time (e.g. during a future pandemic)? If so, how valuable is it to become a top forecaster and establish a track record of being a top forecaster ahead of time?

Replies from: Linch
comment by Linch · 2020-07-07T01:33:11.586Z · EA(p) · GW(p)
Is forecasting plausibly a high-value use of one's time if one is a top-5% or top-1% forecaster?

Yes, it's plausible.

What are the most important/valuable questions or forecasting tournaments for top forecasters to forecast or participate in?

My sense is that right now there's a market mismatch: an oversupply of high forecasting talent relative to direct demand, i.e. actual willingness and ability to use said talent. I'm not sure why this is; intuitively, there are so many things in the world where a higher-precision understanding of our uncertainty would be extremely helpful.

One thing I'd love to do is help figure out how to solve this and find lots of really useful things for people to forecast on.

comment by alexrjl · 2020-06-30T20:58:22.974Z · EA(p) · GW(p)

Here's a ton of questions; pick your favourites to answer. What's your typical forecasting workflow like? Subquestions:

  • Do you tend to make guesstimate/elicit/other models, or mostly go qualitative? If this differs for different kinds of questions, how?

  • How long do you spend on initial forecasts and how long on updates? (Per question and per update would both be interesting)

  • Do you adjust towards the community median and if so how/why?

More general forecasting:

  • What's the most important piece of advice for new forecasters that isn't contained in Tetlock's Superforecasting?

  • Do you forecast everyday things in your own life other than Twitter followers?

  • What unresolved question are you furthest from the community median on?

comment by NunoSempere · 2020-06-30T20:54:42.966Z · EA(p) · GW(p)
  • If you look at your forecasting mistakes, do they have a common thread?
  • How is your experience acquiring expertise at forecasting similar/different to acquiring expertise in other domains, e.g. obscure board-games? How so?
  • Any forecasting resources you recommend?
  • Who do you look up to?
  • How does the distribution skill / hours of effort look for forecasting for you?
  • Do you want to wax poetic or ramble disorganizedly about any aspects of forecasting?
  • Any secrets of reality you've discovered & which you'd like to share?
Replies from: Linch, Linch
comment by Linch · 2020-07-03T07:23:56.423Z · EA(p) · GW(p)
If you look at your forecasting mistakes, do they have a common thread?

A botched Tolstoy quote comes to mind:

Good forecasts are all alike; every mistaken forecast is wrong in its own way.

Of course that's not literally true. But when I reflect on my various mistakes, it's hard to find a true pattern. To the extent there is one, I'm guessing that the highest-order bit is that many of my mistakes are emotional rather than technical. For example,

  • doubling down on something in the face of contrary evidence,
  • or at least not updating enough because I was arrogant,
  • getting burned that way and then updating too much from minor factors
  • "updating" from a conversation because it was socially polite to not ignore people rather than their points actually being persuasive, etc.

If the emotion hypothesis is true, then to get better at forecasting, the most important thing might well be looking inwards, rather than, say, a) learning more statistics or b) acquiring more facts about the "real world."

Replies from: Davidmanheim
comment by Davidmanheim · 2020-07-03T09:47:47.669Z · EA(p) · GW(p)

I think that as you forecast different domains, more common themes can start to emerge. And I certainly find that my calibration is off when I feel personally invested in the answer.

And re:

How does the distribution skill / hours of effort look for forecasting for you?

I would say there's a sharp cutoff in terms of needing a minimal level of understanding (which seems to be fairly high, but certainly isn't above, say, the 10th percentile). After that, it's mostly effort, and skill that is gained via feedback.

comment by Linch · 2020-07-05T08:52:21.822Z · EA(p) · GW(p)
How is your experience acquiring expertise at forecasting similar/different to acquiring expertise in other domains, e.g. obscure board-games? How so?

Just FYI, I do not consider myself an "expert" on forecasting. I haven't put my 10,000 hours in, and my inside view is that there's so much ambiguity and confusion about so many different parameters. I also basically think judgmental amateur forecasting is a nascent field and there are very few experts[1], with the possible exception of the older superforecasters. Nor do I actually think I'm an expert in those games, for similar reasons. I basically think "amateur, but first (or 10th, or 100th, as the case might be) among equals" is a healthier and more honest presentation.

That said, I think the main commonalities for acquiring skill in forecasting and obscure games include:

  • Focus on generalist optimization for a well-specified score in a constrained system
    • I think it's pretty natural for both humans and AI to do better in more limited scenarios.
    • However, I think in practice, I am much more drawn to those types of problems than my peers (eg I have a lower novelty instinct and I enjoy optimization more).
  • Deliberate practice through fast feedback loops
    • Games often have feedback loops on the order of tens of seconds/minutes (Dominion) or hundreds of milliseconds/seconds (Beat Saber)
    • Forecasting has slower feedback loops, but often you can form an opinion in <30 minutes (sometimes <3 if it's a domain you're familiar with), and have it checked in a few days.
    • In contrast, the feedback loops for other things EAs are interested in are often much slower. For example, an initial research project might take months and only be checked over a span of years; an architecture decision in software engineering might take days to make and weeks to check (and sometimes it is never checked)
  • Focus on easy problems
      • For me personally, it's often easier to get "really good" at less-contested domains than kinda good at very contested domains
      • For example, I got quite good at Dominion but I bounced pretty quickly off Magic, and I bounced (after a bunch of frustration) off chess.
      • Another example: in Beat Saber rather than trying hard to beat the harder songs, I spent most of my improving time on getting very high scores for the easier songs
      • In forecasting, this meant that making covid-19 forecasts 2-8 weeks out was more appealing than making geopolitical forecasts on the timescale of years, or technological forecasts on the timescale of decades
    • This allowed me to slowly and comfortably move into harder questions
      • For example now I have more confidence and internal models on predicting covid-19 questions multiple months out.
      • If I were to get back into Beat Saber, I'd be a lot less scared of the harder songs than I used to be (after some time ramping back up).
    • I do think not being willing to jump into harder problems directly is something of a character flaw. I'd be interested in hearing other people's thoughts on how they do this.

The main difference, to me is that:

  • Forecasting relies on knowledge of the real world
    • As opposed to games (and, for that matter, programming challenges), the "system" that you're forecasting on is usually much more unbounded.
    • So knowledge acquisition and value-of-information is much more important per question
    • This is in contrast to games, where knowledge acquisition is important on the "meta-level" but for any specific game,
      • balancing how much knowledge you need to acquire is pretty natural/intuitive.
      • and you probably don't need much new knowledge anyway.

[1] For reasons I might go into later in a different answer

comment by Peter Wildeford (Peter_Hurford) · 2020-06-30T23:22:34.799Z · EA(p) · GW(p)

What do EAs get wrong about forecasting?

Replies from: Linch
comment by Linch · 2020-07-16T16:53:01.055Z · EA(p) · GW(p)

I think the biggest is that EAs (definitely including myself before I started forecasting!) often underestimate the degree to which judgmental forecasting is very much a nascent, pre-paradigm field. This has a lot of knock-on effects, including but not limited to:

  • Thinking that the final word on forecasting is the judgmental forecasting literature
    • For example, the forecasting research/literature is focused entirely on accuracy, which has its pitfalls [EA(p) · GW(p)].
    • There are many fields of human study that do things like forecasting, even if it's not always called that, including but not limited to:
      • Weather forecasting (where Brier score came from!)
      • Intelligence analysis
      • Data science
      • Statistics
      • Finance
      • some types of consulting
      • insurance/reinsurance
      • epidemiology
      • ...
        • More broadly, any quantified science needs to make testable predictions
  • Over-estimating how much superforecasters "have it figured out"
  • Relatedly, overestimating how much other good forecasters/aggregation platforms have things figured out.
    • For example, I think some people over-estimate the added accuracy of prediction markets like PredictIt, or aggregation engines like Metaculus/GJO, or that of top forecasters there.
      • PredictIt in particular basically seems safe to ignore compared to expert models like 538's.
  • Thinking that there's "one right way" to do forecasting
    • If there is, I sure haven't found it!
    • I think there's a lot of prescientific experimentation going on while people are still trying to figure out the right experiments to do, the right questions to ask, etc., when it comes to advancing the science of forecasting.
  • Thinking that superforecasting/associated techniques are used a lot in government and business
    • It's not.
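A quick aside on the Brier score mentioned above: it is just the mean squared error of probabilistic forecasts against binary outcomes, which makes it easy to compute on your own track record. A minimal sketch, with made-up forecasts purely for illustration:

```python
def brier_score(forecasts):
    """Mean squared error of probabilistic forecasts against binary outcomes.
    0 is perfect; always saying 50% scores exactly 0.25."""
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# Made-up track record: (stated probability, did it happen?)
print(round(brier_score([(0.9, 1), (0.2, 0), (0.6, 1)]), 4))  # 0.07
```

Note that the Brier score, like log-loss, measures only accuracy/calibration; it says nothing about the other parts of the forecasting pipeline discussed in this thread.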
comment by henrycooksley · 2020-06-30T20:34:43.236Z · EA(p) · GW(p)

so you've done quite a few different things - right now, would you rather go into research, or entrepreneurship, and why?

comment by AronM · 2020-07-01T04:17:33.255Z · EA(p) · GW(p)

I would like to hear your thoughts on Generalist vs Specialist debate.

    • Advice for someone early as a generalist?
    • Did you stumble upon these different fields of interest by your own or did you surround yourself with smart people to get good understandings of various fields?
    • Thoughts on impact comparisons? (Eg can a generalist maybe bring knowledge/wisdom from intuitively non-adjacent disciplines into a project and help advance it?)
    • What skills are you lacking, or which ones would you like to acquire to become a "Jack of all trades"?
    • Are you even aiming to become even more of a generalist? Yes or no - please elaborate.
Replies from: Linch
comment by Linch · 2020-07-16T16:38:23.426Z · EA(p) · GW(p)

Hmm, this doesn't answer any of your questions directly, but might be helpful context to set: My impression is that relatively few people actually set out to become generalists! I think it's more accurate to think of some people as being willing to do what needs to get done (or doing things they find interesting, or that have high exploration value, or any of a myriad of other reasons). And if those things keep seeming like highly impactful things to do (or continue to be interesting, have high learning/exploration value, etc.), they keep doing them, and then eventually become specialists in that domain.

If this impression is correct, specialists start off as generalists who eventually specialize more and more, though when they start specializing might vary a lot (some people continue to be excited about the first thing they tried, and so are set on their life path by the time they're 12; others might have tried 30 different things before settling on the right one).

(I obviously can't speak for other EAs; these are just my own vague impressions. Don't take it too seriously, etc)

Advice for someone early as a generalist?

Hmm, I don't feel too strongly about this, but maybe be really inquisitive and questioning of received wisdom? 80,000 Hours doesn't have a monopoly on truth of what jobs are impactful! The same (to a lesser degree) is true for GiveWell and global health charities, or FHI and what is good for the future of humanity.

Often, in any given case they'll have more experience in the topic and have spent more time on it, so you should probably take their advice under consideration, but 1) you are a domain expert in being you, so there are always gaps and pieces of information that others will miss when trying to give you broadly applicable advice and 2) ultimately your life is your own, and you should probably take charge of your own life plans and epistemology.

(For what it's worth, I don't think this is in-principle always true. I can imagine a world very much like ours where specialization is a lot better and the expert careers advice people are just much better than individuals at figuring out what careers are best for them, etc. I just think the world we live in is very far from that).

Did you stumble upon these different fields of interest by your own or did you surround yourself with smart people to get good understandings of various fields?

From the inside, it often feels like I just get interested in things and then explore them! But I don't think introspection is the most reliable source on these things. For example, if I had different friends, I'd undoubtedly have different interests, and be subconsciously guided to different things to focus on. There are also other unconscious biases, for example it is likely that I spend more time on interesting things that I perceive to be higher status among my friends.

I think once I do get interested in stuff, I'm often very willing to ping others to learn about it, eg, to ask dumb questions. For example I probably had at least 20 discussions about programming before my first programming job, and I had calls with Daniel Filan (friend and Metaculus moderator) and David Manheim (superforecaster) before deciding on spending increasingly high amounts of time on forecasting. For specific questions I'm confused about, I also often ping other amateur forecasters, or occasionally biosecurity people I know in EA. I also know a few medical students (and, less closely due to the age of my peer group, medical doctors) who I sometimes ask questions of, but relatively few of the questions I'm interested in are directly related to treatment.

Thoughts on impact comparisons?

Too soon to tell.

can a generalist maybe bring knowledge/wisdom from intuitively non-adjacent disciplines into a project and help advance it?)

Yeah I think there are plenty of examples in history where lots of insights are gained from creatively applying fairly standard models in one field to another seemingly disparate field. Admittedly I think this as someone who's NOT a historian (or amateur historian), so I don't have a good idea of the true base rates.

Being a generalist is also not a free win. For example I think there's little stopping me from becoming a fairly good programmer, but obviously I wouldn't be as good at programming compared to if I started programming when I was 12. Something similar will be true for mathematics, or law, or entrepreneurship, or any number of other plausibly impactful things.

It's also easier for other people to interact with you if they think you're a specialist that they can peg easily.

What skills are you lacking \ or which ones would you like to acquire to become a "Jack of all trades"?

Hmm. I think there's a bunch of skills in the general domain of "presentation" that I'm pretty bad at, and meant to improve this year, but the pandemic has become my excuse to not worry about it. For example, improving my accent, having nicer clothes, better general grooming, etc. The pandemic renders a lot of these things moot, both a) because there seem to be better things to do in the short term, and b) because the default way to communicate is text plus Zoom calls, so I think people subconsciously judge on those things a lot less.

More broadly, better "people skills" seems pretty helpful, though at the moment I think I have not defined what I'm bad at well enough for me to train a lot in that domain, other than presentation.

I'd also like to get better at math, history, and generalist research.

I think I'd like to get better at introspection and meta-cognition of my own feelings, though I'm not meta-cognitive enough to know if this would actually be really helpful (it's a vicious cycle!).

It'd be great if I can get really good at doing work consistently just because the work needs to be done at a high level, rather than because I'm personally in-the-moment excited about it.

Are you even aiming to become even more of a generalist? Yes or no - please elaborate.

I think my first paragraph might be a good answer to this? I roughly think I'll keep looking for things to do until I find something to "settle down" and specialize in (which might be in 6 months or 6 years).

Caveat everything by a decent margin. It sure feels weird for me to be giving life advice as someone in my twenties. I'm reminded of this line I read recently.

“They made us all look like complete geniuses! There’s a lot to be learned from the wisdom of age,” says a man who is 24.
comment by WilliamKiely · 2020-06-30T23:59:25.149Z · EA(p) · GW(p)

I vaguely recall hearing something like 'the skill of developing the right questions to pose in forecasting tournaments is more important than the skill of making accurate forecasts on those questions.' What are your thoughts on this and the value of developing questions to pose to forecasters?

Replies from: Linch
comment by Linch · 2020-07-03T07:13:30.326Z · EA(p) · GW(p)

Yeah I think Tara Kirk Sell mentioned this on the 80k podcast. I think I mostly agree, with the minor technical caveat that if you're trying to get people to forecast numerical questions, getting the ranges exactly right matters more when you have buckets (like in the JHU Disease Prediction Project that Tara ran, and Good Judgement 2.0), but asking people to forecast a distribution (like in Metaculus) allows the question asker to be more agnostic about ranges. Though the specific thing I would agree with is something like:

at current margins, getting useful forecasts out is more bottlenecked by skill in question operationalization than by judgmental forecasting skill.

I think other elements of the forecasting pipeline plausibly matter even more, which I talked about in my answer to JP's question [EA(p) · GW(p)].

Replies from: Davidmanheim
comment by Davidmanheim · 2020-07-03T09:59:38.272Z · EA(p) · GW(p)

"The right question" has 2 components. First is that the thing you're asking about is related to what you actually want to know, and second is that it's a clear and unambiguously resolvable target. These are often in tension with each other.

One clear example is COVID-19 cases - you probably care about total cases much more than confirmed cases, but confirmed cases are much easier to use as a resolution criterion. You can make more complex questions to try to deal with this, but that makes them harder to forecast. Forecasting excess deaths, for example, gets into whether people are more or less likely to die in a car accident during COVID-19, and whether COVID reduction measures also blunt the spread of influenza. And forecasting retrospective population percentages that are antibody positive runs into issues with sampling, test accuracy, and the timeline for when such estimates are made - not to mention relying on data that might not be gathered as of when you want to resolve the question.

comment by JP Addison (jpaddison) · 2020-06-30T20:24:27.593Z · EA(p) · GW(p)

Can you give your reflections on the limits of expertise?

Replies from: Stefan_Schubert
comment by Stefan_Schubert · 2020-06-30T21:30:33.014Z · EA(p) · GW(p)

Relatedly, on the nature of expertise. What's the relative importance of domain-specific knowledge and domain-general forecasting abilities (and which facets of those are most important)?

comment by Peter Wildeford (Peter_Hurford) · 2020-07-01T04:22:51.678Z · EA(p) · GW(p)

What should a typical EA who is informed on the standard forecasting advice do if they actually want to become good at forecasting? What did you do to hone your skill?

Replies from: Linch
comment by Linch · 2020-07-03T05:35:43.468Z · EA(p) · GW(p)

My guess is to just forecast a lot! The most important part is probably just practicing a lot and evaluating how well you did.

Beyond that, my instinct is that the closer you can get to deliberate practice, the more you can improve. My guess is that there are multiple desiderata that are hard to satisfy all at once, so you do have to make some tradeoffs between them.

  • As close to the target domain of what you actually care about as possible. For example, if you care about having accurate forecasts on which psychological results are true, covid-19 tournaments or geo-political forecasting are less helpful than replication markets.
  • Can answer lots of questions and have fast feedback loops. For example, if the question you really care about is "will humans be extinct by 3000 AD?" you probably want to answer a bunch of other short term questions first to build up your forecasting muscles to actually have a better sense of these harder questions.
  • Can initially be easy to evaluate well. For example, if you want to answer "will AI turn out well?" it might be helpful to answer a bunch of easy-to-evaluate questions first and grade them.

In case you're not aware of this, I think there's also some evidence that calibration games, like Open Phil's app, are pretty helpful.
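Calibration itself is easy to measure on your own record: bucket past forecasts by stated probability and compare each bucket's average forecast against the observed frequency. A rough sketch (the track record below is invented purely for illustration):

```python
from collections import defaultdict

def calibration_report(forecasts):
    """Group (probability, outcome) pairs into 10%-wide buckets and compare
    the average stated probability with the observed frequency in each."""
    buckets = defaultdict(list)
    for p, outcome in forecasts:
        buckets[min(int(p * 10), 9)].append((p, outcome))
    report = {}
    for b in sorted(buckets):
        pairs = buckets[b]
        avg_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        report[b] = (round(avg_p, 2), round(freq, 2), len(pairs))
    return report

# Invented track record: (stated probability, did it happen?)
history = [(0.2, 0), (0.25, 1), (0.22, 0), (0.7, 1), (0.75, 1), (0.72, 0)]
for bucket, (avg_p, freq, n) in calibration_report(history).items():
    print(f"{bucket * 10}-{bucket * 10 + 9}%: forecast {avg_p}, observed {freq}, n={n}")
```

A well-calibrated forecaster's ~20% bucket should resolve "yes" roughly 20% of the time; with realistic sample sizes per bucket, though, noisy observed frequencies are expected.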

Being meta-cognitive and reflective of your mistakes likely helps too.

In particular, beyond just calibration, you want to have a strong internal sense of when and how much your forecasts can update based on new information. If you update too much, then this is probably evidence that your beliefs should be closer to the naive prior (if you went from 20% to 80% to 20% to 75% to 40% in one day, you probably didn't really believe it was 20% to start with). If you update too little, then maybe the bar of evidence for you to change your mind is too high.
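One way to sanity-check update sizes is to translate probability moves into implied Bayes factors: moving from 20% to 80% implicitly claims you saw evidence 16 times more likely if the event happens than if it doesn't, so a day of such swings implies implausibly strong, repeatedly contradictory evidence. A small sketch using the sequence from the paragraph above:

```python
def bayes_factor(p_before, p_after):
    """Likelihood ratio of the evidence implied by moving between two probabilities."""
    def odds(p):
        return p / (1 - p)
    return odds(p_after) / odds(p_before)

# The whipsawing day described above: 20% -> 80% -> 20% -> 75% -> 40%
path = [0.20, 0.80, 0.20, 0.75, 0.40]
for before, after in zip(path, path[1:]):
    print(f"{before:.0%} -> {after:.0%}: implied Bayes factor {bayes_factor(before, after):.2f}")
```

If several consecutive updates each imply likelihood ratios above 10 in opposite directions, that's a hint the initial probabilities weren't really believed in the first place.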

What did you do to hone your skill?

Before I started forecasting seriously, I attended several forecasting meetups that my co-organizer of South Bay Effective Altruism ran. Maybe going through the worksheets will be helpful here?

One thing I did that was pretty extreme was that I very closely followed a lot of forecasting-relevant details of covid-19. I didn't learn a lot of theoretical epidemiology, but when I was most "on top of things" (I think around late April to early May), I was basically closely following the disease trajectory, policies, and data ambiguities of ~20 different countries. I also read pretty much every halfway decent paper on covid-19 fatality rates that I could find, and skimmed the rest.

I think this is really extreme and I suspect very few forecasters do it to that level. Even I stopped trying to keep up because it was getting too much (and I started forecasting narrower questions professionally, plus had more of a social life). However, I think it is generally the case that forecasters usually know quite a lot of specific details of the thing they're forecasting: nowhere near that of subject matter experts, but they also have a lot more focus into the forecasting-relevant details, as opposed to grand theories or interesting frontiers of research.

That being said, I think it's plausible a lot of this knowledge is a spandrel and not actually that helpful for making forecasts. This answer is already too long but I might go in more detail about why I believe factual knowledge is a little overrated in other answers.

I also think that by the time I started forecasting seriously, I probably started with a large leg up because (as many of you know) I spend a lot of my time arguing online. I highly doubt it's the most effective way to train forecasting skills (see first bullet point), and I'm dubious it's a good use of time in general. However, if we ignore efficiency, I definitely think the way I argued/communicated was a decent way to train having above-average general epistemology and understanding of the world.

Other forecasters often have backgrounds (whether serious hobbies or professional expertise) in things that require or strongly benefit from having a strong intuitive understanding of probability. Examples include semi-professional poker, working in finance, data science, some academic subfields (eg AI, psychology) and sometimes domain expertise (eg epidemiology).

It is unclear to me how much of these things are selection effects vs training, but I suspect that at this stage, a lot of the differences in forecasting success (>60%?) is explainable by practice and training, or just literally forecasting a lot.

Replies from: agent18
comment by agent18 · 2020-07-03T09:23:02.777Z · EA(p) · GW(p)

What sort of training material did you use to predict and get feedback on (#deliberate practice)

Replies from: Linch
comment by Linch · 2020-07-04T00:52:02.674Z · EA(p) · GW(p)

I mostly just forecasted the covid-19 questions on Metaculus directly. I do think predicting covid early on (before May?) was a near-ideal epistemic environment for this, because of various factors like

a) important
b) in a weird social epistemic state where lots of disparate, individually easy to understand, true information is out there
c) where lots of false information is out there
d) have very fast feedback loops and
e) predicting things/truth-seeking is shockingly uncompetitive.

The feedback cycle (maybe several times a week for some individual questions) is still slower than what the deliberate practice research focused on (specific techniques in arts and sports with sub-minute feedback). But it's much, much better than for other plausibly important things.

I probably also benefited from practice through the South Bay EA meetups[1] and the Open Phil calibration game[2].

[1] If going through all the worksheets is intimidating, I recommend just trying this one (start with "Intro to forecasting" and then do the "Intro to forecasting" worksheet). EDIT 2020/07/04: Fixed worksheet.

[2] https://www.openphilanthropy.org/blog/new-web-app-calibration-training

comment by Peter Wildeford (Peter_Hurford) · 2020-06-30T23:22:09.338Z · EA(p) · GW(p)

How many Twitter followers will you have next week?

Replies from: Davidmanheim
comment by Davidmanheim · 2020-07-01T06:01:13.430Z · EA(p) · GW(p)

I already said I'd stop messing with him now.

Replies from: edoarad
comment by EdoArad (edoarad) · 2020-07-01T10:01:36.926Z · EA(p) · GW(p)

didn't you just violate that?

comment by JP Addison (jpaddison) · 2020-06-30T20:24:47.854Z · EA(p) · GW(p)

Why is pure accuracy overrated?

Replies from: Linch
comment by Linch · 2020-07-03T04:57:29.852Z · EA(p) · GW(p)

There's a bunch of things going on here, but roughly speaking I think there are at least two:

  • When people think of "success from judgmental forecasting", they usually think of a narrow thing that looks like the end product of the most open part of what Metaculus and Good Judgement .* do: coming up with good answers to specific, well-defined and useful questions. But a lot of the value of forecasting comes before and after that.
  • Even in the near-ideal situation of specific, well-defined forecasts, there are often metrics other than pure accuracy (beyond a certain baseline) that matter more.

For the first point, Ozzie Gooen (and I'm sure many other people) has thought a lot more about this. But my sense is that there's a long pipeline of things that makes a forecast actually useful for people:

  • Noticing quantifiable uncertainty. I think a lot of the value of forecasting comes from the pre-question-operationalization stage. This is being able to recognize that something relevant to a practical decision that you (or a client, or the world) rely on is a) uncertain and b) reasonably quantifiable. We often do not recognize our assumptions as assumptions, or the uncertainty is not crisp enough for us to see it as a question we can ask others.
  • Data collection. Not sure where this fits in the pipeline, but often precise forecasts of the future are contextualized in the world of the relevant data that you have.
  • Question operationalization. This is what William Kiely's question [EA(p) · GW(p)] is referring to, which I'll answer more in detail there. But roughly, it's making your quantifiable uncertainty into a precise, well-defined question that can be evaluated and scored later.
  • Actual judgmental forecasting. This is mostly what I did, and what the leaderboards are ranked on, and what people think about when they think about "forecasting."
  • Making those forecasts useful. If this is for yourself, it's usually easier in some sense. If it's for the "world" or the "public," making forecasts useful often entails clear communication and marketing/advertising the forecasts so they can be taken up by relevant decision-makers (even if it's just individuals). If it's for a client, then this involves working closely with the client to make sure the client understands both your forecasts and their relevant implications, as well as possibly "going back to the drawing board" if the questions that you thought were operationalized well aren't actually useful for the client.
  • Evaluation. Usually, if the earlier steps are done well, this is easy because the question is set up to be easy to evaluate. That said, there are tradeoffs here. For example, if people trust you to evaluate forecasts well, you can afford to cut corners and thus expand the range of what is "quantifiable", or start with worse question operationalizations and still deliver value.

For the second point, accuracy often trades off against other things. For example, cost-effectiveness and interpretability may matter more for clients.

If you spend a lot of time drilling down to a few questions, your forecasts are more "expensive" (both literally and figuratively) per question, and you will not be able to provide as much value in total. For interpretability, often just a number is not as helpful for clients, both in the sense of literal clients you directly work with and the world.

One thing that drives this point home to me is the existing "oracles" we have, like the stock market. There's a sense in which the stock market is extremely accurate (for example, options are mostly "correctly" priced for future prices), but for many of our non-financial decisions it takes a LOT of effort to interpret what signals the market sends that are relevant to our future decisions, like how scared we should be of a future recession or large-scale famine.

comment by MichaelStJules · 2020-07-01T22:37:39.443Z · EA(p) · GW(p)

Are you using or do you plan to use your forecasting skills for investing?

Replies from: Linch
comment by Linch · 2020-07-03T06:44:48.133Z · EA(p) · GW(p)

No. At the high level, I don't think I'm that good at forecasting, and the bar of beating the Efficient Market Hypothesis at day-to-day investing must be really hard to clear. Also, finding financial market inefficiencies is very much not neglected, so even if by some miracle I discovered some small inefficiency, I doubt the payoff would be worth it, relative to finding more neglected things to forecast on.

At a lower level, the few times I actually attempted to do forecasting on economic indicators, I did much worse than even I expected. For example, I didn't predict the May jobs rally, and I'm also still pretty confused about why the S&P 500 is so high now.

I think it's possible for EAs to sometimes predictably beat the stock market without intense effort. However, the way to do this isn't by doing the typical forecaster thing of having a strong intuitive sense of probabilities and doing the homework (because that's the bare minimum that I assume everybody in finance has).

Rather, I think the thing to maybe focus on is that EAs and adjacent communities in a very real sense "live in the future." For example, I think covid and the rise of Bitcoin were both moderately predictable way earlier than the stock market caught on (in Bitcoin's case, not that it would definitely take off, but it would have been reasonable to assign >1% chance of it taking off), and were in fact predicted by those in our community. So we're maybe really good in relative terms at having an interdisciplinary understanding of discontinuities/black swans that only touch finance indirectly.

The financial world will be watching for the next pandemic, but maybe the next time we see the glimmers of something real and big on the horizon (localized nuclear war, AI advances, some large technological shift, something else entirely?), we might be able to act fast and make a lot of (in expectation) money. Or at least lose less money by encouraging our friends and EAs with lots of financial assets to de-risk at important moments.

Anyway, the main thing I believe is something like

you can maybe spot a potential EMH violation once a decade, so you gotta be ready to pull the trigger when that happens (but also have enough in reserves to weather being wrong)

This looks very different from normal "investing" since almost all of the time your money just sits in normal financial assets until you need to pull it out to do something weird.

Replies from: MichaelStJules
comment by MichaelStJules · 2020-07-03T17:38:01.884Z · EA(p) · GW(p)

Thanks for the answer. Makes sense!

I'm also still pretty confused about why the S&P 500 is so high now.

Some possible insight: the NASDAQ is doing even better, at its all-time high and wasn't hit as hard initially, and the equal-weight S&P 500 is doing worse than the regular S&P 500 (which weights based on market cap), so this tells me that disproportionately large companies (and tech companies) are still growing pretty fast. Some of these companies may even have benefitted in some ways, like Amazon (online shopping and streaming) and Netflix (streaming).

20% of the S&P 500 is Microsoft, Apple, Amazon, Facebook and Google. Only Google is still down since February at their peaks before the crash, the rest are up 5-15%, other than Amazon (4% of the S&P 500), which is up 40%!
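For what it's worth, the arithmetic behind this is just weight times return: a holding's contribution to a cap-weighted index's move is its index weight multiplied by its own return. A tiny sketch using the Amazon figures from the comment (treat the numbers as rough):

```python
def contribution(weight, component_return):
    """Percentage-point contribution of one holding to a cap-weighted index's return."""
    return weight * component_return

# Amazon: ~4% of the S&P 500 and up ~40% (figures from the comment above)
print(f"Amazon alone adds roughly {contribution(0.04, 0.40):.1%} to the index's return")
```

This is why a handful of mega-cap tech names rallying can hold up the cap-weighted index even while the equal-weight version lags.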

comment by Thomas Kwa (tkwa) · 2020-07-01T04:02:29.086Z · EA(p) · GW(p)

Say an expert (or a prediction market median) is much stronger than you, but you have a strong inside view. What's your thought process for validating it? What's your thought process if you choose to defer?

Replies from: Linch
comment by Linch · 2020-07-07T00:02:48.450Z · EA(p) · GW(p)

I know this isn't the answer you want, but I think the short answer here is that I really don't know, because I don't think this situation is common, so I don't have a good reference class/list of case studies to describe how I'd react in this situation.

If this were to happen often for a specific reference class of questions (where some people just very obviously do better than me for those questions), I imagine I'd quickly get out of the predictions business for those questions, and start predicting on other things instead.

As a forecaster, I'm mostly philosophically opposed to updating strongly (arguably at all) based on other people's predictions. If I updated strongly, I worry that this will cause information cascades.

However, if I was in a different role, eg making action-relevant decisions myself, or "representing forecasters" to decision-makers, I might try to present a broader community view, or highlight specific experts.

Past work on this includes comments on Greg Lewis's excellent EA forum article [EA(p) · GW(p)] on epistemic modesty, Scott Sumner on why the US Fed should use market notions of monetary policy rather than what the chairperson of the Fed believes and notions of public vs. private uses of reason by Immanuel Kant.

I also raised this question on Metaculus.

Replies from: Linch
comment by Linch · 2020-07-07T00:11:08.617Z · EA(p) · GW(p)

Footnote on why this scenario

an expert (or a prediction market median) is much stronger than you, but you have a strong inside view

I think is in practice uncommon:

I think the ideal example in my head for showcasing what you describe goes something like this:

  • An expert/expert consensus/prediction market median that I respect strongly (as predictors) has high probability on X
  • I strongly believe not X. (or equivalently, very low probability on X).
  • I have strong inside views for why I believe not X.
  • X is the answer to a well-operationalized question
  • with a specific definition...
  • that everybody agrees on the definition of.
  • I learned about the expert view very soon after they made it
  • I do not think there is new information that the experts are not updating on
  • This question's answer has a resolution in the near future, in a context that I have both inside-view and outside-view confidence in our relative track records (in either direction).

I basically think that there are very few examples of situations like this, for various reasons:

  • For starters, I don't think I have very strong inside views on a lot of questions.
  • Though sometimes the outside views look something like "this simple model predicts stuff around X, and the outside view is that this class of simple models outpredict both experts and my own more complicated models "
  • Eg, 20 countries have curves that look like this, I don't have enough Bayesian evidence that this particular country's progression will be different.
  • There are also weird outside views on people's speech acts, for example "our country will be different" is on a meta-level something that people from many countries believe, and this conveys almost no information
  • These outsideish views can of course be wrong (for example I was wrong about Japan and plausibly Pakistan).
  • Unfortunately, what is and isn't a good outside view is often easy to self-hack by accident.
  • Note that outside view doesn't necessarily look like expert deference.
  • Usually if there are experts or other aggregators whose opinions as forecasters I strongly respect, I will just defer to them and not think that much myself
  • For example I'm deferring serious thinking around the 2020 election because I basically think 538.com has "got this."
  • I mostly select easier/relatively neglected domains to forecast on, at least with "ease" defined as "the market looks basically efficient"
  • Eg, I stay away from financial and election forecasts
  • A lot of the time, when experts say something that I think is wildly wrong and I dig into it further, it turns out they said it Y days/weeks ago, and I've already heard contradictory evidence that updated my internal picture since (and presumably the experts as well).

A caveat to all this is that I'm probably not as good at deferring to the right experts as many EA Forum users. Perhaps if I was better at it ("it" being identifying/deeply interpreting the right experts), I will feel differently.

comment by Stephen Clare · 2020-06-30T22:38:31.495Z · EA(p) · GW(p)

Lots of EAs seem pretty excited about forecasting, and especially how it might be applied to help assess the value of existential risk projects. Do you think forecasting is underrated or overrated in the EA community?

comment by Stephen Clare · 2020-06-30T22:30:56.921Z · EA(p) · GW(p)

Good forecasts seem kind of like a public good to me: valuable to the world, but costly to produce and the forecaster doesn't benefit much personally. What motivates you to spend time forecasting?

comment by Mark Xu · 2020-07-01T02:54:47.287Z · EA(p) · GW(p)

When I look at most forecasting questions, they seem goodharty in a very strong sense. For example, the goodhart tower for COVID might look something like:

1. How hard should I quarantine?

2. How hard I should quarantine is affected by how "bad" COVID will be.

3. How "bad" COVID will be cashes out into something like "how many people", "when vaccine coming", "what is death rate", etc.

By the time something I care about becomes specific enough to be predictable/forecastable, it seems like most of the thing I actually cared about has been lost.

Do you have a sense of how questions can be better constructed to lose less of the thing that might have inspired the question?

comment by Linch · 2020-07-03T03:12:15.141Z · EA(p) · GW(p)

Meta: Wow, thanks a lot for these questions. They're very insightful and have made me think a lot, please keep the questions (and voting on them) coming! <3

It turns out I had some prior social commitments on Sunday that I forgot about, so I'm going to start answering these questions tonight plus Saturday, and maybe Friday evening too.

But *please* don't feel discouraged from continuing to ask questions; reading these questions has been a load of fun and I might keep answering things for a while.

Replies from: Linch
comment by Linch · 2020-07-03T07:15:41.326Z · EA(p) · GW(p)

Okay, I answered some questions! All the questions are great, keep them coming!

If you have a highly upvoted question that I have yet to answer, then it's because I thought answering it was hard and I need to think more before answering! But I intend to get around to answering as many questions as I can eventually (especially highly upvoted ones!)

comment by Peter Wildeford (Peter_Hurford) · 2020-07-01T23:17:14.774Z · EA(p) · GW(p)

What do you think you do that other forecasters don't do?

comment by Juan Cambeiro · 2020-06-30T20:51:22.801Z · EA(p) · GW(p)

What news sites, data sources, and/or experts have you found to be most helpful for informing your forecasts on COVID-19?

comment by RyanCarey · 2020-06-30T20:13:24.826Z · EA(p) · GW(p)

For Covid-19 spread, what seems to be the relative importance of: 1) climate, 2) behaviour, and 3) seroprevalence?

Replies from: Linch
comment by Linch · 2020-07-03T03:23:52.562Z · EA(p) · GW(p)

Tl;dr: In the short run (a few weeks) seroprevalence, in the medium run (months) behavior. In the long-run likely behavior as well, but other factors like wealth and technological access might start to dominate in hard-to-predict ways.

Thanks for the question! When I made this AMA, I was worried that all the questions would be about covid. Since there’s only one, I might as well devote a bunch of time to it.

There are of course factors other than those three, unless you stretch “behavior” to be maximally inclusive. For example, having large family sizes in a small house means it’s a lot harder to control disease spread within the home (in-house physical distancing is basically impossible if 7 people live in the same room). Density (population-weighted) more generally probably means it’s harder to control disease spread. One large factor is state capacity, which I operationalize roughly as “to the extent your gov’t can be said to be a single entity, how much can it carry out the actions it wants to carry out.” Poverty and sanitation norms more generally likely matter a lot, though I haven’t seen enough data to be sure. Among high-income countries, I also would not be surprised if within-country inequality is a large factor, though I am unsure what the causal mechanism would be.

In the timescale you need to think about for prioritizing hospital resources and other emergency measures, aka “the short run” of say a few weeks, seroprevalence of the virus (how many people are infected and infectious) dominates by a very large margin. There’s so much we still don’t know about how the disease spreads, so I think (~90%) by far the most predictive factors for how many cases there will be in a few weeks are high-level questions like how many people are currently infected and what the current growth rate is, with a few important caveats like noting that confirmed infections definitely do NOT equal active infections.

In the medium run (2+ months), I think (~85%), at least if I was to choose between {current prevalence, behavior, seasonality}, this is almost entirely driven by behavior, both governmental actions (test and trace policies, school closures, shutting large events) and individual responses (compliance, general paranoia, voluntary social distancing, personal mask usage). This is especially clear to me when I compare the trajectories of countries in Latin America to ones in (especially East) Asia. In March and April, there was not a very large seasonality difference, wealth levels were similar, household sizes weren’t that different, and East Asia started with much higher seroprevalence; but through a combination of governmental interventions and individual behaviors, the end of April looked very different for Latin American countries and Asian ones.

Seasonality probably matters. I tried studying how much it matters and got pretty confused. My best guess is ~25% reduction in Rt (with high uncertainty), so maybe it matters a lot in relative terms compared to specific interventions (like I wouldn’t be surprised if it’s a bigger deal than a single intervention like going from 20% to 70% cloth mask coverage, or university closures, or 50% increase in handwashing, or banning public events larger than N people), but I’d be very surprised if it’s bigger than the set of all behaviors. In the short run seasonality will be a lot smaller than the orders of magnitude differences in current prevalence, and in the long run seasonality is significantly smaller than behavioral change.
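As a toy illustration of why the relative sizes shake out this way (made-up numbers, not a fitted model — the baseline Rt, serial interval, and "behavior change" value are all assumptions of mine), here is how a ~25% Rt reduction compounds over a month compared with a behavior change that pushes Rt below 1:

```python
# Toy illustration with made-up numbers (not a fitted model): comparing a
# ~25% seasonal reduction in Rt against a behavior change, over ~6
# transmission generations (roughly a month at a 5-day serial interval).

def cases_after(generations: int, r_t: float, initial: float = 1000.0) -> float:
    """New infections per generation after `generations` steps of growth."""
    return initial * r_t ** generations

baseline = cases_after(6, 1.3)          # Rt = 1.3, no seasonal effect: ~4800
seasonal = cases_after(6, 1.3 * 0.75)   # ~25% seasonal cut -> Rt ~0.98: ~860
behavior = cases_after(6, 0.8)          # strong behavior change, Rt = 0.8: ~260

# Seasonality alone roughly flattens growth here, which is a big relative
# effect -- but behavior change makes case counts shrink outright, and
# short-run differences in starting prevalence can be orders of magnitude.
```

This is only meant to make the compounding concrete; the real uncertainty is in what Rt and the seasonal multiplier actually are.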

One thing to note is that some of the effects of seasonality are likely mediated through behavior or the lack thereof. For example, schools opening in fall are plausibly a large part of disease spread for flu and thus maybe covid; this channel is irrelevant in places that have school closures anyway. Likewise, summer vs winter (in many countries) changes where and how people interact with each other. There are also countervailing factors I don’t know enough about, like maybe hotter weather makes it less palatable to wear masks, or especially hot/cold weather interfaces poorly with existing ventilation setups.

comment by alexrjl · 2020-07-02T13:17:02.284Z · EA(p) · GW(p)

Forecast your win probability in a fight against:

500 horses, each with the mass of an average duck.

1 duck, with the mass of an average horse.

(numbers chosen so mass is roughly equal)

Replies from: Linch
comment by Linch · 2020-07-03T06:47:42.610Z · EA(p) · GW(p)

I actually answered this before, on the meme page:

The gut instinct answer is 100 duck-sized horses, because ducks are much scarier than horses. But one of the things that being a page admin of counterintuitive philosophical problems has taught us is that sometimes we can’t always go with our gut. Here, for example, a horse-sized duck, while very intimidating and scary looking, is probably not structurally sound enough to stand upright for long, and we can probably escape it and let it collapse under its own weight. In contrast, 100 duck-sized horses wouldn’t be much weaker than 100 normal-sized horses (https://supersonicman.wordpress.com/2011/11/13/the-square-cube-law-all-animals-jump-the-same-height/), and they’ll definitely have scary kicks.

I still stand by this. Maybe 85% that I can win against the duck, and 20% against the horses? Depends a lot on the initial starting position, of course.
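The square-cube intuition from the linked article can be sketched in a few lines (illustrative scaling exponents only — real animals are messier than this):

```python
# Square-cube law sketch (illustrative only): muscle force scales with
# cross-sectional area (~L^2) while body mass scales with volume (~L^3),
# so strength-to-weight goes as ~1/L -- smaller animals are relatively
# stronger, and larger ones struggle to support their own weight.

def strength_to_weight(scale: float) -> float:
    """Strength-to-weight ratio of an animal scaled by `scale` in length,
    relative to the original animal (scale = 1.0 -> ratio = 1.0)."""
    force = scale ** 2  # muscle cross-section ~ L^2
    mass = scale ** 3   # body mass ~ L^3
    return force / mass

# A duck-sized horse at ~1/5 linear scale has ~5x the strength-to-weight
# of a full-sized horse, while a horse-sized duck scaled up ~5x has ~1/5 --
# hence the worry that it can't stay structurally sound for long.
```

The same ~1/L relation is why the article claims all animals jump to roughly the same height: jump energy scales with muscle mass (~L^3), and so does the mass being lifted.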

Replies from: Engineer_Jayce314
comment by Engineer_Jayce314 · 2020-07-06T06:34:35.910Z · EA(p) · GW(p)

Speaking of gut instincts, cognitive psychology looks A LOT into how gut instincts take shape and fool us into bad answers or bad lines of reasoning; it calls them cognitive biases. When building models, how do you ensure there is as little of this bias in the model as possible? To add to that, does some of the uncertainty you mentioned in other answers come from these biases, or is it purely statistical?

comment by Dewi Erwan (dewierwan) · 2020-07-01T09:11:01.559Z · EA(p) · GW(p)

How important do you think it is that your or others' forecasts are more well-understood or valued among policy-makers? And if you think they should listen to forecasts more often, how do you think we should go about making them more aware?

comment by Ben Millwood (BenMillwood) · 2020-07-06T13:40:23.957Z · EA(p) · GW(p)

I'm very motivated to make accurate decisions about when it will be safe for me to see the people I love again. I'm in Hong Kong and they're in the UK, though I'm sure readers will prefer generalizable stuff. Do you have any recommendations about how I can accurately make this judgement, and who or what I should follow to keep it up to date?

Replies from: Linch
comment by Linch · 2020-07-15T06:35:50.003Z · EA(p) · GW(p)

For your second question, within our community, Owain Evans seems to have good thoughts on the UK. alexrj (on this forum) and Vidur Kapur are based in the UK and they both do forecasting pretty actively, so they presumably have reasonable thoughts/internal models about different covid-19 related issues for the UK. To know more, you probably want to follow UK-based domain experts too. I don't know who are the best epidemiologists to follow in the UK, though you can probably figure this out pretty quickly from who Owain/Alex/Vidur listen to.

For your first question, I have neither a really good generalizable model nor object-level insights to convey at this moment, sorry. I'll update you if something comes up!

comment by Ben Millwood (BenMillwood) · 2020-07-06T13:31:07.618Z · EA(p) · GW(p)

As someone with some fuzzy reasons to believe in their own judgement, but little explicit evidence of whether I would be good at forecasting or not, what advice do you have for figuring out if I would be good at it, and how much do you think it's worth focusing on?

comment by Peter Wildeford (Peter_Hurford) · 2020-07-05T16:22:26.959Z · EA(p) · GW(p)

How much time do you spend forecasting? (Both explicitly forecasting on Metaculus and maybe implicitly doing things related to forecasting, though the latter I suspect is currently a full-time job for you?)

comment by Mark Xu · 2020-07-01T02:57:26.655Z · EA(p) · GW(p)

How optimistic are you about "amplification" forecasting schemes, where forecasters answer questions like "will a panel of experts say <answer> when considering <question> in <n> years?"

comment by elifland · 2020-07-02T00:53:49.926Z · EA(p) · GW(p)

I've recently gotten into forecasting and have also been a strategy game ~~addict~~ enthusiast at several points in my life. I'm curious about your thoughts on the links between the two:

  • How correlated is skill at forecasting and strategy games?
  • Does playing strategy games make you better at forecasting?
Replies from: Linch
comment by Linch · 2020-07-06T21:33:45.215Z · EA(p) · GW(p)
How correlated is skill at forecasting and strategy games?

I’m not very good at strategy games, so hopefully not much!

The less quippy answer is that strategy games are probably good training grounds for deliberate practice and quick optimization loops, so that likely counts for something (see my answer to Nuno about games [EA · GW]). There are also more prosaic channels, like general cognitive ability and willingness to spend time in front of a computer.

Does playing strategy games make you better at forecasting?

I’m guessing that knowing how to do deliberate practice and getting good at a specific type of optimization is somewhat generalizable, and it's good to do that in something you like (though getting good at things you dislike is also plausibly quite useful). I think specific training usually trumps general training, so I very much doubt playing strategy games is the most efficient way to get better at forecasting, unless maybe you’re trying to forecast results of strategy games.

comment by WilliamKiely · 2020-06-30T23:53:58.282Z · EA(p) · GW(p)

What were your reasons for getting more involved in forecasting?

comment by NunoSempere · 2020-06-30T20:33:22.118Z · EA(p) · GW(p)

Hi Linch! So what's up with the Utilitarian Memes page? Can you tell more about it? Any deep lessons from utilitarian memes?

comment by Ben Millwood (BenMillwood) · 2020-07-06T13:32:38.593Z · EA(p) · GW(p)

Do you think people who are bad at forecasting or related skills (e.g. calibration) should try to become mediocre at it? (Do you think people who are mediocre should try to become decent but not great? etc.)

comment by jamesjuniper (jamestitchener) · 2020-07-02T00:25:49.193Z · EA(p) · GW(p)

What's your process like for tackling a forecast?

Do you think forecasting has a place in improving the decision making in business?

comment by Emanuele_Ascani · 2020-07-01T10:23:56.221Z · EA(p) · GW(p)

How much time do you spend on forecasting, including researching the topics?

comment by Jotto (Justin Otto) · 2020-07-01T15:59:13.016Z · EA(p) · GW(p)

Forecasting has become slightly prestigious in my social circle. At current margins of forecastingness, this seems like a good thing. Do you predict much corruption or waste if the hobby got much more prestigious than it currently is? This question is not precise and comes from a soup of vaguely-related imagery.

comment by Mark Xu · 2020-07-01T02:59:11.406Z · EA(p) · GW(p)

In what meaningful ways can forecasting questions be categorized?

This is really broad, but one possible categorization might be questions that have inside view predictions versus questions that have outside view predictions.

comment by jungofthewon · 2020-06-30T22:43:44.479Z · EA(p) · GW(p)

I will forecast a personal question for you e.g. "How many new friends will I make this year?" What do you want to ask me?

Replies from: Linch
comment by Linch · 2020-07-15T06:35:24.391Z · EA(p) · GW(p)

In 2021, what percentage of my working hours will I spend on things that I would consider to be forecasting or forecasting-adjacent?

Replies from: jungofthewon
comment by jungofthewon · 2020-07-19T11:47:03.400Z · EA(p) · GW(p)

I'll make a distribution. Do you want to make a distribution too and then we can compare?

Replies from: jungofthewon
comment by jungofthewon · 2020-07-24T16:41:13.795Z · EA(p) · GW(p)

https://elicit.ought.org/builder/RT9kxWoF9 My distribution! Good question Linch; it had a fun mix of investigative LinkedIn sleuthing + decomposition + reasoning about Linch + thoughts that I could sense others might disagree with.

comment by MichaelA · 2020-08-10T01:33:32.123Z · EA(p) · GW(p)

Thanks for doing this AMA! In case you still might answer questions, I'm curious as to how much value you think there'd be in: 

  • further research into forecasting techniques
  • improving existing forecasting tools and platforms
  • developing better tools and platforms

E.g., if someone asked you for advice on whether to do work in academia similar to Tetlock's work, or build things like Metaculus or calibration games, or do something else EAs often think is valuable, what might you say? 

(I ask in part because you wrote about judgemental forecasting being "very much a nascent, pre-paradigm field" [EA(p) · GW(p)].)

comment by Engineer_Jayce314 · 2020-07-06T06:22:22.974Z · EA(p) · GW(p)

It often seems to me that machine learning models reveal solutions or insights that, while researchers may have known them already, are closely linked to the problem being modelled. In your experience, does this happen often with ML? If so, does that mean ML is a very good tool to use in Effective Altruism? If not, then where exactly does this tendency come from?

(As an example of this 'tendency', this study used neural networks to find that estrogen exposure and folate deficiency were closely correlated to breast cancer. Source: https://www.sciencedirect.com/science/article/abs/pii/S0378111916000706 )

comment by jungofthewon · 2020-07-02T12:16:05.144Z · EA(p) · GW(p)

Which types of forecasting questions do you like / dislike more?