Misha_Yagudin's Shortform 2020-01-08T14:23:19.002Z


Comment by misha_yagudin on 2020: Forecasting in Review · 2021-01-10T17:16:39.827Z · EA · GW

I'm back to being the #1 forecaster there, after having momentarily lost the position to user @Hinterhunter.

This happened in 2021 :P

Comment by misha_yagudin on EA and the Possible Decline of the US: Very Rough Thoughts · 2021-01-08T15:11:45.922Z · EA · GW

I am aware of two (short-term) questions related to civil war scenarios on Metaculus:

Comment by misha_yagudin on EA and the Possible Decline of the US: Very Rough Thoughts · 2021-01-08T15:08:27.920Z · EA · GW

I think the evidence from the financial markets is a bit weaker.

First, let's imagine predicting that the forecasting platform will stop operating and assume that forecasting is only incentivized by points on this platform. The reasonable prediction is that the platform will continue to operate because, otherwise, points will become meaningless. The same goes for predicting existential risk (because if it occurs, one won't be able to claim a prize).

The US collapse will be devastating for the financial markets (plausible to me unless the USA gradually loses power and importance, in which case interventions are less crucial). The incentives assumption seems plausible to me as well. So the market might not be a reliable predictor of it.

Comment by misha_yagudin on Open and Welcome Thread: January 2021 · 2021-01-08T01:18:11.729Z · EA · GW

It seems Considering Considerateness: Why communities of do-gooders should be exceptionally considerate is not as visible now because CEA removed "Our current thinking" (or something) from their webpage and the essay is not linked e.g. at So I want to highlight it as I liked it a lot a few years ago.

Comment by misha_yagudin on CHOICE - Creating a memorable acronym for EA principles · 2021-01-08T01:12:20.862Z · EA · GW

I weakly downvoted.  I felt meh about coming up with better acronyms because

  • it feels low-fidelity and I would rather have people forget/rephrase EA principles than learn them by heart;
  • guiding principles should not be changed frequently and without great need.

Also, I disliked the proposed acronym because

  • pro-life/pro-choice associations;
  • while choice is a generic word, it is associated with the choice/obligation debate within the community, which makes it not a very good choice.

Comment by misha_yagudin on Prize: Interesting Examples of Evaluations · 2020-12-12T17:45:02.942Z · EA · GW

Huh! The thread I linked to and David Manheim's winning comment cite the same paper :)

Comment by misha_yagudin on Prize: Interesting Examples of Evaluations · 2020-12-12T17:41:10.806Z · EA · GW

Correlating subjective metrics with objective outcomes to provide better intuitions about what an additional point on a scale might mean. The resulting intuitions still suffer from "correlation ≠ causation" and all the curses of self-reported data (which, in my opinion, make such measurements close to useless) but are a step forward.

See this tweet and the whole thread. h/t Guzey

Comment by misha_yagudin on What are some low-information priors that you find practically useful for thinking about the world? · 2020-11-28T13:30:13.888Z · EA · GW

Here is a Wikipedia reference:

The Lindy effect is a theory that the future life expectancy of some non-perishable things like a technology or an idea is proportional to their current age, so that every additional period of survival implies a longer remaining life expectancy. Where the Lindy effect applies, mortality rate decreases with time.
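One common way to make this concrete is a Pareto (power-law) lifetime model. That model is an assumption for illustration, not something the quoted passage specifies. Under it, with shape parameter alpha > 1, the expected remaining life at a given age is age / (alpha - 1), and the hazard rate alpha / age falls as the thing gets older. A minimal Python sketch with hypothetical function names:

```python
def expected_remaining_life(age: float, alpha: float) -> float:
    """Expected further lifetime at `age` under an assumed Pareto model."""
    if alpha <= 1:
        raise ValueError("the mean lifetime is infinite unless alpha > 1")
    return age / (alpha - 1)


def hazard_rate(age: float, alpha: float) -> float:
    """Instantaneous mortality rate at `age` under the same Pareto model."""
    return alpha / age


ALPHA = 2.0  # illustrative value, not an empirical estimate

for age in (10, 20, 40):
    # Remaining life grows in proportion to age; the hazard rate shrinks.
    print(age, expected_remaining_life(age, ALPHA), hazard_rate(age, ALPHA))
```

Doubling the current age doubles the expected remaining life, which is the proportionality the definition describes.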

Comment by misha_yagudin on Please Take the 2020 EA Survey · 2020-11-14T02:00:34.000Z · EA · GW

Well, I am far from an expert, but my understanding is that differential privacy operates on queries as opposed to individual datapoints. But there are tools s.a. randomized response which will provide plausible deniability to individual responses.
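To illustrate the plausible-deniability idea, here is a rough sketch of one classic randomized-response scheme (a fair coin decides whether the respondent answers truthfully or gives a random answer). The function names, the 30% base rate, and the coin probabilities are all assumptions for the example, not anything from the survey itself:

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """With probability 1/2 answer truthfully, otherwise answer by a fair coin."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_true_rate(responses) -> float:
    """Invert P(yes) = 0.5 * rate + 0.25 to recover the population rate."""
    observed = sum(responses) / len(responses)
    return 2 * observed - 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
true_rate = 0.3         # hypothetical share of "yes" in the population
population = [rng.random() < true_rate for _ in range(100_000)]
responses = [randomized_response(answer, rng) for answer in population]
estimate = estimate_true_rate(responses)
print(round(estimate, 3))
```

No single "yes" reveals the respondent's true answer, yet the aggregate rate can still be estimated.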

Comment by misha_yagudin on Thoughts on whether we're living at the most influential time in history · 2020-11-03T07:23:56.064Z · EA · GW

re: "This post has a lot of very small numbers in it. I might have missed a zero or two somewhere."

Hey Buck, consider using scientific notation instead of decimal notation: "0.00000009%" is hard to read and 9e-10 is less prone to typos.
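A quick check of the conversion, as a small Python sketch (the variable names are just for illustration). Note that the percent sign accounts for two of the orders of magnitude:

```python
import math

percent = 0.00000009          # the hard-to-read figure, expressed in percent
fraction = percent / 100      # the same quantity as a plain probability

as_scientific = f"{fraction:.0e}"  # format with zero digits after the point
print(as_scientific)

# Compare with a tolerance, since binary floats are not exact.
assert math.isclose(fraction, 9e-10)
```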

Comment by misha_yagudin on Aligning Recommender Systems as Cause Area · 2020-10-07T19:52:39.257Z · EA · GW

Partnership on AI now has a paper on the topic: What are you optimizing for? Aligning Recommender Systems with Human Values.

Comment by misha_yagudin on Introducing LEEP: Lead Exposure Elimination Project · 2020-10-06T17:40:48.711Z · EA · GW

I am curious about which other countries you identified as promising?

Listing them might be beneficial, as I can imagine that finding an experienced and well-connected candidate for a target location can change the outcome of the cost-effectiveness calculation by increasing tractability. On the other hand, good candidates might not be hard to find, or might be especially likely to be discovered via the EA network.

Comment by misha_yagudin on Singapore’s Technical AI Alignment Research Career Guide · 2020-10-03T12:47:04.352Z · EA · GW

This forecast suggests that extreme reputational risks are non-negligible.

Comment by misha_yagudin on Singapore’s Technical AI Alignment Research Career Guide · 2020-10-03T12:45:00.940Z · EA · GW

Working for SenseTime might be associated with reputational risks, according to FT:

The US blacklisted Megvii and SenseTime in October, along with voice recognition company iFlytek and AI unicorn Yitu, accusing the companies of aiding the “repression, mass arbitrary detention and high-technology surveillance” in the western Chinese region of Xinjiang.

At the same time, someone working for them might provide our community with cultural knowledge relevant to surveillance and robust totalitarianism.

Comment by misha_yagudin on Linch's Shortform · 2020-09-29T10:50:35.580Z · EA · GW

I think it is useful to separately deal with the parts of a disturbing event over which you have an internal or external locus of control. Let's take a look at riots:

  • An external part is them happening in your country. External locus of control means that you need to accept the situation. Consider looking into Stoic literature and exercises (say, negative visualizations) to come to peace with that possibility.
  • An internal part is being exposed to dangers associated with them. Internal locus of control means that you can take action to mitigate the risks. Consider having a plan to temporarily move to a likely peaceful area within your country or to another country.

Comment by misha_yagudin on AMA: Markus Anderljung (PM at GovAI, FHI) · 2020-09-21T22:15:42.287Z · EA · GW

Any insights into what constitutes good research management on the levels of (a) a facilitator helping a lab to succeed, and (b) an individual researcher managing himself (and occasional collaborators)?

Comment by misha_yagudin on The case for building more and better epistemic institutions in the effective altruism community · 2020-09-20T21:06:00.665Z · EA · GW

Roam Research is

> starting a fellowship program where we are giving grants to researchers to explore the space of Tools for Thought, Collective Intelligence, Augmenting The Human Intellect.

They recently raised $9M at a $200M seed valuation and previously received two grants from EA LTFF.

Comment by misha_yagudin on New book: Moral Uncertainty by MacAskill, Ord & Bykvist · 2020-09-20T16:11:30.465Z · EA · GW

Now a thread from Toby Ord:

What is moral uncertainty?

Comment by misha_yagudin on Pablo Stafforini’s Forecasting System · 2020-09-17T12:46:07.561Z · EA · GW

I use Emacs for my personal forecasts because it is convenient: the questions are in the todo-list, I can resolve the question with a few keystrokes, TODO-states make questions look beautiful, a small python script gives me a calibration chart…
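For readers who want something similar, here is a minimal sketch of what a calibration script could look like. It is not the script mentioned above; the function name, the binning scheme, and the sample data are all hypothetical. It groups resolved forecasts into probability bins and compares the mean forecast in each bin with the observed frequency:

```python
from collections import defaultdict


def calibration_table(forecasts, n_bins=10):
    """Bin (probability, outcome) pairs; return per-bin calibration rows.

    Each row is (bin_lower_edge, mean_forecast, observed_frequency, count).
    """
    bins = defaultdict(list)
    for prob, outcome in forecasts:
        # A forecast of exactly 1.0 goes into the top bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((prob, outcome))

    table = []
    for index in sorted(bins):
        rows = bins[index]
        mean_prob = sum(p for p, _ in rows) / len(rows)
        frequency = sum(o for _, o in rows) / len(rows)
        table.append((index / n_bins, mean_prob, frequency, len(rows)))
    return table


# Hypothetical resolved questions: (forecast probability, outcome as 0 or 1).
resolved = [(0.9, 1), (0.85, 1), (0.8, 0), (0.3, 0), (0.25, 1), (0.1, 0)]
table = calibration_table(resolved)
for lower, mean_prob, frequency, count in table:
    print(f"{lower:.1f}+  forecast={mean_prob:.2f}  observed={frequency:.2f}  n={count}")
```

A well-calibrated forecaster would see the forecast and observed columns roughly agree once each bin holds enough questions.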

To be honest, all major forecasting platforms have quite bad UX for small personal things: it always takes too many clicks to make a forecasting question, and so on. I wish they'd popularize personal predictions by having a sort of "very quick capture" like many todo-list apps have [e.g. Amazing Marvin].

I forecast far fewer questions on GJ Open and found Tab Snooze to be an easy way to remind me that I wanted to make updates/take a look at new data.

Comment by misha_yagudin on Judgement as a key need in EA · 2020-09-12T16:57:39.437Z · EA · GW

I like the list of resources you put together; another laconic source of wisdom is What can someone do now to become a stronger fit for future Open Philanthropy generalist RA openings?.

Comment by misha_yagudin on Judgement as a key need in EA · 2020-09-12T16:54:31.419Z · EA · GW

Hey Ben, what makes you think that judgment can be generally improved?


When Owen posted "Good judgement" and its components, I briefly reviewed the literature on  transfer of cognitive skills:

This makes me think that general training (e.g. calibration and to a lesser extent forecasting) might not translate to an overall improvement in judgment. OTOH, surely, getting skills broadly useful for decision making (e.g. spreadsheets, probabilistic reasoning, clear writing) should be good.


A bit of a tangent. Hanson's Reality TV MBAs is an interesting idea. Gaining experience via being a personal assistant to someone else seems to be beneficial², so maybe this could be scaled up by having a reality TV show. Maybe it is a good idea to invite people with good judgment/research taste to stream some of their working sessions and so on? 

[1]: According to Wikipedia: Near transfer occurs when many elements overlap between the conditions in which the learner obtained the knowledge or skill and the new situation. Far transfer occurs when the new situation is very different from that in which learning occurred.

[2]: Moreover, it is one of the 80K's paths that may turn out to be very promising.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:40:17.296Z · EA · GW

Thanks for challenging me :) I wrote my takes after this discussion above.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:37:16.488Z · EA · GW

This example is somewhat flawed (because forecasting only once breaks the assumption I am making) but might challenge your intuitions a bit :)

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:35:46.236Z · EA · GW

Thanks, everyone, for engaging with me. I will summarize my thoughts and will likely not actively comment here anymore:

  • I think the argument holds given the assumptions I made [(a) the probabilities of forecasting on each day are proportional across forecasters (previously we assumed uniformity) + (b) an equal expected number of active days].
    • > I think the intuition to use here is that the sample mean is an unbiased estimator of expectation (this doesn't depend on the frequency/number of samples). One complication here is that we are weighing samples potentially unequally, but if we expect each forecast to be active for an equal number of days this doesn't matter.
  • The second assumption seems to be approximately correct assuming the uniformity but stops working on the edge [around the resolution date], which impacts the average score on the order of .
    • This effect could be noticeable, this is an update.
  • Overall, given the setup, I think that forecasting weekly vs. daily shouldn't differ much for forecasts with a resolution date in 1y.
  • I intended to use this toy model to emphasize that the important difference between the active and semi-active forecasters is the distribution of days they forecast on.
  • This difference, in my opinion, is mostly driven by the 'information gain' (e.g. breaking news, a poll is published, etc).
    • This makes me skeptical about features s.a. automatic decay and so on.
    • This makes me curious about ways to integrate information sources automatically.
    • And less so about notifications that community/followers forecasts have significantly changed. [It is already possible to sort by the magnitude of crowd update since your last forecast on GJO].

On a meta-level, I am

  • Glad I had the discussion and wrote this comment :)
  • Confused about people's intuitions about the linearity of EV.
    • I would encourage people to think more carefully through my argument.
  • This makes me doubt I am correct, but still, I am quite certain. I undervalued the corner cases in the initial reasoning. I think I might undervalue other phenomena, where models don't capture reality well and hence trigger people's intuitions:
    • E.g. randomness of the resolution day might magnify the effect of the second assumption not holding, but it seems like it shouldn't be given that in expectation one resolves the question exactly once.
  • Confused about not being able to communicate my intuitions effectively.
    • I would appreciate any feedback [not necessarily on communication], I have a way to submit it anonymously:

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T11:39:12.138Z · EA · GW

I mildly disagree. I think the intuition to use here is that the sample mean is an unbiased estimator of expectation (this doesn't depend on the frequency/number of samples). One complication here is that we are weighing samples potentially unequally, but if we expect each forecast to be active for an equal number of days this doesn't matter.


ETA: I think the assumption of "forecasts have an equal expected number of active days" breaks around the closing date, which impacts things in the monotonic example (this effect is linear in the expected number of active days and could be quite big in extremes).

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T10:43:52.737Z · EA · GW

re: limit — a nice example. Please notice, that Bob makes a forecast on a (uniformly) random day, so when you take an expectation over the days he is making forecasts on you get the average of scores for all days as if he forecasted every day.

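Let $N$ be the total number of days, $p_i$ be the probability that Bob forecasted on day $i$, and $b_i$ be the Brier score of the forecast made on day $i$:

$$\mathbb{E}[\text{Bob's score}] = \sum_{i=1}^{N} p_i b_i = \frac{1}{N} \sum_{i=1}^{N} b_i.$$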

I am a bit surprised that it worked out here because it breaks the assumption of the equality of the expected number of days a forecast will be active. The lack of this assumption will play out when aggregating over multiple questions [weighted by the number of active days]. Still, I hope this example gives helpful intuitions.
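A tiny numeric version of the Bob example, as a hedged sketch. The daily forecasts are made up, and it assumes a binary question (so a forecast carried forward keeps the same Brier score on every later day) and that Bob, on whichever day he forecasts, gives the same number the every-day forecaster would have given:

```python
def brier(forecast: float, outcome: int) -> float:
    """Squared error of a probability forecast on a binary question."""
    return (forecast - outcome) ** 2


# Hypothetical forecasts an every-day forecaster makes over N = 5 days.
daily_forecasts = [0.5, 0.6, 0.7, 0.85, 0.95]
outcome = 1  # the question resolves "yes"
daily_briers = [brier(f, outcome) for f in daily_forecasts]
n_days = len(daily_briers)

# Forecasting every day: the score is the plain average over days.
every_day_score = sum(daily_briers) / n_days

# Bob forecasts once, on a uniformly random day, so each day has
# probability 1/N and his score is the Brier of that single forecast.
day_probabilities = [1 / n_days] * n_days
bob_expected_score = sum(p * b for p, b in zip(day_probabilities, daily_briers))

print(every_day_score, bob_expected_score)
```

The two numbers coincide, which is the "as if he forecasted every day" point; with non-uniform day probabilities they generally would not.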


Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T23:16:27.744Z · EA · GW

Here is a sketch of a formal argument, which will show that freshness doesn't matter much.

Let's calculate the average Brier score of a forecaster. We can see the contribution of hypothetical forecasts on day toward sum: . If forecasts are sufficiently random, the expected number of days forecasts are active should be equal. Because , the expected average Brier score is equal to the average of Brier scores for all days.

Comment by misha_yagudin on New book: Moral Uncertainty by MacAskill, Ord & Bykvist · 2020-09-11T18:57:40.450Z · EA · GW

Here’s an informal history and summary, in tweet form.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T15:28:22.599Z · EA · GW

Aha, off the top of my head one might go in the directions of (a) TD-learning type of reward; (b) variance reduction for policy evaluation.

After thinking for a few more minutes, it seems that forecasting more often but at random moments shouldn't impact the expected Brier score. But in practice frequent forecasters are evaluated with respect to a different distribution (which favors information gain/"something relevant just happened"), so maybe some sort of importance sampling might help to equalize these two groups?

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T00:34:03.073Z · EA · GW

Also, because the Median score is the median of all Brier scores (and not the Brier score of the median forecast), it might still be good for your Accuracy score to forecast something close to the community's median.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T00:28:09.585Z · EA · GW says:

To determine your accuracy over the lifetime of a question, we calculate a Brier score for every day on which you had an active forecast, then take the average of those daily Brier scores and report it on your profile page. On days before you make your first forecast on a question, you do not receive a Brier score. Once you make a forecast on a question, we carry that forecast forward each day until you update it by submitting a new forecast.
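The carry-forward rule in the quote can be sketched as follows. This is a simplified illustration, not the platform's code: the function name and data are hypothetical, and it uses plain squared error on a binary question, whereas the platform's exact Brier formula may be scaled differently:

```python
def average_daily_brier(updates, n_days, outcome):
    """Average Brier over active days, carrying each forecast forward.

    `updates` maps a 0-indexed day to the probability submitted that day.
    Days before the first forecast receive no score.
    """
    current = None
    daily_scores = []
    for day in range(n_days):
        if day in updates:
            current = updates[day]  # a new forecast replaces the old one
        if current is not None:
            daily_scores.append((current - outcome) ** 2)
    if not daily_scores:
        raise ValueError("no forecast was ever active")
    return sum(daily_scores) / len(daily_scores)


# Hypothetical question open for 10 days that resolves "yes":
# a 60% forecast on day 2, updated to 90% on day 7.
score = average_daily_brier({2: 0.6, 7: 0.9}, n_days=10, outcome=1)
print(score)
```

Here the 60% forecast counts for five days and the 90% forecast for three, so older forecasts weigh on the average for as long as they stay active.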

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T19:16:46.552Z · EA · GW

Sure, I think your views are much more nuanced (sorry, I didn't make it clear). The items I listed are kinda my low-effort impression (in the same mode, I could be tricked into believing the post is written by a mediocre writer when it is actually written by GPT-3). These impressions caused annoyance.


[At this point, I might be overthinking it; forgot how I actually felt.]

Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-09T16:55:37.522Z · EA · GW

Jonas, I am curious how you are dealing with the above implication?

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T16:52:13.239Z · EA · GW

re: your first reaction

I think outreach to some athletes might be easier than you think, as some of them rely on evidence-based advice from their coaches. It is plausible that personal experience will make it easier for them to see value in and relate to GiveWell's approach to giving.

Further, maybe, attitudes towards evidence-based medicine could be a proxy to guide outreach initially?

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T16:50:21.213Z · EA · GW

Hey Ryan, I didn't downvote, but I was somewhat annoyed after the first paragraph. I don't have anything against the second and the third; I actually like them, especially the third one.

Intuitively, I didn't like your first reaction because it feels too stereotypical: "athletes are dumb." Also, your argument presupposes that high intelligence is needed to engage/understand EA ideas, which feels a bit cringy [as it is sort of self-praising].

I think these considerations might be valid, but they don't feel decisive. [I think they would be fine as a part of a larger discussion about pros/cons or how to do outreach to athletes better. Also, lately, I have become much more confused about good conversational norms…]

Comment by misha_yagudin on How have you become more (or less) engaged with EA in the last year? · 2020-09-09T11:07:31.170Z · EA · GW

This comment is currently at 0 karma and 5 votes. I would appreciate it if someone would tell me why they downvoted. I am not questioning the decision; I am looking for a more nuanced perspective on how to have better norms around sensitive topics.

My uncertain guess is that, while the comment's story could improve discussion on conversational norms, being a devil's advocate in a thread about unpleasant and alienating interactions doesn't contribute much to it?

Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-09-07T22:35:25.290Z · EA · GW

The Bourgeois Virtues: Ethics for an Age of Commerce by Deirdre N. McCloskey

McCloskey’s sweeping, charming, and even humorous survey of ethical thought and economic realities—from Plato to Barbara Ehrenreich—overturns every assumption we have about being bourgeois. Can you be virtuous and bourgeois? Do markets improve ethics? Has capitalism made us better as well as richer? Yes, yes, and yes, argues McCloskey, who takes on centuries of capitalism’s critics with her erudition and sheer scope of knowledge. Applying a new tradition of “virtue ethics” to our lives in modern economies, she affirms American capitalism without ignoring its faults and celebrates the bourgeois lives we actually live, without supposing that they must be lives without ethical foundations.

h/t Gavin

Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-09-07T22:34:32.793Z · EA · GW

Any reference on the economic history of moral development? Seems like a potentially important topic for research on moral circle expansion.

Comment by misha_yagudin on Suggest a question for Peter Singer · 2020-09-07T15:48:04.051Z · EA · GW

  • People switched from whale oil to electricity, not because of any ethical considerations. Do you think that without any moral advocacy humanity would eventually abolish meat?
  • What are the positive and negative effects of animal advocacy on the adoption of new food technologies (e.g. Beyond Meat's plant-based patties)?

Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-07T15:18:22.866Z · EA · GW

What grand futures do suffering-focused altruists tend to imagine? Or in other words, what do plausible win conditions look like?

Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-06T14:04:41.166Z · EA · GW

What are some common misconceptions about the suffering-focused world-view within the EA community?

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-03T14:26:51.529Z · EA · GW

Yeah, on reflection, the framing of "working on a paper" is not quite right. So let me be more specific:

  • Prospecting for Gold's impact comes from promoting a certain established way of thinking [≈ econ 101 and ITN] within the EA community and, whether intended or not, also providing local communities with an excellent discussion topic.
  • The expected value of cost-effectiveness of research seems to be dominated by chances of stumbling on considerations for the EA researchers, GiveWell, 80K's career recommendations, etc.
  • The impact of work on moral uncertainty seems to primarily come from field-building. Doing EA-relevant research within a prestigious branch of philosophy increases the odds that more pressing EA questions will be addressed by the next generation of academics.

There are other potential reasons to do research. Say, one might prefer to fully concentrate on mentoring but need to do research for the second-order effects (having prestige for hiring; having scholars' respect for better mentorship; having fresh meta-cognitive observations to empathize with mentees for better advising). I am curious about which impact pathways you prioritize?

I feel the most confused about moral uncertainty because it doesn't resonate with my taste and my knowledge of the subject and of field politics is very limited. I hope my oversimplification doesn't diminish/misrepresent your work too much.

Comment by misha_yagudin on Are there robustly good and disputable leadership practices? · 2020-09-02T03:01:44.469Z · EA · GW

re: Appendix — you might be interested in “The Impact of Consulting Services on Small and Medium Enterprises: Evidence from a Randomized Trial in Mexico”, Bruhn et al 2018. h/t Gwern's August 2020 newsletter.

Comment by misha_yagudin on Forecasting Newsletter: August 2020. · 2020-09-01T21:39:37.200Z · EA · GW

Another piece of evidence that extremizing is useful?

We develop a model that predicts that the time until expiration of a prediction market should negatively affect the accuracy of prices as a forecasting tool in the direction of a ‘favourite/longshot bias’. That is, high‐likelihood events are underpriced, and low‐likelihood events are over‐priced.

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:43:19.029Z · EA · GW

What intellectual progress did you make in the 2010s? (See SSC and Gwern's essays on the question.)

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:37:18.193Z · EA · GW

Oh, even better! In your What Does (and Doesn’t) AI Mean for Effective Altruism? slide four speaks about different timelines: immediate (~5 years), this generation (~15), next-generation (~40), distant (~100). Which timelines are you optimizing RSP for?

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:32:55.846Z · EA · GW

A related question: which fraction of your and RSP's impact do you expect to come from direct work and which from community/field-building?


  • When working on a paper, do you think the value comes from field-building or from a small personal chance of, say, coming up with a crucial consideration?
  • Will most of RSP's value come from direct work done by scholars or from scholars [and the program] indirectly influencing other people/organizations? [I would count consulting policy-makers as direct work.]

Comment by misha_yagudin on Some thoughts on the EA Munich // Robin Hanson incident · 2020-08-30T21:49:27.725Z · EA · GW

We probably wouldn't know and hence the issue wouldn't get discussed.

It is plausible that if someone made it widely known that they decided not to invite a speaker based on similar considerations, it could have been discussed as well, as I expect "X is deplatformed by Y" to provoke a similar response to "X is canceled by Y" from people caring about the incident.

I am not sure it is a case of The Copenhagen Interpretation of Ethics as I doubt people who are arguing against would think that the decision is an improvement upon the status quo.

Comment by misha_yagudin on A tool to estimate COVID risk from common activities · 2020-08-30T21:33:33.421Z · EA · GW

I believe it is "borderline reckless" because 1000 μCoV per event = 0.1% chance of COVID per event and their default risk tolerance is 1% per year [another available option is 0.1% per year]. So you can do such events about once per month [or once per year] before exhausting your tolerance.
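The arithmetic behind this can be checked with a short sketch. The helper name is hypothetical, and it simply adds risks across events, which is a reasonable approximation only while the probabilities are small:

```python
import math

MICROCOVIDS_PER_COVID = 1_000_000  # 1 μCoV is a one-in-a-million chance


def events_per_year(event_risk_microcovids, annual_budget_fraction):
    """How many events of a given risk fit in an annual risk budget."""
    budget_microcovids = annual_budget_fraction * MICROCOVIDS_PER_COVID
    return budget_microcovids / event_risk_microcovids


default_budget = events_per_year(1000, 0.01)    # 1% per year tolerance
strict_budget = events_per_year(1000, 0.001)    # 0.1% per year tolerance
print(default_budget, strict_budget)
```

Ten events per year is roughly one per month, and the stricter budget allows a single such event per year.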

Another question is whether 1% or .1% risk tolerance is reasonable. It might be for some age/health cohorts; or for someone really worried/confused about long-term effects [s.a. chronic fatigue from SARS or some unknown-unknowns].

On the other hand, while being cautious, one shouldn't neglect gradual negative effects on mental health and so on.

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-08-28T19:34:44.042Z · EA · GW

Hey Owen, you have a background in mathematics. What is your favorite theorem/proof/object/definition/algorithm/conjecture/..?