## Posts

Misha_Yagudin's Shortform 2020-01-08T14:23:19.002Z · score: 3 (1 votes)

Comment by misha_yagudin on Aligning Recommender Systems as Cause Area · 2020-10-07T19:52:39.257Z · score: 1 (1 votes) · EA · GW

Partnership on AI now has a paper on the topic: What are you optimizing for? Aligning Recommender Systems with Human Values.

Comment by misha_yagudin on Introducing LEEP: Lead Exposure Elimination Project · 2020-10-06T17:40:48.711Z · score: 12 (7 votes) · EA · GW

I am curious which other countries you identified as promising.

Listing them might be beneficial, as I can imagine that finding an experienced and well-connected candidate for a target location can change the outcome of a cost-effectiveness calculation by increasing tractability. On the other hand, good candidates might not be hard to find, or might be especially likely to be discovered via the EA network.

Comment by misha_yagudin on Singapore’s Technical AI Alignment Research Career Guide · 2020-10-03T12:47:04.352Z · score: 4 (3 votes) · EA · GW

This forecast suggests that extreme reputational risks are non-negligible.

Comment by misha_yagudin on Singapore’s Technical AI Alignment Research Career Guide · 2020-10-03T12:45:00.940Z · score: 6 (4 votes) · EA · GW

Working for SenseTime might be associated with reputational risks, according to FT:

The US blacklisted Megvii and SenseTime in October, along with voice recognition company iFlytek and AI unicorn Yitu, accusing the companies of aiding the “repression, mass arbitrary detention and high-technology surveillance” in the western Chinese region of Xinjiang.

At the same time, someone working for them might provide our community with cultural knowledge relevant to surveillance and robust totalitarianism.

Comment by misha_yagudin on Linch's Shortform · 2020-09-29T10:50:35.580Z · score: 2 (2 votes) · EA · GW

I think it is useful to separately deal with the parts of a disturbing event over which you have an internal or external locus of control. Let's take a look at riots:

• An external part is them happening in your country. External locus of control means that you need to accept the situation. Consider looking into Stoic literature and exercises (say, negative visualizations) to come to peace with that possibility.
• An internal part is being exposed to the dangers associated with them. Internal locus of control means that you can take action to mitigate the risks. Consider having a plan to temporarily move to a likely peaceful area within your country or to another country.
Comment by misha_yagudin on AMA: Markus Anderljung (PM at GovAI, FHI) · 2020-09-21T22:15:42.287Z · score: 6 (5 votes) · EA · GW

Any insights into what constitutes good research management on the levels of (a) a facilitator helping a lab to succeed, and (b) an individual researcher managing themselves (and occasional collaborators)?

Comment by misha_yagudin on The case for building more and better epistemic institutions in the effective altruism community · 2020-09-20T21:06:00.665Z · score: 18 (5 votes) · EA · GW

Roam Research is

> starting a fellowship program where we are giving grants to researchers to explore the space of Tools for Thought, Collective Intelligence, Augmenting The Human Intellect.

They recently raised $9M at a $200M seed valuation and previously received two grants from EA LTFF.

Comment by misha_yagudin on New book: Moral Uncertainty by MacAskill, Ord & Bykvist · 2020-09-20T16:11:30.465Z · score: 2 (2 votes) · EA · GW

Now a thread from Toby Ord:

What is moral uncertainty?

Comment by misha_yagudin on Pablo Stafforini’s Forecasting System · 2020-09-17T12:46:07.561Z · score: 2 (2 votes) · EA · GW

I use Emacs for my personal forecasts because it is convenient: the questions are in my todo list, I can resolve a question with a few keystrokes, TODO states make questions look beautiful, a small Python script gives me a calibration chart…
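For what it's worth, the calibration-chart part of such a script can be tiny. A minimal sketch (the 10%-wide binning and the sample data here are illustrative assumptions, not the actual Emacs setup):

```python
from collections import defaultdict

def calibration_table(forecasts):
    """Bucket (probability, outcome) pairs into 10%-wide bins and compare
    the mean forecast in each bin with the observed frequency."""
    bins = defaultdict(list)
    for p, outcome in forecasts:
        bins[round(p, 1)].append((p, outcome))
    rows = []
    for b in sorted(bins):
        pairs = bins[b]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(o for _, o in pairs) / len(pairs)
        rows.append((b, mean_p, freq, len(pairs)))
    return rows

# A calibrated forecaster's 70% predictions resolve "yes" about 70% of the time.
data = [(0.7, 1)] * 7 + [(0.7, 0)] * 3 + [(0.3, 1)] * 3 + [(0.3, 0)] * 7
for b, mean_p, freq, n in calibration_table(data):
    print(f"p≈{b:.1f}: forecast {mean_p:.2f} vs observed {freq:.2f} (n={n})")
```

Plotting the `mean_p` column against `freq` gives the usual calibration chart; points near the diagonal mean good calibration.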

To be honest, all major forecasting platforms have quite bad UX for small personal things; it always takes too many clicks to make a forecasting question, and so on. I wish they'd popularize personal predictions by having a sort of "very quick capture" like many todo-list apps have [e.g. Amazing Marvin].

I forecast much fewer questions on GJ Open and found Tab Snooze to be an easy way to remind me that I wanted to make updates/take a look at new data.

Comment by misha_yagudin on Judgement as a key need in EA · 2020-09-12T16:57:39.437Z · score: 1 (1 votes) · EA · GW

I like the list of resources you put together; another laconic source of wisdom is What can someone do now to become a stronger fit for future Open Philanthropy generalist RA openings?

Comment by misha_yagudin on Judgement as a key need in EA · 2020-09-12T16:54:31.419Z · score: 11 (5 votes) · EA · GW

Hey Ben, what makes you think that judgment can be generally improved?

When Owen posted "Good judgement" and its components, I briefly reviewed the literature on transfer of cognitive skills [1]:

This makes me think that general training (e.g. calibration and to a lesser extent forecasting) might not translate to an overall improvement in judgment. OTOH, surely, getting skills broadly useful for decision making (e.g. spreadsheets, probabilistic reasoning, clear writing) should be good.

A bit of a tangent: Hanson's Reality TV MBAs is an interesting idea. Gaining experience by being a personal assistant to someone else seems to be beneficial [2], so maybe this could be scaled up by having a reality TV show. Maybe it is a good idea to invite people with good judgment/research taste to stream some of their working sessions, and so on?

[1]: According to Wikipedia: Near transfer occurs when many elements overlap between the conditions in which the learner obtained the knowledge or skill and the new situation. Far transfer occurs when the new situation is very different from that in which learning occurred.

[2]: Moreover, it is one of 80K's paths that may turn out to be very promising.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:40:17.296Z · score: 1 (1 votes) · EA · GW

Thanks for challenging me :) I wrote my takes after this discussion above.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:37:16.488Z · score: 1 (1 votes) · EA · GW

This example is somewhat flawed (because forecasting only once breaks the assumption I am making) but might challenge your intuitions a bit :)

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T15:35:46.236Z · score: 2 (2 votes) · EA · GW

Thanks, everyone, for engaging with me. I will summarize my thoughts and would likely not actively comment here anymore:

• I think the argument holds given the assumptions I made: [(a) the probabilities of forecasting on each day are proportional across forecasters (previously we assumed uniformity) + (b) forecasts have an equal expected number of active days].
• > I think the intuition to use here is that the sample mean is an unbiased estimator of the expectation (this doesn't depend on the frequency/number of samples). One complication here is that we are weighing samples potentially unequally, but if we expect each forecast to be active for an equal number of days this doesn't matter.
• The second assumption seems to be approximately correct assuming uniformity, but it stops working at the edge [around the resolution date], which impacts the average score to a degree linear in the expected number of active days.
• This effect could be noticeable, this is an update.
• Overall, given the setup, I think that forecasting weekly vs. daily shouldn't differ much for forecasts with a resolution date in 1y.
• I intended to use this toy model to emphasize that the important difference between the active and semi-active forecasters is the distribution of days they forecast on.
• This difference, in my opinion, is mostly driven by 'information gain' (e.g. breaking news, a poll is published, etc.).
• This makes me skeptical about features such as automatic decay and so on.
• This makes me curious about ways to integrate information sources automatically.
• And less so about notifications that community/followers forecasts have significantly changed. [It is already possible to sort by the magnitude of crowd update since your last forecast on GJO].

On a meta-level:

• I would encourage people to think more carefully through my argument.
• This makes me doubt that I am correct, but still, I am quite certain. I undervalued the corner cases in my initial reasoning, and I think I might undervalue other phenomena where models don't capture reality well and hence trigger people's intuitions:
• E.g. the randomness of the resolution day might magnify the effect of the second assumption not holding, but it seems like it shouldn't, given that in expectation one resolves the question exactly once.
• I am confused about not being able to communicate my intuitions effectively.
• I would appreciate any feedback [not necessarily on communication]; I have a way to submit it anonymously: https://admonymous.co/misha
Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T11:39:12.138Z · score: 1 (1 votes) · EA · GW

I mildly disagree. I think the intuition to use here is that the sample mean is an unbiased estimator of the expectation (this doesn't depend on the frequency/number of samples). One complication here is that we are weighing samples potentially unequally, but if we expect each forecast to be active for an equal number of days this doesn't matter.

ETA: I think the assumption that "forecasts have an equal expected number of active days" breaks around the closing date, which impacts things in the monotonic example (this effect is linear in the expected number of active days and could be quite big in extremes).

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-12T10:43:52.737Z · score: 2 (2 votes) · EA · GW

re: limit — a nice example. Please notice that Bob makes a forecast on a (uniformly) random day, so when you take the expectation over the days he forecasts on, you get the average of the scores for all days, as if he had forecasted every day.

Let $n$ be the total number of days, $p_t$ the probability that Bob forecasted on day $t$, and $b_t$ the Brier score of the forecast made on day $t$. Since the day is uniformly random, $p_t = 1/n$, so Bob's expected score is $\sum_t p_t b_t = \frac{1}{n} \sum_t b_t$, i.e. the average of the scores for all days.

I am a bit surprised that it worked out here, because it breaks the assumption of equality of the expected number of days a forecast will be active. The lack of this assumption will play out when aggregating over multiple questions [weighted by the number of active days]. Still, I hope this example gives helpful intuitions.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T23:16:27.744Z · score: 1 (1 votes) · EA · GW

Here is a sketch of a formal argument, which will show that freshness doesn't matter much.

Let's calculate the average Brier score of a forecaster. The contribution of a hypothetical forecast made on day $t$ toward the sum is $\mathbb{E}[d_t] \cdot b_t$, where $d_t$ is the number of days the forecast stays active and $b_t$ is its Brier score. If forecast days are sufficiently random, the expected number of days each forecast is active should be equal. Because $\mathbb{E}[d_t]$ is the same for every $t$, the expected average Brier score is equal to the average of the Brier scores for all days.
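The sketch above can be checked numerically. In this toy simulation (the per-day scores, the 1-in-7 forecasting probability, and all names are illustrative assumptions), a forecaster who updates every day and one who updates on random days get nearly the same average daily Brier score:

```python
import random

def average_daily_brier(daily_scores, forecast_days):
    """Average Brier score over all days with an active forecast,
    carrying each forecast forward until it is replaced (GJO-style)."""
    start = min(forecast_days)
    active = [daily_scores[max(d for d in forecast_days if d <= t)]
              for t in range(start, len(daily_scores))]
    return sum(active) / len(active)

random.seed(0)
n = 365
# Hypothetical score b_t of a forecast made on day t.
daily = [random.random() * 0.5 for _ in range(n)]

# Alice forecasts every day; Bob forecasts each day with probability 1/7.
alice = average_daily_brier(daily, list(range(n)))
bob_runs = []
for _ in range(2000):
    days = [t for t in range(n) if random.random() < 1 / 7] or [0]
    bob_runs.append(average_daily_brier(daily, days))
bob = sum(bob_runs) / len(bob_runs)
print(alice, bob)  # nearly equal, apart from edge effects near resolution
```

The small residual gap comes from the edge effect discussed above: the last forecast before resolution is active for a truncated number of days.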

Comment by misha_yagudin on New book: Moral Uncertainty by MacAskill, Ord & Bykvist · 2020-09-11T18:57:40.450Z · score: 9 (5 votes) · EA · GW

Here’s an informal history and summary, in tweet form.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T15:28:22.599Z · score: 1 (1 votes) · EA · GW

Aha, off the top of my head one might go in the direction of (a) a TD-learning type of reward; (b) variance reduction for policy evaluation.

After thinking for a few more minutes, it seems that forecasting more often but at random moments shouldn't impact the expected Brier score. But in practice, frequent forecasters are evaluated with respect to a different distribution (one that favors information gain/"something relevant just happened") — so maybe some sort of importance sampling might help to equalize these two groups?

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T00:34:03.073Z · score: 2 (2 votes) · EA · GW

Also, because the Median score is the median of all Brier scores (and not the Brier score of the median forecast), it might still be good for your Accuracy score to forecast something close to the community's median.

Comment by misha_yagudin on Challenges in evaluating forecaster performance · 2020-09-11T00:28:09.585Z · score: 2 (2 votes) · EA · GW

https://www.gjopen.com/faq says:

To determine your accuracy over the lifetime of a question, we calculate a Brier score for every day on which you had an active forecast, then take the average of those daily Brier scores and report it on your profile page. On days before you make your first forecast on a question, you do not receive a Brier score. Once you make a forecast on a question, we carry that forecast forward each day until you update it by submitting a new forecast.
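The FAQ's scheme can be sketched as follows (a binary question with the two-outcome Brier score; the day granularity and function names here are assumptions for illustration):

```python
def gjo_accuracy(forecasts, horizon, outcome):
    """Average daily Brier score, GJO-style, for a binary question.
    forecasts: list of (day, probability) pairs, sorted by day.
    Days before the first forecast are skipped; each forecast is
    carried forward until replaced by a newer one."""
    def brier(p):  # two-outcome Brier: (p - o)^2 + ((1 - p) - (1 - o))^2
        return (p - outcome) ** 2 + ((1 - p) - (1 - outcome)) ** 2

    scores = []
    i = 0
    for day in range(forecasts[0][0], horizon):
        while i + 1 < len(forecasts) and forecasts[i + 1][0] <= day:
            i += 1  # advance to the most recent forecast on or before `day`
        scores.append(brier(forecasts[i][1]))
    return sum(scores) / len(scores)

# Forecast 0.6 on day 0, updated to 0.9 on day 5; the question resolves
# "yes" (outcome = 1) after a 10-day lifetime.
print(round(gjo_accuracy([(0, 0.6), (5, 0.9)], horizon=10, outcome=1), 3))
```

Note that each forecast's score is weighted by the number of days it stays active, which is what drives the discussion above about forecasting frequency.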

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T19:16:46.552Z · score: 2 (2 votes) · EA · GW

Sure, I think your views are much more nuanced (sorry, I didn't make that clear). The items I listed are kind of my low-effort impression; in the same mode, I could be tricked into believing a post was written by a mediocre writer when it was actually written by GPT-3. These impressions caused annoyance.

[At this point, I might be overthinking it; I forgot how I actually felt.]

Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-09T16:55:37.522Z · score: 1 (3 votes) · EA · GW

Jonas, I am curious how you are dealing with the above implication?

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T16:52:13.239Z · score: 5 (3 votes) · EA · GW

I think outreach to some athletes might be easier than you think, as some of them rely on evidence-based advice from their coaches. It is plausible that personal experience will make it easier for them to see value in, and relate to, GiveWell's approach to giving.

Further, maybe, attitudes towards evidence-based medicine could be a proxy to guide outreach initially?

Comment by misha_yagudin on Are there any other pro athlete aspiring EAs? · 2020-09-09T16:50:21.213Z · score: 22 (21 votes) · EA · GW

Hey Ryan, I didn't downvote, but I was somewhat annoyed after the first paragraph. I don't have anything against the second and the third; I actually like them, especially the third one.

Intuitively, I didn't like your first reaction because it feels too stereotypical: "athletes are dumb." Also, your argument presupposes that high intelligence is needed to engage with/understand EA ideas, which feels a bit cringy [as it is sort of self-praising].

I think these considerations might be valid, but they don't feel decisive. [I think they would be fine as part of a larger discussion about pros/cons or how to do outreach to athletes better. Also, lately, I have become much more confused about good conversational norms…]

Comment by misha_yagudin on How have you become more (or less) engaged with EA in the last year? · 2020-09-09T11:07:31.170Z · score: 19 (10 votes) · EA · GW

This comment is currently at 0 karma and 5 votes. I would appreciate it if someone would tell me why they downvoted. I am not questioning the decision; I am looking for a more nuanced perspective on how to have better norms around sensitive topics.

My uncertain guess is that, while the comment's story could improve discussion on conversational norms, being a devil's advocate in a thread about unpleasant and alienating interactions doesn't contribute much to it?

Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-09-07T22:35:25.290Z · score: 1 (1 votes) · EA · GW

The Bourgeois Virtues: Ethics for an Age of Commerce by Deirdre N. McCloskey

McCloskey’s sweeping, charming, and even humorous survey of ethical thought and economic realities—from Plato to Barbara Ehrenreich—overturns every assumption we have about being bourgeois. Can you be virtuous and bourgeois? Do markets improve ethics? Has capitalism made us better as well as richer? Yes, yes, and yes, argues McCloskey, who takes on centuries of capitalism’s critics with her erudition and sheer scope of knowledge. Applying a new tradition of “virtue ethics” to our lives in modern economies, she affirms American capitalism without ignoring its faults and celebrates the bourgeois lives we actually live, without supposing that they must be lives without ethical foundations.

h/t Gavin

Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-09-07T22:34:32.793Z · score: 1 (1 votes) · EA · GW

Any reference on the economic history of moral development? Seems like a potentially important topic for research on moral circle expansion.

Comment by misha_yagudin on Suggest a question for Peter Singer · 2020-09-07T15:48:04.051Z · score: 3 (3 votes) · EA · GW
• People switched from whale oil to electricity not because of any ethical considerations. Do you think that without any moral advocacy humanity would eventually abolish meat?
• What are the positive and negative effects of animal advocacy on the adoption of new food technologies (e.g. Beyond Meat's plant-based patties)?
Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-07T15:18:22.866Z · score: 11 (4 votes) · EA · GW

What grand futures do suffering-focused altruists tend to imagine? Or, in other words, what do plausible win conditions look like?

Comment by misha_yagudin on AMA: Tobias Baumann, Center for Reducing Suffering · 2020-09-06T14:04:41.166Z · score: 26 (11 votes) · EA · GW

What are some common misconceptions about the suffering-focused world-view within the EA community?

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-03T14:26:51.529Z · score: 1 (1 votes) · EA · GW

Yeah, on reflection, the framing of "working on a paper" is not quite right. So let me be more specific:

• Prospecting for Gold's impact comes from promoting a certain established way of thinking [≈ econ 101 and ITN] within the EA community and, whether intended or not, also from providing local communities with an excellent discussion topic.
• The expected cost-effectiveness of research seems to be dominated by the chance of stumbling on considerations relevant to EA researchers, GiveWell, 80K's career recommendations, etc.
• The impact of work on moral uncertainty seems to come primarily from field-building. Doing EA-relevant research within a prestigious branch of philosophy increases the odds that more pressing EA questions will be addressed by the next generation of academics.

There are other potential reasons to do research: say, one might prefer to fully concentrate on mentoring but need to do research for the second-order effects (having prestige for hiring; having scholars' respect for better mentorship; having fresh meta-cognitive observations to empathize with mentees for better advising). I am curious which impact pathways you prioritize.

I feel the most confused about moral uncertainty because it doesn't resonate with my taste, and my knowledge of the subject and of field politics is very limited. I hope my oversimplification doesn't diminish/misrepresent your work too much.

Comment by misha_yagudin on Are there robustly good and disputable leadership practices? · 2020-09-02T03:01:44.469Z · score: 3 (2 votes) · EA · GW

re: Appendix — you might be interested in “The Impact of Consulting Services on Small and Medium Enterprises: Evidence from a Randomized Trial in Mexico”, Bruhn et al 2018. h/t Gwern's August 2020 newsletter.

Comment by misha_yagudin on Forecasting Newsletter: August 2020. · 2020-09-01T21:39:37.200Z · score: 3 (3 votes) · EA · GW

More evidence that extremizing is useful?

We develop a model that predicts that the time until expiration of a prediction market should negatively affect the accuracy of prices as a forecasting tool in the direction of a ‘favourite/longshot bias’. That is, high‐likelihood events are underpriced, and low‐likelihood events are over‐priced.

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:43:19.029Z · score: 8 (4 votes) · EA · GW

What intellectual progress did you make in the 2010s? (See SSC and Gwern's essays on the question.)

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:37:18.193Z · score: 6 (4 votes) · EA · GW

Oh, even better! In your What Does (and Doesn’t) AI Mean for Effective Altruism?, slide four speaks about different timelines: immediate (~5 years), this generation (~15), next generation (~40), distant (~100). Which timelines are you optimizing RSP for?

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-09-01T20:32:55.846Z · score: 3 (2 votes) · EA · GW

A related question: which fraction of your and RSP's impact do you expect to come from direct work and which from community/field-building?

E.g.

• When working on a paper, do you think the value comes from field-building or from a small personal chance of, say, coming up with a crucial consideration?
• Will most of RSP's value come from direct work done by scholars, or from scholars [and the program] indirectly influencing other people/organizations? [I would count consulting policy-makers as direct work.]
Comment by misha_yagudin on Some thoughts on the EA Munich // Robin Hanson incident · 2020-08-30T21:49:27.725Z · score: 3 (2 votes) · EA · GW

We probably wouldn't know, and hence the issue wouldn't get discussed.

It is plausible that if someone had made it widely known that they decided not to invite a speaker based on similar considerations, it could have been discussed as well, as I expect "X is deplatformed by Y" to provoke a similar response to "X is canceled by Y" among people who care about the incident.

I am not sure it is a case of The Copenhagen Interpretation of Ethics, as I doubt the people who are arguing against it would think that the decision is an improvement upon the status quo.

Comment by misha_yagudin on microCOVID.org: A tool to estimate COVID risk from common activities · 2020-08-30T21:33:33.421Z · score: 4 (3 votes) · EA · GW

I believe it is "borderline reckless" because 1,000 μCoV per event = 0.1% COVID risk per event, and their default risk tolerance is 1% per year [another available option is 0.1% per year]. So you can do such events about once per month [or once per year] before exhausting your tolerance.
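The budget arithmetic, assuming microCOVIDs simply add up (a good approximation at these small risk levels; the dictionary and names are illustrative):

```python
# Risk budgets in microCOVIDs (1,000,000 μCoV = 100% chance of infection).
annual_budgets = {"standard (1%/year)": 10_000, "cautious (0.1%/year)": 1_000}
event_cost = 1_000  # a "borderline reckless" event = 1,000 μCoV = 0.1%

for name, budget in annual_budgets.items():
    print(f"{name}: ~{budget // event_cost} such events per year")
```

So the standard budget allows roughly ten such events a year, i.e. about one per month, and the cautious budget allows about one per year.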

Another question is whether a 1% or 0.1% risk tolerance is reasonable. It might be for some age/health cohorts, or for someone really worried/confused about long-term effects [such as chronic fatigue from SARS or some unknown unknowns].

On the other hand, while being cautious, one shouldn't neglect gradual negative effects on mental health and so on.

Comment by misha_yagudin on AMA: Owen Cotton-Barratt, RSP Director · 2020-08-28T19:34:44.042Z · score: 12 (6 votes) · EA · GW

Hey Owen, you have a background in mathematics. What is your favorite theorem/proof/object/definition/algorithm/conjecture/..?

Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-08-19T13:58:35.040Z · score: 11 (4 votes) · EA · GW

Seems like EA Munich canceled a meetup with Hanson; here is their reasoning.

Comment by misha_yagudin on 2020 Top Charity Ideas - Charity Entrepreneurship · 2020-08-19T01:12:27.624Z · score: 8 (5 votes) · EA · GW

Could Guided self-help be turned into Task Y?

Giving out CBT books or apps and then checking in with people a few times for accountability and caring seems like a quite effective (based on your analysis) and potentially very satisfying (based on personal experience) endeavor.

It seems to score highly on the first and the third requirements. It might not work well with the second, as the EA network is rather homophilic, but I am not sure; also, the reasons behind The Iron Law Of Evaluation further my skepticism.

• Task Y is something that can be performed usefully by people who are not currently able to choose their career path entirely based on EA concerns.
• Task Y is clearly effective and doesn't become much less effective the more people who are doing it.
• The positive effects of Task Y are obvious to the person doing the task.
Comment by misha_yagudin on vaidehi_agarwalla's Shortform · 2020-08-07T04:30:51.476Z · score: 1 (1 votes) · EA · GW

If longtermism is one of the latest stages of moral circle development, then your anecdotal data suffers from major selection effects.

> Anecdotally seems true from a number of EAs I've spoken to who've updated to longtermism over time.

Comment by misha_yagudin on Recommendations for increasing empathy? · 2020-08-02T09:17:14.686Z · score: 8 (4 votes) · EA · GW

You might want to read some essays from Effective Altruism Handbook: Motivation Series. I especially like 500 Million, But Not A Single One More, it is short and powerful.

Comment by misha_yagudin on The academic contribution to AI safety seems large · 2020-08-01T08:49:53.584Z · score: 1 (1 votes) · EA · GW

Thanks; fixed.

Comment by misha_yagudin on The academic contribution to AI safety seems large · 2020-07-31T11:22:35.233Z · score: 5 (3 votes) · EA · GW

On the other hand, in their 2018 review MIRI wrote about new research directions, one of which feels ML-adjacent. But from a few paragraphs, it doesn't seem that the direction is relevant to prosaic AI alignment.

Seeking entirely new low-level foundations for optimization, designed for transparency and alignability from the get-go, as an alternative to gradient-descent-style machine learning foundations.

Comment by misha_yagudin on The academic contribution to AI safety seems large · 2020-07-31T11:17:33.577Z · score: 14 (4 votes) · EA · GW

Indeed, Why I am not currently working on the AAMLS agenda is a year-later write-up by the lead researcher. Moreover, they write:

That is, though I was officially lead on AAMLS, I mostly did other things in that time period.

Comment by misha_yagudin on AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher · 2020-07-19T22:49:16.322Z · score: 9 (4 votes) · EA · GW

Oh, I meant pessimistic. A reason for a weak update might be similar to the Gell-Mann amnesia effect. After putting effort into the classical arguments, you noticed some important flaws. The fact that they have not been articulated before suggests that collective EA epistemology is weaker than expected. Because of that, one might become less certain about the quality of arguments in other EA domains.

So, in short, the Gell-Mann Amnesia effect is when experts forget how badly their own subject is treated in media and believe that subjects they don't know much about are treated more competently by the same media.
Comment by misha_yagudin on Misha_Yagudin's Shortform · 2020-07-18T15:14:46.894Z · score: 4 (4 votes) · EA · GW

Estimates from The Precipice.

| Risk | Chance |
|------|--------|
| Stellar explosion | 1 in 1,000,000,000 |
| Asteroid or comet impact | 1 in 1,000,000 |
| Supervolcanic eruption | 1 in 10,000 |
| “Naturally” arising pandemics | 1 in 10,000 |
| **Total natural risk** | **1 in 10,000** |
| Nuclear war | 1 in 1,000 |
| Climate change | 1 in 1,000 |
| Other environmental damage | 1 in 1,000 |
| Engineered pandemics | 1 in 30 |
| Unaligned artificial intelligence | 1 in 10 |
| Unforeseen anthropogenic risks | 1 in 30 |
| Other anthropogenic risks | 1 in 50 |
| **Total anthropogenic risk** | **1 in 6** |
| **Total existential risk** | **1 in 6** |
Comment by misha_yagudin on AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher · 2020-07-18T15:12:57.560Z · score: 1 (1 votes) · EA · GW

Have you become more uncertain/optimistic about the arguments in favour of the importance of other x-risks as a result of scrutinising AI risk?