My personal cruxes for working on AI safety 2020-02-13T07:11:46.803Z · score: 114 (49 votes)
Thoughts on doing good through non-standard EA career pathways 2019-12-30T02:06:03.032Z · score: 137 (62 votes)
"EA residencies" as an outreach activity 2019-11-17T05:08:42.119Z · score: 85 (40 votes)
I'm Buck Shlegeris, I do research and outreach at MIRI, AMA 2019-11-15T22:44:17.606Z · score: 118 (60 votes)
A way of thinking about saving vs improving lives 2015-08-08T19:57:30.985Z · score: 2 (4 votes)


Comment by buck on The best places to donate for COVID-19 · 2020-03-28T02:11:06.977Z · score: 13 (3 votes) · EA · GW

You say that the impact/scale of COVID is "huge". I think this might mislead people who are used to thinking about the problems EAs think about. Here's why.

I think COVID is probably going to cause on the order of 100 million DALYs this year, based on predictions like this; I think that 50-95% the damage ever done by COVID will be done this year. On the scale that 80000 Hours uses to assess the scale of problems, this would be ranked as importance level 11 or so.

I think this is lower than most things EAs consider working on or funding. For example:

This is a logarithmic scale, so for example, according to this scale, health in poor countries is 100 times more important than COVID.

So given that COVID seems likely to be between 100x and 10000x less important than the main other cause areas EAs think about, I think it's misleading to describe its scale as "huge".

Comment by buck on What are the key ongoing debates in EA? · 2020-03-10T04:13:21.733Z · score: 13 (5 votes) · EA · GW

I'm interested in betting about whether 20% of EAs think psychedelics are a plausible top EA cause area. Eg we could sample 20 EAs from some group and ask them. Perhaps we could ask random attendees from last year's EAG. Or we could do a poll in EA Hangout.

Comment by buck on On Becoming World-Class · 2020-02-26T05:05:38.619Z · score: 8 (7 votes) · EA · GW

I think that it's important for EA to have a space where we can communicate efficiently, rather than phrase everything for the benefit of newcomers who might be reading, so I think that this is bad advice.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T18:08:56.430Z · score: 6 (4 votes) · EA · GW

I'd prefer something like the weaker and less clear statement "we **can** think ahead, and it's potentially valuable to do so even given the fact that people might try to figure this all out later".

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T16:25:34.040Z · score: 2 (1 votes) · EA · GW

I think your summary of crux three is slightly wrong: I didn’t say that we need to think about it ahead of time, I just said that we can.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T16:24:18.068Z · score: 7 (5 votes) · EA · GW

Yeah, for the record I also think those are pretty plausible and important sources of impact for AI safety research.

I think that either way, it’s useful for people to think about which of these paths to impact they’re going for with their research.

Comment by buck on Max_Daniel's Shortform · 2020-02-23T01:00:51.794Z · score: 14 (7 votes) · EA · GW
My guess is I consider the activities you mentioned less valuable than you do. Probably the difference is largest for programming at MIRI and smallest for Hubinger-style AI safety research. (This would probably be a bigger discussion.)

I don't think that peculiarities of what kinds of EA work we're most enthusiastic about lead to much of the disagreement. When I imagine myself taking on various different people's views about what work would be most helpful, most of the time I end up thinking that valuable contributions could be made to that work by sufficiently talented undergrads.

Independent of this, my guess would be that EA does have a decent number of unidentified people who would be about as good as people you've identified. E.g., I can think of ~5 people off the top of my head of whom I think they might be great at one of the things you listed, and if I had your view on their value I'd probably think they should stop doing what they're doing now and switch to trying one of these things. And I suspect if I thought hard about it, I could come up with 5-10 more people - and then there is the large number of people neither of us has any information about.

I am pretty skeptical of this. Eg I suspect that people like Evan (sorry Evan if you're reading this for using you as a running example) are extremely unlikely to remain unidentified, because one of the things that they do is think about things in their own time and put the results online. Could you name a profile of such a person, and which of the types of work I named you think they'd maybe be as good at as the people I named?

It might be quite relevant if "great people" refers only to talent or also to beliefs and values/preferences

I am not intending to include beliefs and preferences in my definition of "great person", except for preferences/beliefs like being not very altruistic, which I do count.

E.g. my guess is that there are several people who could be great at functional programming who either don't want to work for MIRI, or don't believe that this would be valuable. (This includes e.g. myself.)

I think my definition of great might be a higher bar than yours, based on the proportion of people who I think meet it? (To be clear I have no idea how good you'd be at programming for MIRI because I barely know you, and so I'm just talking about priors rather than specific guesses about you.)


For what it's worth, I think that you're not credulous enough of the possibility that the person you talked to actually disagreed with you--I think you might doing that thing whose name I forget where you steelman someone into saying the thing you think instead of the thing they think.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-21T05:55:30.655Z · score: 7 (5 votes) · EA · GW
For the problems-that-solve-themselves arguments, I feel like your examples have very "good" qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems becomes obvious, you can still solve them. I would argue that not all these properties holds for AGI. What are your thoughts about that?

I agree that it's an important question whether AGI has the right qualities to "solve itself". To go through the ones you named:

  • "Personal and economic incentives are aligned against them"--I think AI safety has somewhat good properties here. Basically no-one wants to kill everyone, and AI systems that aren't aligned with their users are much less useful. On the other hand, it might be the case that people are strongly incentivised to be reckless and deploy things quickly.
  • "they are obvious when one is confronted with the situation"--I think that alignment problems might be fairly obvious, especially if there's a long process of continuous AI progress where unaligned non-superintelligent AI systems do non-catastrophic damage. So this comes down to questions about how rapid AI progress will be.
  • "at the point where the problems become obvious, you can still solve them"--If the problems become obvious because non-superintelligent AI systems are behaving badly, then we can still maybe put more effort into aligning increasingly powerful AI systems after that and hopefully we won't lose that much of the value of the future.
Comment by buck on Max_Daniel's Shortform · 2020-02-21T05:36:33.990Z · score: 15 (6 votes) · EA · GW

I'm not quite sure how high your bar is for "experience", but many of the tasks that I'm most enthusiastic about in EA are ones which could plausibly be done by someone in their early 20s who eg just graduated university. Various tasks of this type:

  • Work at MIRI on various programming tasks which require being really smart and good at math and programming and able to work with type theory and Haskell. Eg we recently hired Seraphina Nix to do this right out of college. There are other people who are recent college graduates who we offered this job to who didn't accept. These people are unusually good programmers for their age, but they're not unique. I'm more enthusiastic about hiring older and more experienced people, but that's not a hard requirement. We could probably hire several more of these people before we became bottlenecked on management capacity.
  • Generalist AI safety research that Evan Hubinger does--he led the writing of "Risks from Learned Optimization" during a summer internship at MIRI; before that internship he hadn't had much contact with the AI safety community in person (though he'd read stuff online).
    • Richard Ngo is another young AI safety researcher doing lots of great self-directed stuff; I don't think he consumed an enormous amount of outside resources while becoming good at thinking about this stuff.
  • I think that there are inexperienced people who could do really helpful work with me on EA movement building; to be good at this you need to have read a lot about EA and be friendly and know how to talk to lots of people.

My guess is that EA does not have a lot of unidentified people who are as good at these things as the people I've identified.

I think that the "EA doesn't have enough great people" problem feels more important to me than the "EA has trouble using the people we have" problem.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-20T17:25:14.538Z · score: 5 (3 votes) · EA · GW
One underlying hypothesis that was not explicitly pointed out, I think, was that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (It might be so obvious in an EA meeting or the EA forum that it's not worth exploring, but I like expliciting the obvious hypotheses).

This is a good point.

Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist.

I feel pretty unsure on this point; for a contradictory perspective you might enjoy this article.

Comment by buck on Do impact certificates help if you're not sure your work is effective? · 2020-02-14T06:47:46.405Z · score: 6 (3 votes) · EA · GW

[for context, I've talked to Eli about this in person]

I'm interpreting you as having two concerns here.

Firstly, you're asking why this is different than you deferring to people about the impact of the two orgs.

From my perspective, the nice thing about the impact certificate setup is that if you get paid in org B impact certificates, you're making the person at orgs A and B put their money where their mouth is. Analogously, suppose Google is trying to hire me, but I'm actually unsure about Google's long term profitability, and I'd rather be paid in Facebook stock than Google stock. If Google pays me in Facebook stock, I'm not deferring to them about the relative values of these stocks, I'm just getting paid in Facebook stock, such that if Google is overvalued it's no longer my problem, it's the problem of whoever traded their Facebook stock for Google stock.

The reason why I think that the policy of maximizing impact certificates is better for the world in this case is that I think that people are more likely to give careful answers to the question "how relatively valuable is the work orgs A and B are doing" if they're thinking about it in terms of trying to make trades than if some random EA is asking for their quick advice.


Secondly, you're worrying that people might end up seeming like they're endorsing an org that they don't endorse, and that this might harm community epistemics. This is an interesting objection that I haven't thought much about. A few possible responses:

  • It's already currently an issue that people have different amounts of optimism about their workplaces, and people don't very often publicly state how much they agree and disagree with their employer (though I personally try to be clear about this). It's unlikely that impact equity trades will exacerbate this problem much.
  • Also, people often work at places for reasons that aren't "I think this is literally the best org", eg:
    • comparative advantage
    • thinking that the job is fun
    • the job paying them a high salary (this is exactly analogous to them paying in impact equity of a different org)
    • thinking that the job will give you useful experience
    • random fluke of who happened to offer you a job at a particular point
    • thinking the org is particularly flawed and so you can do unusual amounts of good by pushing it in a good direction
  • Also, if there were liquid markets in the impact equity of different orgs, then we'd have access to much higher-quality information about the community's guess about the relative promisingness of different orgs. So pushing in this direction would probably be overall helpful.
Comment by buck on My personal cruxes for working on AI safety · 2020-02-13T19:27:26.015Z · score: 15 (9 votes) · EA · GW
This was nice to read, because I'm not sure I've ever seen anyone actually admit this before.

Not everyone agrees with me on this point. Many safety researchers think that their path to impact is by establishing a strong research community around safety, which seems more plausible as a mechanism to affect the world 50 years out than the "my work is actually relevant" plan. (And partially for this reason, these people tend to do different research to me.)

You say you think there's a 70% chance of AGI in the next 50 years. How low would that probability have to be before you'd say, "Okay, we've got a reasonable number of people to work on this risk, we don't really need to recruit new people into AI safety"?

I don't know what the size of the AI safety field is such that marginal effort is better spent elsewhere. Presumably this is a continuous thing rather than a discrete thing. Eg it seems to me that now compared to five years ago, there are way more people in AI safety and so if your comparative advantage is in some other way of positively influencing the future, you should more strongly consider that other thing.

Comment by buck on Thoughts on doing good through non-standard EA career pathways · 2020-01-09T08:05:54.014Z · score: 6 (4 votes) · EA · GW
What do you think about participating in a forecasting platform, e.g. Good Judgement Open or Metaculus? It seems to cover all ingredients, and even be a good signal for others to evaluate your judgement quality.

Seems pretty good for predicting things about the world that get resolved on short timescales. Sadly it seems less helpful for practicing judgement about things like the following:

  • judging arguments about things like the moral importance of wild animal suffering, plausibility of AI existential risk, and existence of mental illness
  • long-term predictions
  • predictions about small-scale things like how a project should be organized (though you can train calibration on this kind of question)

Re my own judgement: I appreciate your confidence in me. I spend a lot of time talking to people who have IMO better judgement than me; most of the things I say in this post (and a reasonable chunk of things I say other places) are my rephrasings of their ideas. I think that people whose judgement I trust would agree with my assessment of my judgement quality as "good in some ways" (this was the assessment of one person I asked about this in response to your comment).

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:24:25.466Z · score: 10 (4 votes) · EA · GW
It seems that your current strategy is to focus on training, hiring and outreaching to the most promising talented individuals.

This seems like a pretty good summary of the strategy I work on, and it's the strategy that I'm most optimistic about.

Other alternatives might include more engagement with amatures, and providing more assistance for groups and individuals that want to learn and conduct independent research.

I think that it would be quite costly and difficult for more experienced AI safety researchers to try to cause more good research to happen by engaging more with amateurs or providing more assistance to independent research. So I think that experienced AI safety researchers are probably going to do more good by spending more time on their own research than by trying to help other people with theirs. This is because I think that experienced and skilled AI safety researchers are much more productive than other people, and because I think that a reasonably large number of very talented math/CS people become interested in AI safety every year, so we can set a pretty high bar for which people to spend a lot of time with.

Also, what would change if you had 10 times the amount of management and mentorship capacity?

If I had ten times as many copies of various top AI safety researchers and I could only use them for management and mentorship capacity, I'd try to get them to talk to many more AI safety researchers, through things like weekly hour-long calls with PhD students, or running more workshops like MSFP.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:16:19.716Z · score: 18 (7 votes) · EA · GW
I’m a fairly good ML student who wants to decide on a research direction for AI Safety.

I'm not actually sure whether I think it's a good idea for ML students to try to work on AI safety. I am pretty skeptical of most of the research done by pretty good ML students who try to make their research relevant to AI safety--it usually feels to me like their work ends up not contributing to one of the core difficulties, and I think that they might have been better off if they'd instead spent their effort trying to become really good at ML in the hope of being better skilled up with the goal of working on AI safety later.

I don't have very much better advice for how to get started on AI safety; I think the "recommend to apply to AIRCS and point at 80K and maybe the Alignment Newsletter" path is pretty reasonable.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:12:48.818Z · score: 11 (5 votes) · EA · GW

It was a good time; I appreciate all the thoughtful questions.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:08:39.326Z · score: 12 (6 votes) · EA · GW

Most of them are related to AI alignment problems, but it's possible that I should work specifically on them rather than other parts of AI alignment.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:07:53.499Z · score: 4 (2 votes) · EA · GW
I suppose that the latter goes a long way towards explaining the former.

Yeah, I suspect you're right.

Personally, there are few technologies that I think are likely to radically change the world within the next 100 years (assuming that your definition of radical is similar to mine). Maybe the only ones that would really qualify are bioengineering and nanotech. Even in those fields, though, I expect the pace of change to be fairly slow if AI isn't heavily involved.

I think there are a couple more radically transformative technologies which I think are reasonably likely over the next hundred years, eg whole brain emulation. And I suspect we disagree about the expected pace of change with bioengineering and maybe nanotech.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T21:12:37.844Z · score: 4 (2 votes) · EA · GW

Yeah, makes sense; I didn’t mean “unintentional” by “incidental”.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T02:16:18.724Z · score: 31 (11 votes) · EA · GW

I think of myself as making a lot of gambles with my career choices. And I suspect that regardless of which way the propositions turn out, I'll have an inclination to think that I was an idiot for not realizing them sooner. For example, I often have both the following thoughts:

  • "I have a bunch of comparative advantage at helping MIRI with their stuff, and I'm not going to be able to quickly reduce my confidence in their research directions. So I should stop worrying about it and just do as much as I can."
  • "I am not sure whether the MIRI research directions are good. Maybe I should spend more time evaluating whether I should do a different thing instead."

But even if it feels obvious in hindsight, it sure doesn't feel obvious now.

So I have big gambles that I'm making, which might turn out to be wrong, but which feel now like they will have been reasonable-in-hindsight gambles either way. The main two such gambles are thinking AI alignment might be really important in the next couple decades and working on MIRI's approaches to AI alignment instead of some other approach.

When I ask myself "what things have I not really considered as much as I should have", I get answers that change over time (because I ask myself that question pretty often and then try to consider the things that are important). At the moment, my answers are:

  • Maybe I should think about/work on s-risks much more
  • Maybe I spend too much time inventing my own ways of solving design problems in Haskell and I should study other people's more.
  • Maybe I am much more productive working on outreach stuff and I should do that full time.
  • (This one is only on my mind this week and will probably go away pretty soon) Maybe I'm not seriously enough engaging with questions about whether the world will look really different in a hundred years from how it looks today; perhaps I'm subject to some bias towards sensationalism and actually the world will look similar in 100 years.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T01:34:26.972Z · score: 10 (3 votes) · EA · GW

I hadn't actually noticed that.

One factor here is that a lot of AI safety research seems to need ML expertise, which is one of my least favorite types of CS/engineering.

Another is that compared to many EAs I think I have a comparative advantage at roles which require technical knowledge but not doing technical research day-to-day.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T01:32:41.002Z · score: 11 (4 votes) · EA · GW

I'm emphasizing strategy 1 because I think that there are EA jobs for software engineers where the skill ceiling is extremely high, so if you're really good it's still worth it for you to try to become much better. For example, AI safety research needs really great engineers at AI safety research orgs.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T01:24:26.602Z · score: 24 (9 votes) · EA · GW

I worry very little about losing the opportunity to get external criticism from people who wouldn't engage very deeply with our work if they did have access to it. I worry more about us doing worse research because it's harder for extremely engaged outsiders to contribute to our work.

A few years ago, Holden had a great post where he wrote:

For nearly a decade now, we've been putting a huge amount of work into putting the details of our reasoning out in public, and yet I am hard-pressed to think of cases (especially in more recent years) where a public comment from an unexpected source raised novel important considerations, leading to a change in views. This isn't because nobody has raised novel important considerations, and it certainly isn't because we haven't changed our views. Rather, it seems to be the case that we get a large amount of valuable and important criticism from a relatively small number of highly engaged, highly informed people. Such people tend to spend a lot of time reading, thinking and writing about relevant topics, to follow our work closely, and to have a great deal of context. They also tend to be people who form relationships of some sort with us beyond public discourse.
The feedback and questions we get from outside of this set of people are often reasonable but familiar, seemingly unreasonable, or difficult for us to make sense of. In many cases, it may be that we're wrong and our external critics are right; our lack of learning from these external critics may reflect our own flaws, or difficulties inherent to a situation where people who have thought about a topic at length, forming their own intellectual frameworks and presuppositions, try to learn from people who bring very different communication styles and presuppositions.
The dynamic seems quite similar to that of academia: academics tend to get very deep into their topics and intellectual frameworks, and it is quite unusual for them to be moved by the arguments of those unfamiliar with their field. I think it is sometimes justified and sometimes unjustified to be so unmoved by arguments from outsiders.
Regardless of the underlying reasons, we have put a lot of effort over a long period of time into public discourse, and have reaped very little of this particular kind of benefit (though we have reaped other benefits - more below). I'm aware that this claim may strike some as unlikely and/or disappointing, but it is my lived experience, and I think at this point it would be hard to argue that it is simply explained by a lack of effort or interest in public discourse.

My sense is pretty similar to Holden's, though we've put much less effort into explaining ourselves publicly. When we're thinking about topics like decision theory which have a whole academic field, we seem to get very little out of interacting with the field. This might be because we're actually interested in different questions and academic decision theory doesn't have much to offer us (eg see this Paul Christiano quote and this comment).

I think that MIRI also empirically doesn't change its strategy much as a result of talking to highly engaged people who have very different world views (eg Paul Christiano), though individual researchers (eg me) often change their minds from talking to these people. (Personally, I also change my mind from talking to non-very-engaged people.)

Maybe talking to outsiders doesn't shift MIRI strategy because we're totally confused about how to think about all of this. But I'd be surprised if we figured this out soon given that we haven't figured it so far. So I'm pretty willing to say "look, either MIRI's onto something or not; if we're onto something, we should go for it wholeheartedly, and I don't seriously think that we're going to update our beliefs much from more public discourse, so it doesn't that seem costly to have our public discourse become costlier".

I guess I generally don't feel that convinced that external criticism is very helpful for situations like ours where there isn't an established research community with taste that is relevant to our work. Physicists have had a lot of time to develop a reasonably healthy research culture where they notice what kinds of arguments are wrong; I don't think AI alignment has that resource to draw on. And in cases where you don't have an established base of knowledge about what kinds of arguments are helpful (sometimes people call this "being in a preparadigmatic field"; I don't know if that's correct usage), I think it's plausible that people with different intuitions should do divergent work for a while and hope that eventually some of them make progress that's persuasive to the others.

By not engaging with critics as much as we could, I think MIRI is probably increasing the probability that we're barking completely up the wrong tree. I just think that this gamble is worth taking.

I'm more concerned about costs incurred because we're more careful about sharing research with highly engaged outsiders who could help us with it. Eg Paul has made some significant contributions to MIRI's research, and it's a shame to have less access to his ideas about our problems.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T01:00:31.176Z · score: 17 (7 votes) · EA · GW

I think it's plausible that "solving the alignment problem" isn't a very clear way of phrasing the goal of technical AI safety research. Consider the question "will we solve the rocket alignment problem before we launch the first rocket to the moon"--to me the interesting question is whether the first rocket to the moon will indeed get there. The problem isn't really "solved" or "not solved", the rocket just gets to the moon or not. And it's not even obvious whether the goal is to align the first AGI; maybe the question is "what proportion of resources controlled by AI systems end up being used for human purposes", where we care about a weighted proportion of AI systems which are aligned.

I am not sure whether I'd bet for or against the proposition that humans will go extinct for AGI-misalignment-related-reasons within the next 100 years.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T07:19:30.685Z · score: 8 (4 votes) · EA · GW

It's getting late and it feels hard to answer this question, so I'm only going to say briefly:

  • for something MIRI wrote re this, see the "strategic background" section here
  • I think there are cases where alignment is non-trivial but prosaic AI alignment is possible, and some people who are cautious about AGI alignment are influential in the groups that are working on AGI development and cause them to put lots of effort into alignment (eg maybe the only way to align the thing involves spending an extra billion dollars on human feedback). Because of these cases, I am excited for the leading AI orgs having many people in important positions who are concerned about and knowledgeable about these issues.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T07:02:55.651Z · score: 12 (5 votes) · EA · GW

I don't think you can prep that effectively for x-risk-level AI outcomes, obviously.

I think you can prep for various transformative technologies; you could for example buy shares of computer hardware manufacturers if you think that they'll be worth more due to increased value of computation as AI productivity increases. I haven't thought much about this, and I'm sure this is dumb for some reason, but maybe you could try to buy land in cheap places in the hope that in a transhuman utopia the land will be extremely valuable (the property rights might not carry through, but it might be worth the gamble for sufficiently cheap land).

I think it's probably at least slightly worthwhile to do good and hope that you can sell some of your impact certificates after good AI outcomes.

You should ask Carl Shulman, I'm sure he'd have a good answer.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T06:30:45.650Z · score: 10 (5 votes) · EA · GW

"Do you have any advice for people who want to be involved in EA, but do not think that they are smart or committed enough to be engaging at your level?"--I just want to say that I wouldn't have phrased it quite like that.

One role that I've been excited about recently is making local groups be good. I think that having better local EA communities might be really helpful for outreach, and lots of different people can do great work with this.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T06:02:54.573Z · score: 9 (4 votes) · EA · GW

(I've spent a few hours talking to people about the LTFF but I'm not sure about things like "what order of magnitude of funding did they allocate last year" (my guess without looking it up is $1M, (which turns out to be correct!)), so take all this with a grain of salt.)

Re Q1: I don't know, I don't think that we coordinate very carefully.

Re Q2: I don't really know. When I look at the list of things the LTFF funded in August or April (excluding regrants to orgs like MIRI, CFAR, and Ought), about 40% look meh (~0.5x MIRI), about 40% look like things which I'm reasonably glad someone funded (~1x MIRI), about 7% are things that I'm really glad someone funded (~3x MIRI), and 3% are things that I wish that they hadn't funded (-1x MIRI). Note that my mean outcome of the meh, good, and great categories are much higher than the median outcomes--a lot of them are "I think this is probably useless but seems worth trying for value of information". Apparently this adds up to thinking that they're 78% as good as MIRI.

Q3: I don't really know. My median outcome is that they turn out to do less well than my estimation above, but I think there's a reasonable probability that they turn out to be much better than my estimate above, and I'm excited to see them try to do good. This isn't really tied up with AI capability or safety progressing though.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T05:41:19.773Z · score: 3 (2 votes) · EA · GW

Idk. A couple percent? I'm very unsure about this.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T05:37:07.529Z · score: 18 (9 votes) · EA · GW

I think your sense is correct. I think that plenty of people have short docs on why their approach is good; I think basically no-one has long docs engaging thoroughly with the criticisms of their paths (I don't think Paul's published arguments defending his perspective count as complete; Paul has arguments that I hear him make in person that I haven't seen written up.)

My guess is that it's developed because various groups decided that it was pretty unlikely that they were going to be able to convince other groups of their work, and so they decided to just go their own ways. This is exacerbated by the fact that several AI safety groups have beliefs which are based on arguments which they're reluctant to share with each other.

(I was having a conversation with an AI safety researcher at a different org recently, and they couldn't tell me about some things that they knew from their job, and I couldn't tell them about things from my job. We were reflecting on the situation, and then one of us proposed the metaphor that we're like two people who were sliding on ice next to each other and then pushed away and have now chosen our paths and can't interact anymore to course correct.)

Should we be concerned? Idk, seems kind of concerning. I kind of agree with MIRI that it's not clearly worth it for MIRI leadership to spend time talking to people like Paul who disagree with them a lot.

Also, sometimes fields should fracture a bit while they work on their own stuff; maybe we'll develop our own separate ideas for the next five years, and then come talk to each other more when we have clearer ideas.

I suspect that things like the Alignment Newsletter are causing AI safety researchers to understand and engage with each other's work more; this seems good.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T05:13:25.387Z · score: 7 (2 votes) · EA · GW

I'm not sure; my guess is that it's somewhat harder, because we're enthusiastic about our new research directions and have moved some management capacity towards those, and those directions have relatively more room for engineering skillsets vs pure math skillsets.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T05:07:38.865Z · score: 13 (7 votes) · EA · GW

I think they're OK. I think some CFAR staff are really great. I think that their incidental effect of causing people to have more social ties to the rationalist/EA Bay Area community is probably pretty good.

I've done CFAR-esque exercises at AIRCS workshops which were very helpful to me. I think my general sense is that a bunch of CFAR material has a "true form" which is pretty great, but I didn't get the true form from my CFAR workshop, I got it from talking to Anna Salamon (and somewhat from working with other CFAR staff).

I think that for (possibly dumb) personal reasons I get more annoyed by them than some people, which prevents me from getting as much value out of them.

I generally am glad to hear that an EA has done a CFAR workshop, and normally recommend that EAs do them, especially if they don’t have as much social connection to the EA/rationalist scene, or if they don’t have high opportunity cost to their time.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T05:02:38.856Z · score: 9 (4 votes) · EA · GW

I still feel this way, and I've been trying to think of ways to reduce this problem. I think the AIRCS workshops help a bit, I think that my SSC trip was helpful and EA residencies might be helpful.

A few helpful conversations that I've had recently with people who are strongly connected to the professional EA community, which I think would be harder to have without information gained from these strong connections:

  • I enjoyed a document about AI timelines that someone from another org shared me on.
  • Discussions about how EA outreach should go--how rapidly should we try to grow, what proportion of people who are good fits for EA are we already reaching, what types of people are we going to be able to reach with what outreach mechanisms.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T02:40:12.279Z · score: 5 (4 votes) · EA · GW

Probably less than two.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T22:24:15.719Z · score: 11 (1 votes) · EA · GW

I think Paul is probably right about the causes of the disagreement between him and many researchers, and the summary of his beliefs in the AI Impacts interview you linked matches my impression of his beliefs about this.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T22:19:54.800Z · score: 16 (9 votes) · EA · GW

The most common reason that someone who I would be excited to work with at MIRI chooses not to work on AI alignment is that they decide to work on some other important thing instead, eg other x-risk or other EA stuff.

But here are some anonymized recent stories of talented people who decided to do non-EA work instead of taking opportunities to do important technical work related to x-risk (for context, I think all of these people are more technically competent than me):

  • One was very comfortable in a cushy, highly paid job which they already had, and thought it would be too inconvenient to move to an EA job (which would have also been highly paid).
  • One felt that AGI timelines are probably relatively long (eg they thought that the probability of AGI in the next 30 years felt pretty small to them), which made AI safety feel not very urgent. So they decided to take an opportunity which they thought would be really fun and exciting, rather than working at MIRI, which they thought would be less of a good fit for a particular skill set which they'd been developing for years; this person thinks that they might come back and work on x-risk after they've had another job for a few years.
  • One was in the middle of a PhD and didn't want to leave.
  • One felt unsure about whether it's reasonable to believe all the unusual things that the EA community believes, and didn't believe the arguments enough that they felt morally compelled to leave their current lucrative job.

I feel sympathetic to the last three but not to the first.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T22:15:29.706Z · score: 11 (6 votes) · EA · GW

Here are several, together with the percentage of my productivity which I think they cost me over the last year:

  • I’ve lost a couple months of productivity over the last year due to some weird health issues--I was really fatigued and couldn’t think properly for several months. This was terrible, but the problem seems to have gone away for now. This had a 25% cost.
  • I am a worse researcher because I spend half my time doing other things than research. It’s unclear to me how much efficiency this costs me. Potential considerations:
    • When I don’t feel like doing technical work, I can do other work. This should increase my productivity. But maybe it lets me procrastinate on important work.
    • I remember less of the context of what I’m working on, because my memory is spaced out.
    • My nontechnical work often feels like social drama and is really attention-grabbing and distracting.
    • Overall this costs me maybe 15%; I’m really unsure about this though.
  • I’d be a better researcher if I were smarter and more knowledgeable. I’m working on the knowledgeableness problem with the help of tutors and by spending some of my time studying. It’s unclear how to figure out how costly this is. If I’d spent a year working as a Haskell programmer in industry, I’d probably be like 15% more effective now.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T22:14:05.045Z · score: 3 (2 votes) · EA · GW

Yeah, I am not targeting that kind of person. Someone who is excited about ML and skeptical of AI safety but interested in engaging a lot with AI safety arguments for a few months might be a good fit.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T21:18:12.014Z · score: 23 (9 votes) · EA · GW

As an appendix to the above, some of my best learning experiences as a programmer were the following (starting from when I started programming properly as a freshman in 2012). (Many of these aren’t that objectively hard (and would fit in well as projects in a CS undergrad course); they were much harder for me because I didn’t have the structure of a university course to tell me what design decisions were reasonable and when I was going down blind alleys. I think that this difficulty created some great learning experiences for me.)

  • I translated the proof of equivalence between regular expressions and finite state machines from “Introduction to Automata Theory, Languages, and Computation” into Haskell.
  • I wrote a program which would take a graph describing a circuit built from resistors and batteries and then solve for the currents and potential drops.
  • I wrote a GUI for a certain subset of physics problems; this involved a lot of deconfusion-style thinking as well as learning how to write GUIs.
  • I went to App Academy and learned to write full stack web applications.
  • I wrote a compiler from C to assembly in Scala. It took a long time for me to figure out that I should eg separate out the compiling to an intermediate output that didn’t have registers allocated.
  • I wrote the first version of my data structure searcher. (Not linking because I’m embarrassed by how much worse it was than my second attempt.)
  • I wrote the second version of my data structure searcher, which involved putting a lot of effort into deconfusing myself about what data structures are and how they connect to each other.
    • One example of something I mean by deconfusion here: when you have a composite data structure (eg a binary search tree and a heap representing the same data, with pointers into each other), when you’re figuring out how quickly your reads happen, you take the union of all the things you can do with all your structures--eg you can read in any of the structures. But when you want to do a write, you need to do it in all of the structures, so you take the maximum write time. This feels obvious when I write it now, but wasn’t obvious until I’d figured out exactly what question was interesting to me. And it’s somewhat more complicated--for example, we actually want to take the least upper bound rather than the maximum.
  • At Triplebyte, I wrote a variety of web app features. The one I learned the most from was building a UI for composing structured emails quickly--the UI concept was original to me, and it was great practice at designing front end web widgets and building things to solve business problems. My original design kind of sucked so I had to rewrite it; this was also very helpful. I learned the most from trying to design complicated frontend UI components for business logic, because that involves more design work than backend Rails programming does.
  • I wrote another version of my physics problem GUI, which taught me about designing UIs in front of complicated functional backends.
  • I then shifted my programming efforts to MIRI work; my learning here has mostly been a result of learning more Haskell and trying to design good abstractions for some of the things we’re doing; I’ve also recently had to think about the performance of my code, which has been interesting.

I learned basically nothing useful from my year at PayPal.

I have opinions about how to choose jobs in order to maximize how much programming you learn and I might write them up at some point.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:46:42.734Z · score: 10 (4 votes) · EA · GW

I don't know about what you need to know in order to do agent foundations research and trust Scott's answer.

If you're seriously considering autodidacting to prepare for a non-agent-foundations job at MIRI, you should email me ( about your particular situation and I'll try to give you personal advice. If too many people email me asking about this, I'll end up writing something publicly.

In general, I'd rather that people talk to me before they study a lot for a MIRI job rather than after, so that I can point them in the right direction and they don't waste effort learning things that aren't going to make the difference to whether we want to hire them.

And if you want to autodidact to work on agent foundations at MIRI, consider emailing someone on the agent foundations team. Or you could try emailing me and I can try to help.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:43:19.685Z · score: 4 (3 votes) · EA · GW

I feel reluctant to answer this question because it feels like it would involve casting judgement on lots of people publicly. I think that there are a bunch of different orgs and people doing good work on AI safety.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:41:29.388Z · score: 25 (10 votes) · EA · GW

For the record, I think that I had mediocre judgement in the past and did not reliably believe true things, and I sometimes had made really foolish decisions. I think my experience is mostly that I felt extremely alienated from society, which meant that I looked more critically on many common beliefs than most people do. This meant I was weird in lots of ways, many of which were bad and some of which were good. And in some cases this meant that I believed some weird things that feel like easy wins, eg by thinking that people were absurdly callous about causing animal suffering.

My judgement improved a lot from spending a lot of time in places with people with good judgement who I could learn from, eg Stanford EA, Triplebyte, the more general EA and rationalist community, and now MIRI.

I feel pretty unqualified to give advice on critical thinking, but here are some possible ideas, which probably aren't actually good:

  • Try to learn simple models of the world and practice applying them to claims you hear, and then being confused when they don't match. Eg learn introductory microeconomics and then whenever you hear a claim about the world that intro micro has an opinion on, try to figure out what the simple intro micro model would claim, and then inasmuch as the world doesn't seem to look like intro micro would predict, think "hmm this is confusing" and then try to figure out what about the world might have caused this. When I developed this habit, I started noticing that lots of claims people make about the world are extremely implausible, and when I looked into the facts more I found that intro micro seemed to back me up. To learn intro economics, I enjoyed the Cowen and Tabarrok textbook.
    • I think Katja Grace is a master of the "make simple models and then get confused when the world doesn't match them" technique. See her novel opinions page for many examples.
    • Another subject where I've been doing this recently is evolutionary biology--I've learned to feel confused whenever anyone makes any claims about group selection, and I plan to learn how group selection works, so that when people make claims about it I can assess them accurately.
  • Try to find the simplest questions whose answers you don't know, in order to practice noticing when you believe things for bad reasons.
    • For example, some of my favorite physics questions:
      • Why isn't the Sun blurry?
      • What is the fundamental physical difference between blue and green objects? Like, what equations do I solve to find out that an object is blue?
      • If energy is conserved, why we so often make predictions about the world by assuming that energy is minimized?
    • I think reading Thinking Physics might be helpful at practicing noticing your own ignorance, but I'm not sure.
  • Try to learn a lot about specific subjects sometimes, so that you learn what it's like to have detailed domain knowledge.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:28:44.051Z · score: 12 (7 votes) · EA · GW

I no longer feel annoyed about this. I'm not quite sure why. Part of it is probably that I'm a lot more sympathetic when EAs don't know things about AI safety than global poverty, because learning about AI safety seems much harder, and I think I hear relatively more discussion of AI safety now compared to three years ago.

One hypothesis is that 80000 Hours has made various EA ideas more accessible and well-known within the community, via their podcast and maybe their articles.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:23:53.161Z · score: 25 (15 votes) · EA · GW

I feel really sad about it. I think EA should probably have a communication strategy where we say relatively simple messages like "we think talented college graduates should do X and Y", but this causes collateral damage where people who don't succeed at doing X and Y feel bad about themselves. I don't know what to do about this, except to say that I have the utmost respect in my heart for people who really want to do the right thing and are trying their best.

I don't think I have very coherent or reasoned thoughts on how we should handle this, and I try to defer to people who I trust whose judgement on these topics I think is better.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T20:17:32.491Z · score: 4 (3 votes) · EA · GW

I don't consume information online very intentionally.

Blogs I often read:

  • Slate Star Codex
  • The Unit of Caring
  • Bryan Caplan (despite disagreeing with him a lot, obviously)
  • Meteuphoric (Katja Grace)
  • Paul Christiano's various blogs

I often read the Alignment Newsletter. I mostly learn things from hearing about them from friends.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T19:27:18.356Z · score: 23 (10 votes) · EA · GW

(This is true of all my answers but feels particularly relevant for this one: I’m speaking just for myself, not for MIRI as a whole)

We’ve actually made around five engineer hires since then; we’ll announce some of them in a few weeks. So I was off by a factor of two.

Before you read my more detailed thoughts: please don’t read the below and then get put off from applying to MIRI. I think that many people who are in fact good MIRI fits might not realize they’re good fits. If you’re unsure whether it’s worth your time to apply to MIRI, you can email me at and I’ll (eventually) reply telling you whether I think you might plausibly be a fit. Even if it doesn't go further than that, there is great honor in applying to jobs from which you get rejected, and I feel warmly towards almost everyone I reject.

With that said, here are some of my thoughts on the discrepancy between my prediction and how much we’ve hired:

  • Since I started doing recruiting work for MIRI in late 2017, I’ve updated towards thinking that we need to be pickier with the technical caliber of engineering hires than I originally thought. I’ve updated towards thinking that we’re working in a domain where relatively small increases in competence translate into relatively large differences in output.
    • A few reasons for this:
      • Our work involves dealing a lot with pretty abstract computer science and software engineering considerations; this increases variance in productivity.
      • We use a lot of crazy abstractions (eg various Haskell stuff) that are good for correctness and productivity for programmers who understand them and bad for people who have more trouble learning and understanding all that stuff.
      • We have a pretty flat management structure where engineers need to make a lot of judgement calls for themselves about what to work on and how to do it. As a result, it’s more important for people doing programming work to have a good understanding of everything we’re doing and how their work fits into this.
    • I think it’s plausible that if we increased our management capacity, we’d be able to hire engineers who are great in many ways but who don’t happen to be good in some of the specific ways we require at the moment.
  • Our recruiting process involves a reasonable amount of lag between meeting people and hiring them, because we often want to get to know people fairly well before offering them a job. So I think it’s plausible that the number of full time offers we make is somewhat of a trailing indicator. Over time I’ve updated towards thinking that it’s worth it to take more time before giving people full time offers, by offering internships or trials, so the number of engineering hires lags more than I expected given the number of candidates I’d met who I was reasonably excited by.
  • I’ve also updated towards the importance of being selective on culture fit, eg wanting people who I’m more confident will do well in the relatively self-directed MIRI research environment.

A few notes based on this that are relevant to people who might want to work at MIRI:

  • As I said, our engineering requirements might change in the future, and when that happens I’d like to have a list of people who might be good fits. So please feel free to apply even if you think you’re a bad fit right now.
  • We think about who to hire on an extremely flexible, case-by-case basis. Eg we hire people who know no Haskell at all.
  • If you want to be more likely to be a good fit for MIRI, learning Haskell and some dependent type theory is pretty helpful. I think that it might even be worth it to try to get a job programming in Haskell just in case you get really good at it and then MIRI wants to hire you, though I feel slightly awkward to give this advice because I feel like it’s pushing a pretty specific strategy which is very targeted at only a single opportunity. As I said earlier, if you’re considering taking a Haskell job based on this, please feel free to email me to talk about it.
Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T18:29:32.478Z · score: 21 (11 votes) · EA · GW

Yeah, I think that a lot of EAs working on AI safety feel similarly to me about this.

I expect the world to change pretty radically over the next 100 years, and I probably want to work on the radical change that's going to matter first. So compared to the average educated American I have shorter AI timelines but also shorter timelines to the world becoming radically different for other reasons.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T18:01:09.602Z · score: 6 (5 votes) · EA · GW

I see this and appreciate it; the problem is that I want to bet on something like "your overall theory is wrong", but I don't know enough neuroscience to know whether the claims you're making are things that are probably true for reasons unrelated to your overall theory. If you could find someone who I trusted who knew neuroscience and who thought your predictions seemed unlikely, then I'd bet with them against you.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T17:54:14.981Z · score: 17 (19 votes) · EA · GW

Most things that look crankish are crankish.

I think that MIRI looks kind of crankish from the outside, and this should indeed make people initially more skeptical of us. I think that we have a few other external markers of legitimacy now, such as the fact that MIRI people were thinking and writing about AI safety from the early 2000s and many smart people have now been persuaded that this is indeed an issue to be concerned with. (It's not totally obvious to me that these markers of legitimacy mean that anyone should take us seriously on the question "what AI safety research is promising".) When I first ran across MIRI, I was kind of skeptical because of the signs of crankery; I updated towards them substantially because I found their arguments and ideas compelling, and people whose judgement I respected also found them compelling.

I think that the signs of crankery in QRI are somewhat worse than 2008 MIRI's signs of crankery.

I also think that I'm somewhat qualified to assess QRI's work (as someone who's spent ~100 paid hours thinking about philosophy of mind in the last few years), and when I look at it, I think it looks pretty crankish and wrong.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T07:40:49.158Z · score: 12 (7 votes) · EA · GW

It’s not clear what effect this has had, if any. I am personally somewhat surprised by this--I would have expected more people to stop donating to us.

I asked Rob Bensinger about this; he summarized it as “We announced nondisclosed-by-default in April 2017, and we suspected that this would make fundraising harder. In fact, though, we received significantly more funding in 2017 (, and have continued to receive strong support since then. I don't know that there's any causal relationship between those two facts; e.g., the obvious thing to look at in understanding the 2017 spike was the cryptocurrency price spike that year. And there are other factors that changed around the same time too, e.g., Colm [who works at MIRI on fundraising among other things] joining MIRI in late 2016.“