Some thoughts on EA outreach to high schoolers 2020-09-13T22:51:24.200Z · score: 87 (42 votes)
Buck's Shortform 2020-09-13T17:29:42.117Z · score: 6 (1 votes)
Some thoughts on deference and inside-view models 2020-05-28T05:37:14.979Z · score: 109 (45 votes)
My personal cruxes for working on AI safety 2020-02-13T07:11:46.803Z · score: 119 (51 votes)
Thoughts on doing good through non-standard EA career pathways 2019-12-30T02:06:03.032Z · score: 141 (66 votes)
"EA residencies" as an outreach activity 2019-11-17T05:08:42.119Z · score: 86 (41 votes)
I'm Buck Shlegeris, I do research and outreach at MIRI, AMA 2019-11-15T22:44:17.606Z · score: 122 (64 votes)
A way of thinking about saving vs improving lives 2015-08-08T19:57:30.985Z · score: 2 (4 votes)


Comment by buck on Evidence on correlation between making less than parents and welfare/happiness? · 2020-10-14T05:14:02.148Z · score: 7 (4 votes) · EA · GW

Inasmuch as you expect people to keep getting richer, it seems reasonable to hope that no generation has to be more frugal than the previous.

Comment by buck on In defence of epistemic modesty · 2020-10-11T19:27:11.541Z · score: 14 (5 votes) · EA · GW

when domain experts look at the 'answer according to the rationalist community re. X', they're usually very unimpressed, even if they're sympathetic to the view themselves. I'm pretty Atheist, but I find the 'answer' to the theism question per LW or similar woefully rudimentary compared to state of the art discussion in the field. I see similar experts on animal consciousness, quantum mechanics, free will, and so on similarly be deeply unimpressed with the sophistication of argument offered.

I would love to see better evidence about this. Eg it doesn't match my experience of talking to physicists.

Comment by buck on What is the increase in expected value of effective altruist Wayne Hsiung being mayor of Berkeley instead of its current incumbent? · 2020-10-11T04:44:24.666Z · score: 13 (8 votes) · EA · GW

I think he wouldn't have thought of this as "throwing the community under the bus". I'm also pretty skeptical that this consideration is strong enough to be the main consideration here (as opposed to eg the consideration that Wayne seems way more interested in making the world better from a cosmopolitan perspective than other candidates for mayor).

Comment by buck on What is the increase in expected value of effective altruist Wayne Hsiung being mayor of Berkeley instead of its current incumbent? · 2020-10-11T04:42:08.887Z · score: 2 (1 votes) · EA · GW

Wayne at least sort-of identified as an EA in 2015, eg hosting EA meetups at his house. And he's been claiming to be interested in evidence-based approaches to making the world better since at least then.

Comment by buck on EA Uni Group Forecasting Tournament! · 2020-09-20T03:45:43.384Z · score: 10 (7 votes) · EA · GW

I think this is a great idea, and I'm excited that you're doing it.

Comment by buck on Buck's Shortform · 2020-09-19T05:00:39.146Z · score: 64 (19 votes) · EA · GW

I’ve recently been thinking about medieval alchemy as a metaphor for longtermist EA.

I think there’s a sense in which it was an extremely reasonable choice to study alchemy. The basic hope of alchemy was that by fiddling around in various ways with substances you had, you’d be able to turn them into other things which had various helpful properties. It would be a really big deal if humans were able to do this.

And it seems a priori pretty reasonable to expect that humanity could get way better at manipulating substances, because there was an established history of people figuring out ways that you could do useful things by fiddling around with substances in weird ways, for example metallurgy or glassmaking, and we have lots of examples of materials having different and useful properties. If you had been particularly forward thinking, you might even have noted that it seems plausible that we’ll eventually be able to do the full range of manipulations of materials that life is able to do.

So I think that alchemists deserve a lot of points for spotting a really big and important consideration about the future. (I actually have no idea if any alchemists were thinking about it this way; that’s why I billed this as a metaphor rather than an analogy.) But they weren’t really very correct about how anything worked, and so most of their work before 1650 was pretty useless. 

It’s interesting to think about whether EA is in a similar spot. I think EA has done a great job of identifying crucial and underrated considerations about how to do good and what the future will be like, eg x-risk and AI alignment. But I think our ideas for acting on these considerations seem much more tenuous. And it wouldn’t be super shocking to find out that later generations of longtermists think that our plans and ideas about the world are similarly inaccurate.

So what should you have done if you were an alchemist in the 1500s who agreed with this argument that you had some really underrated considerations but didn’t have great ideas for what to do about them? 

I think that you should probably have done some of the following things:

  • Try to establish the limits of your knowledge and be clear about the fact that you’re in possession of good questions rather than good answers.
  • Do lots of measurements, write down your experiments clearly, and disseminate the results widely, so that other alchemists could make faster progress.
  • Push for better scientific norms. (Scientific norms were in fact invented in large part by Robert Boyle for the sake of making chemistry a better field.)
  • Work on building devices which would enable people to do experiments better.

Overall I feel like the alchemists did pretty well at making the world better, and if they’d been more altruistically motivated they would have been even better.

There are some reasons to think that pushing early chemistry forward is easier than working on improving the long term future. In particular, you might think that it’s only possible to work on x-risk stuff around the time of the hinge of history.

Comment by buck on Some thoughts on EA outreach to high schoolers · 2020-09-15T17:19:07.536Z · score: 4 (7 votes) · EA · GW

Yeah, I thought about this; it’s standard marketing terminology, and concise, which is why I ended up using it. Thanks though.

Comment by buck on Buck's Shortform · 2020-09-13T22:45:59.196Z · score: 21 (9 votes) · EA · GW

I thought this post was really bad, basically for the reasons described by Rohin in his comment. I think it's pretty sad that that post has positive karma.

Comment by buck on Deliberate Consumption of Emotional Content to Increase Altruistic Motivation · 2020-09-13T17:54:38.060Z · score: 18 (9 votes) · EA · GW

When I was 18 I watched a lot of videos of animal suffering, eg linked from Brian Tomasik's list of distressing videos of suffering (extremely obvious content warning: extreme suffering).  I am not sure whether I'd recommend this to others.

As a result, I felt a lot of hatred for people who were knowingly complicit in causing extreme animal suffering, which was basically everyone I knew. At the time I lived in a catered college university, where every day I'd see people around me eating animal products; I felt deeply alienated and angry and hateful.

This was good in some ways. I think it's plausibly healthy to feel a lot of hatred for society. I think that this caused me to care even less about what people thought of me, which made it easier for me to do various weird things like dropping out of university (temporarily) and moving to America.

I told a lot of people to their faces that I thought they were contemptible. I don't feel like I'm in the wrong for saying this, but this probably didn't lead to me making many more friends than I otherwise would have. And on one occasion I was very cruel to someone who didn't deserve it; I felt more bad about this than about basically anything else I'd done in my life.

I don't know whether I'd recommend this to other people. Probably some people should feel more alienated and others should feel less alienated.

Comment by buck on Are there any other pro athlete aspiring EAs? · 2020-09-13T17:34:49.245Z · score: 19 (6 votes) · EA · GW

For what it's worth, I think that EA related outreach to heirs seems much less promising than to founders or pro poker players. 

Successful founders are often extremely smart in my experience; I expect pro poker players are also pretty smart on average.

Comment by buck on Are there any other pro athlete aspiring EAs? · 2020-09-13T17:33:17.391Z · score: 7 (4 votes) · EA · GW

It seems likely that pro athletes are more intelligent than average, but I'd be very surprised if they were as intelligent as pro poker players on average.

Comment by buck on Buck's Shortform · 2020-09-13T17:29:42.563Z · score: 59 (25 votes) · EA · GW

Edited to add: I think that I phrased this post misleadingly; I meant to complain mostly about low quality criticism of EA rather than eg criticism of comments. Sorry to be so unclear. I suspect most commenters misunderstood me.

I think that EAs, especially on the EA Forum, are too welcoming to low quality criticism [EDIT: of EA]. I feel like an easy way to get lots of upvotes is to make lots of vague critical comments about how EA isn’t intellectually rigorous enough, or inclusive enough, or whatever. This makes me feel less enthusiastic about engaging with the EA Forum, because it makes me feel like everything I’m saying is being read by a jeering crowd who just want excuses to call me a moron.

I’m not sure how to have a forum where people will listen to criticism open mindedly which doesn’t lead to this bias towards low quality criticism.

Comment by buck on Judgement as a key need in EA · 2020-09-13T15:52:06.910Z · score: 25 (8 votes) · EA · GW

I would be pretty surprised if most of the people from the EALF survey thought that forecasting is "very closely related" to good judgement.

Comment by buck on The academic contribution to AI safety seems large · 2020-07-31T16:20:15.536Z · score: 30 (13 votes) · EA · GW

Thanks for writing this post--it was useful to see the argument written out so I could see exactly where I agreed and disagreed. I think lots of people agree with this but I've never seen it written up clearly before.

I think I place substantial weight (30% or something) on you being roughly right about the relative contributions of EA safety and non-EA safety. But I think it's more likely that the penalty on non-EA safety work is larger than you think. 

I think the crux here is that I think AI alignment probably requires really focused attention, and research done by people who are trying to do something else will probably end up not being very helpful for some of the core problems.

It's a little hard to evaluate the counterfactuals here, but I'd much rather have the contributions from EA safety than from non-EA safety over the last ten years.

I think that it might be easier to assign a value to the discount factor by assessing the total contributions of EA safety and non-EA safety. I think that EA safety does something like 70% of the value-weighted work, which suggests a much bigger discount factor than 80%.


Assorted minor comments:

But this is only half of the ledger. One of the big advantages of academic work is the much better distribution of senior researchers: EA Safety seems bottlenecked on people able to guide and train juniors

Yes, but those senior researchers won't necessarily have useful things to say about how to do safety research. (In fact, my impression is that most people doing safety research in academia have advisors who don't have very smart thoughts on long term AI alignment.)

None of those parameters is obvious, but I make an attempt in the model (bottom-left corner).

I think the link is to the wrong model?

A cursory check of the model

In this section you count nine safety-relevant things done by academia over two decades, and then note that there were two things from within EA safety last year that seem more important. This doesn't seem to mesh with your claim about their relative productivity.

Comment by buck on The academic contribution to AI safety seems large · 2020-07-31T15:57:43.901Z · score: 8 (5 votes) · EA · GW

MIRI is not optimistic about prosaic AGI alignment and doesn't put much time into it.

Comment by buck on How strong is the evidence of unaligned AI systems causing harm? · 2020-07-23T03:15:29.565Z · score: 12 (7 votes) · EA · GW

I don’t think the evidence is very good; I haven’t found it more than slightly convincing. I don’t think that the harms of current systems are a very good line of argument for potential dangers of much more powerful systems.

Comment by buck on Intellectual Diversity in AI Safety · 2020-07-22T22:22:42.374Z · score: 10 (7 votes) · EA · GW

I'm curious what your experience was like when you started talking to AI safety people after already coming to some of your own conclusions. Eg I'm curious if you think that you missed major points that the AI safety people had spotted which felt obvious in hindsight, or if you had topics on which you disagreed with the AI safety people and think you turned out right.

Comment by buck on Are there lists of causes (that seemed promising but are) known to be ineffective? · 2020-07-09T05:25:08.184Z · score: 8 (5 votes) · EA · GW

In an old post, Michael Dickens writes:

The closest thing we can make to a hedonium shockwave with current technology is a farm of many small animals that are made as happy as possible. Presumably the animals are cared for by people who know a lot about their psychology and welfare and can make sure they’re happy. One plausible species choice is rats, because rats are small (and therefore easy to take care of and don’t consume a lot of resources), definitively sentient, and we have a reasonable idea of how to make them happy.
Thus creating 1 rat QALY costs $120 per year, which is $240 per human QALY per year.
This is just a rough back-of-the-envelope calculation so it should not be taken literally, but I’m still surprised by how cost-inefficient this looks. I expected rat farms to be highly cost-effective based on the fact that most people don’t care about rats, and generally the less people care about some group, the easier it is to help that group. (It’s easier to help developing-world humans than developed-world humans, and easier still to help factory-farmed animals.) Again, I could be completely wrong about these calculations, but rat farms look less promising than I had expected.

I think this is a good example of something seeming like a plausible idea for making the world better, but which turned out to seem pretty ineffective.

Comment by buck on Concern, and hope · 2020-07-07T16:48:26.251Z · score: 11 (4 votes) · EA · GW

What current controversy are you saying might make moderate pro-SJ EAs more wary of SSC?

Comment by buck on Concern, and hope · 2020-07-07T16:14:34.863Z · score: 6 (7 votes) · EA · GW

I have two complaints: linking to a post which I think was made in bad faith in an attempt to harm EA, and seeming to endorse it by using it as an example of a perspective that some EAs have.

I think you shouldn't update much on what EAs think based on that post, because I think it was probably written in an attempt to harm EA by starting flamewars.

EDIT: Also, I kind of think of that post as trying to start nasty rumors about someone; I think we should generally avoid signal boosting that type of thing.

Comment by buck on KR's Shortform · 2020-07-07T16:05:28.629Z · score: 2 (1 votes) · EA · GW

I'd be interested to see a list of what kinds of systematic mistakes previous attempts at long-term forecasting made.

Also, I think that many longtermists (eg me) think it's much more plausible to successfully influence the long run future now than in the 1920s, because of the hinge of history argument.

Comment by buck on Concern, and hope · 2020-07-06T02:41:46.375Z · score: 34 (16 votes) · EA · GW

Many other people who are personally connected to the Chinese Cultural Revolution are the people making the comparisons, though. Eg the EA who I see posting the most about this (who I don't think would want to be named here) is Chinese.

Comment by buck on Concern, and hope · 2020-07-06T02:40:04.579Z · score: 26 (10 votes) · EA · GW

I think that both the Cultural Revolution comparisons and the complaints about Cultural Revolution comparisons are way less bad than that post.

Comment by buck on Concern, and hope · 2020-07-05T18:50:00.477Z · score: 21 (11 votes) · EA · GW
culminating in the Slate Star Codex controversy of the past two weeks

I don't think that the SSC kerfuffle is that related to the events that have caused people to worry about cultural revolutions. In particular, most of the complaints about the NYT plan haven't been related to the particular opinions Scott has written about.

Comment by buck on Concern, and hope · 2020-07-05T18:11:41.303Z · score: 23 (14 votes) · EA · GW

Edit: the OP has removed the link I’m complaining about.

I think it's quite bad to link to that piece. The piece makes extremely aggressive accusations and presents very little evidence to back them up; it was extensively criticised in the comments. I think that that piece isn't an example of people being legitimately concerned; it's an example of someone behaving extremely badly.

Another edit: I am 80% confident that the author of that piece is not actually a current member of the EA community, and I am more than 50% confident that the piece was written mostly with an intention of harming EA. This is a lot of why I think it's bad to link to it. I didn't say this in my initial comment, sorry.

Comment by buck on - A Petition · 2020-06-29T16:15:18.253Z · score: 23 (15 votes) · EA · GW

In addition to what Aaron said, I’d guess Scott is responsible for probably 10% of EA recruiting over the last few years.

Comment by buck on KR's Shortform · 2020-06-19T04:56:15.357Z · score: 14 (6 votes) · EA · GW

I think there are many examples of EAs thinking about the possibility that AI might be sentient by default. Some examples I can think of off the top of my head:

I don't think people are disputing that it would be theoretically possible for AIs to be conscious, I think that they're making the claim that AI systems we find won't be.

Comment by buck on Some thoughts on deference and inside-view models · 2020-06-03T14:50:41.952Z · score: 6 (4 votes) · EA · GW
My view is roughly that EAs were equally disposed to be deferential then as they are now (if there were a clear EA consensus then, most of these EAs would have deferred to it, as they do now), but that "because the 'official EA consensus' (i.e. longtermism) is more readily apparent" now, people's disposition to defer is more apparent.

This is an interesting possibility. I still think there's a difference. For example, there's a lot of disagreement within AI safety about what kind of problems are important and how to work on them, and most EAs (and AI safety people) seem much less inclined to try to argue with each other about this than I think we were at Stanford EA.

Agreed, but I can't remember the last time I saw someone try to argue that you should donate to AMF rather than longtermism.

I think this is probably a mixture of longtermism winning over most people who'd write this kind of post, and also that people are less enthusiastic about arguing about cause prio these days for whatever reason. I think the post would be received well inasmuch as it was good. Maybe we're agreeing here?

Whenever I do see near-termism come up, people don't seem afraid to communicate that they think that it is obviously indefensible, or that they think even a third-rate longtermist intervention is probably incomparably better than AMF because at least it's longtermist.

I don't see people say that very often. Eg I almost never see people say this in response to posts about neartermism on the EA Facebook group, or on posts here.

Comment by buck on Some thoughts on deference and inside-view models · 2020-06-03T05:14:15.783Z · score: 9 (5 votes) · EA · GW

I just looked up the proof of Fermat's Last Theorem, and it came about from Andrew Wiles spotting that someone else had recently proven something which could plausibly be turned into a proof, and then working on it for seven years. This seems like a data point in favor of the end-to-end models approach.

Comment by buck on Some thoughts on deference and inside-view models · 2020-06-03T05:12:33.646Z · score: 8 (4 votes) · EA · GW
I think the history of maths also provides some suggestive examples of the dangers of requiring end-to-end stories. E.g., consider some famous open questions in Ancient mathematics that were phrased in the language of geometric constructions with ruler and compass, such as whether it's possible to 'square the circle'. It was solved 2,000 years after it was posed using modern number theory. But if you had insisted that everyone working on it has an end-to-end story for how what they're doing contributes to solving that problem, I think there would have been a real risk that people continue thinking purely in ruler-and-compass terms and we never develop modern number theory in the first place.

I think you're interpreting me to say that people ought to have an externally validated end-to-end story; I'm actually just saying that they should have an approach which they think might be useful, which is weaker.

Comment by buck on Some thoughts on deference and inside-view models · 2020-06-03T05:00:41.387Z · score: 13 (5 votes) · EA · GW
I've heard this impression from several people, but it's unclear to me whether EAs have become more deferential, although it is my impression that many EAs are currently highly deferential

Here's what leads me to think EA seems more deferential now.

I spent a lot of time with the Stanford EA club in 2015 and 2016, and was close friends with many of the people there. We related to EA very differently to how I relate to EA now, and how most newer/younger EAs I talk to seem to relate to it.

The common attitude was something like "we're utilitarians, and we want to do as much good as we can. EA has some interesting people and interesting ideas in it. However, it's not clear who we can trust; there's lots of fiery debate about cause prioritization, and we just don't at all know whether we should donate to AMF or the Humane League or MIRI. There are EA orgs like CEA, 80K, MIRI, GiveWell, but it's not clear which of those people we should trust, given that the things they say don't always make sense to us, and they have different enough bottom line beliefs that some of them must be wrong."

It's much rarer nowadays for me to hear people have an attitude where they're wholeheartedly excited about utilitarianism but openly skeptical of the EA "establishment".

Part of this is that I think the arguments around cause prioritization are much better understood and less contentious now.

I think it's never been clearer or more acceptable to communicate, implicitly or explicitly, that you think that people who support AMF (or other near-termist causes) probably just 'don't get' longtermism and aren't worth engaging with.

I feel like there are many fewer EA forum posts and facebook posts where people argue back and forth about whether to donate to AMF or more speculative things than there used to be.

Comment by buck on Some thoughts on deference and inside-view models · 2020-06-03T04:47:50.358Z · score: 23 (7 votes) · EA · GW

This comment is a general reply to this whole thread.

Some clarifications:

  • I don't think that we should require that people working in AI safety have arguments for their research which are persuasive to anyone else. I'm saying I think they should have arguments which are persuasive to them.
  • I think that good plans involve doing things like playing around with ideas that excite you, and learning subjects which are only plausibly related if you have a hunch it could be helpful; I do these things a lot myself.
  • I think there's a distinction between having an end-to-end story for your solution strategy vs the problem you're trying to solve--I think it's much more tractable to choose unusually important problems than to choose unusually effective research strategies.
    • In most fields, the reason you can pick more socially important problems is that people aren't trying very hard to do useful work. It's a little more surprising that you can beat the average in AI safety by trying intentionally to do useful work, but my anecdotal impression is that people who choose what problems to work on based on a model of what problems would be important to solve are still noticeably more effective.

Here's my summary of my position here:

  • I think that being goal directed is very helpful to making progress on problems on a week-by-week or month-by-month scale.
  • I think that within most fields, some directions are much more promising than others, and backchaining is required in order to work on the promising directions. AI safety is a field like this. Math is another--if I decided to try to do good by going into math, I'd end up doing research which was really different from what normal mathematicians do. I agree with Paul Christiano's old post about this.
  • If I wanted to maximize my probability of solving the Riemann hypothesis, I'd probably try to pursue some crazy plan involving weird strengths of mine and my impression of blind spots of the field. However, I don't think this is actually that relevant, because I think that the important work in AI safety (and most other fields of relevance to EA) is less competitive than solving the Riemann hypothesis, and also a less challenging mathematical problem.
  • I think that in my experience, people who do the best work on AI safety generally have a clear end-to-end picture of the story for what work they need to do, and people who don't have such a clear picture rarely do work I'm very excited about. Eg I think Nate Soares and Paul Christiano are both really good AI safety researchers, and both choose their research directions very carefully based on their sense of what problems are important to solve.

Sometimes I talk to people who are skeptical of EA because they have a stronger version of the position you're presenting here--they think that nothing useful ever comes of people intentionally pursuing research that they think is important, and the right strategy is to pursue what you're most interested in.

One way of thinking about this is to imagine that there are different problems in a field, and different researchers have different comparative advantages at the problems. In one extreme case, the problems vary wildly in importance, and so the comparative advantage basically doesn't matter and you should work on what's most important. In the other extreme, it's really hard to get a sense of which things are likely to be more useful than other things, and your choices should be dominated by comparative advantage.

(Incidentally, you could also apply this to the more general problem of deciding what to work on as an EA. My personal sense is that the differences in values between different cause areas are big enough to basically dwarf comparative advantage arguments, but within a cause area comparative advantage is the dominant consideration.)

I would love to see a high quality investigation of historical examples here.

Comment by buck on Some thoughts on deference and inside-view models · 2020-05-29T02:15:17.892Z · score: 10 (3 votes) · EA · GW

I mean on average; obviously you're right that our opinions are correlated. Do you think there's anything important about this correlation?

Comment by buck on The best places to donate for COVID-19 · 2020-03-28T02:11:06.977Z · score: 18 (7 votes) · EA · GW

You say that the impact/scale of COVID is "huge". I think this might mislead people who are used to thinking about the problems EAs think about. Here's why.

I think COVID is probably going to cause on the order of 100 million DALYs this year, based on predictions like this; I think that 50-95% of the damage ever done by COVID will be done this year. On the scale that 80000 Hours uses to assess the scale of problems, this would be ranked as importance level 11 or so.

I think this is lower than most things EAs consider working on or funding. For example:

This is a logarithmic scale, so for example, according to this scale, health in poor countries is 100 times more important than COVID.

So given that COVID seems likely to be between 100x and 10000x less important than the main other cause areas EAs think about, I think it's misleading to describe its scale as "huge".
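
To make the logarithmic-scale arithmetic above concrete, here is a minimal sketch of how a gap in scores turns into a ratio of importance. The function name `importance_ratio` and the parameter `points_per_tenfold` are hypothetical, and the default of 2 points per factor of 10 is an assumption made for illustration; the step size is not stated above and should be checked against the scale's own definition.

```python
def importance_ratio(score_gap: float, points_per_tenfold: float = 2.0) -> float:
    """Ratio of importance implied by a gap in scores on a logarithmic scale.

    ASSUMPTION (not stated in the comment above): every `points_per_tenfold`
    points on the scale corresponds to a factor of 10 in importance.
    """
    return 10 ** (score_gap / points_per_tenfold)


# Under the assumed step size, these score gaps reproduce the "100x" and
# "10000x" figures used above; the gaps themselves are illustrative only.
for gap in (4, 8):
    print(f"gap of {gap} points -> {importance_ratio(gap):,.0f}x more important")
```

The point of the sketch is only that small differences in score on such a scale correspond to very large differences in importance, which is why a score of around 11 can sit orders of magnitude below other cause areas.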

Comment by buck on What are the key ongoing debates in EA? · 2020-03-10T04:13:21.733Z · score: 14 (6 votes) · EA · GW

I'm interested in betting about whether 20% of EAs think psychedelics are a plausible top EA cause area. Eg we could sample 20 EAs from some group and ask them. Perhaps we could ask random attendees from last year's EAG. Or we could do a poll in EA Hangout.

Comment by buck on On Becoming World-Class · 2020-02-26T05:05:38.619Z · score: 9 (8 votes) · EA · GW

I think that it's important for EA to have a space where we can communicate efficiently, rather than phrase everything for the benefit of newcomers who might be reading, so I think that this is bad advice.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T18:08:56.430Z · score: 6 (4 votes) · EA · GW

I'd prefer something like the weaker and less clear statement "we **can** think ahead, and it's potentially valuable to do so even given the fact that people might try to figure this all out later".

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T16:25:34.040Z · score: 2 (1 votes) · EA · GW

I think your summary of crux three is slightly wrong: I didn’t say that we need to think about it ahead of time, I just said that we can.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-25T16:24:18.068Z · score: 7 (5 votes) · EA · GW

Yeah, for the record I also think those are pretty plausible and important sources of impact for AI safety research.

I think that either way, it’s useful for people to think about which of these paths to impact they’re going for with their research.

Comment by buck on Max_Daniel's Shortform · 2020-02-23T01:00:51.794Z · score: 14 (7 votes) · EA · GW
My guess is I consider the activities you mentioned less valuable than you do. Probably the difference is largest for programming at MIRI and smallest for Hubinger-style AI safety research. (This would probably be a bigger discussion.)

I don't think that peculiarities of what kinds of EA work we're most enthusiastic about lead to much of the disagreement. When I imagine myself taking on various different people's views about what work would be most helpful, most of the time I end up thinking that valuable contributions could be made to that work by sufficiently talented undergrads.

Independent of this, my guess would be that EA does have a decent number of unidentified people who would be about as good as people you've identified. E.g., I can think of ~5 people off the top of my head of whom I think they might be great at one of the things you listed, and if I had your view on their value I'd probably think they should stop doing what they're doing now and switch to trying one of these things. And I suspect if I thought hard about it, I could come up with 5-10 more people - and then there is the large number of people neither of us has any information about.

I am pretty skeptical of this. Eg I suspect that people like Evan (sorry Evan if you're reading this for using you as a running example) are extremely unlikely to remain unidentified, because one of the things that they do is think about things in their own time and put the results online. Could you name a profile of such a person, and which of the types of work I named you think they'd maybe be as good at as the people I named?

It might be quite relevant if "great people" refers only to talent or also to beliefs and values/preferences

I am not intending to include beliefs and preferences in my definition of "great person", except for preferences/beliefs like being not very altruistic, which I do count.

E.g. my guess is that there are several people who could be great at functional programming who either don't want to work for MIRI, or don't believe that this would be valuable. (This includes e.g. myself.)

I think my definition of great might be a higher bar than yours, based on the proportion of people who I think meet it? (To be clear I have no idea how good you'd be at programming for MIRI because I barely know you, and so I'm just talking about priors rather than specific guesses about you.)


For what it's worth, I think that you're not credulous enough of the possibility that the person you talked to actually disagreed with you--I think you might be doing that thing whose name I forget where you steelman someone into saying the thing you think instead of the thing they think.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-21T05:55:30.655Z · score: 7 (5 votes) · EA · GW
For the problems-that-solve-themselves arguments, I feel like your examples have very "good" qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems become obvious, you can still solve them. I would argue that not all these properties hold for AGI. What are your thoughts about that?

I agree that it's an important question whether AGI has the right qualities to "solve itself". To go through the ones you named:

  • "Personal and economic incentives are aligned against them"--I think AI safety has somewhat good properties here. Basically no-one wants to kill everyone, and AI systems that aren't aligned with their users are much less useful. On the other hand, it might be the case that people are strongly incentivised to be reckless and deploy things quickly.
  • "they are obvious when one is confronted with the situation"--I think that alignment problems might be fairly obvious, especially if there's a long process of continuous AI progress where unaligned non-superintelligent AI systems do non-catastrophic damage. So this comes down to questions about how rapid AI progress will be.
  • "at the point where the problems become obvious, you can still solve them"--If the problems become obvious because non-superintelligent AI systems are behaving badly, then we can still maybe put more effort into aligning increasingly powerful AI systems after that and hopefully we won't lose that much of the value of the future.

Comment by buck on Max_Daniel's Shortform · 2020-02-21T05:36:33.990Z · score: 21 (7 votes) · EA · GW

I'm not quite sure how high your bar is for "experience", but many of the tasks that I'm most enthusiastic about in EA are ones which could plausibly be done by someone in their early 20s who eg just graduated university. Various tasks of this type:

  • Work at MIRI on various programming tasks which require being really smart and good at math and programming and able to work with type theory and Haskell. Eg we recently hired Seraphina Nix to do this right out of college. There are other people who are recent college graduates who we offered this job to who didn't accept. These people are unusually good programmers for their age, but they're not unique. I'm more enthusiastic about hiring older and more experienced people, but that's not a hard requirement. We could probably hire several more of these people before we became bottlenecked on management capacity.
  • Generalist AI safety research that Evan Hubinger does--he led the writing of "Risks from Learned Optimization" during a summer internship at MIRI; before that internship he hadn't had much contact with the AI safety community in person (though he'd read stuff online).
    • Richard Ngo is another young AI safety researcher doing lots of great self-directed stuff; I don't think he consumed an enormous amount of outside resources while becoming good at thinking about this stuff.
  • I think that there are inexperienced people who could do really helpful work with me on EA movement building; to be good at this you need to have read a lot about EA and be friendly and know how to talk to lots of people.

My guess is that EA does not have a lot of unidentified people who are as good at these things as the people I've identified.

I think that the "EA doesn't have enough great people" problem feels more important to me than the "EA has trouble using the people we have" problem.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-20T17:25:14.538Z · score: 5 (3 votes) · EA · GW
One underlying hypothesis that was not explicitly pointed out, I think, was that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (it might be so obvious in an EA meeting or on the EA Forum that it's not worth exploring, but I like making the obvious hypotheses explicit).

This is a good point.

Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist.

I feel pretty unsure on this point; for a contradictory perspective you might enjoy this article.

Comment by buck on Do impact certificates help if you're not sure your work is effective? · 2020-02-14T06:47:46.405Z · score: 6 (3 votes) · EA · GW

[for context, I've talked to Eli about this in person]

I'm interpreting you as having two concerns here.

Firstly, you're asking why this is different than you deferring to people about the impact of the two orgs.

From my perspective, the nice thing about the impact certificate setup is that if you get paid in org B impact certificates, you're making the people at orgs A and B put their money where their mouth is. Analogously, suppose Google is trying to hire me, but I'm actually unsure about Google's long term profitability, and I'd rather be paid in Facebook stock than Google stock. If Google pays me in Facebook stock, I'm not deferring to them about the relative values of these stocks, I'm just getting paid in Facebook stock, such that if Google is overvalued it's no longer my problem, it's the problem of whoever traded their Facebook stock for Google stock.

The reason why I think that the policy of maximizing impact certificates is better for the world in this case is that I think that people are more likely to give careful answers to the question "how relatively valuable is the work orgs A and B are doing" if they're thinking about it in terms of trying to make trades than if some random EA is asking for their quick advice.


Secondly, you're worrying that people might end up seeming like they're endorsing an org that they don't endorse, and that this might harm community epistemics. This is an interesting objection that I haven't thought much about. A few possible responses:

  • It's already currently an issue that people have different amounts of optimism about their workplaces, and people don't very often publicly state how much they agree and disagree with their employer (though I personally try to be clear about this). It's unlikely that impact equity trades will exacerbate this problem much.
  • Also, people often work at places for reasons that aren't "I think this is literally the best org", eg:
    • comparative advantage
    • thinking that the job is fun
    • the job paying them a high salary (this is exactly analogous to them paying in impact equity of a different org)
    • thinking that the job will give you useful experience
    • random fluke of who happened to offer you a job at a particular point
    • thinking the org is particularly flawed and so you can do unusual amounts of good by pushing it in a good direction
  • Also, if there were liquid markets in the impact equity of different orgs, then we'd have access to much higher-quality information about the community's guess about the relative promisingness of different orgs. So pushing in this direction would probably be overall helpful.

Comment by buck on My personal cruxes for working on AI safety · 2020-02-13T19:27:26.015Z · score: 16 (10 votes) · EA · GW
This was nice to read, because I'm not sure I've ever seen anyone actually admit this before.

Not everyone agrees with me on this point. Many safety researchers think that their path to impact is by establishing a strong research community around safety, which seems more plausible as a mechanism to affect the world 50 years out than the "my work is actually relevant" plan. (And partially for this reason, these people tend to do different research to me.)

You say you think there's a 70% chance of AGI in the next 50 years. How low would that probability have to be before you'd say, "Okay, we've got a reasonable number of people to work on this risk, we don't really need to recruit new people into AI safety"?

I don't know how large the AI safety field would have to be for marginal effort to be better spent elsewhere. Presumably this is a continuous thing rather than a discrete thing. Eg it seems to me that now compared to five years ago, there are way more people in AI safety and so if your comparative advantage is in some other way of positively influencing the future, you should more strongly consider that other thing.

Comment by buck on Thoughts on doing good through non-standard EA career pathways · 2020-01-09T08:05:54.014Z · score: 6 (4 votes) · EA · GW
What do you think about participating in a forecasting platform, e.g. Good Judgement Open or Metaculus? It seems to cover all ingredients, and even be a good signal for others to evaluate your judgement quality.

Seems pretty good for predicting things about the world that get resolved on short timescales. Sadly it seems less helpful for practicing judgement about things like the following:

  • judging arguments about things like the moral importance of wild animal suffering, plausibility of AI existential risk, and existence of mental illness
  • long-term predictions
  • predictions about small-scale things like how a project should be organized (though you can train calibration on this kind of question)
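As a rough illustration of what training calibration on small-scale questions could look like, here is a minimal sketch that scores a personal log of resolved predictions with the Brier score (mean squared error between stated probability and outcome; lower is better, and always answering 50% scores 0.25). The function name and the example log are hypothetical, made up for this sketch, and not taken from any forecasting platform.

```python
def brier_score(forecasts):
    """Return the mean squared gap between stated probabilities and outcomes.

    `forecasts` is a list of (probability, happened) pairs, where
    `probability` is the credence you wrote down in advance and
    `happened` is True or False once the question resolved.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - float(happened)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


# Hypothetical log of project-level predictions (assumed data for illustration),
# e.g. "this writeup will be done by Friday" at 90%, which did happen.
example_log = [(0.9, True), (0.7, False), (0.6, True), (0.2, False)]
print(brier_score(example_log))
```

Tracking this number over a few dozen resolved predictions gives some feedback on calibration, though it says nothing about the harder-to-score kinds of judgement listed above.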

Re my own judgement: I appreciate your confidence in me. I spend a lot of time talking to people who have IMO better judgement than me; most of the things I say in this post (and a reasonable chunk of things I say in other places) are my rephrasings of their ideas. I think that people whose judgement I trust would agree with my assessment of my judgement quality as "good in some ways" (this was the assessment of one person I asked about this in response to your comment).

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:24:25.466Z · score: 10 (4 votes) · EA · GW
It seems that your current strategy is to focus on training, hiring and reaching out to the most promising talented individuals.

This seems like a pretty good summary of the strategy I work on, and it's the strategy that I'm most optimistic about.

Other alternatives might include more engagement with amateurs, and providing more assistance for groups and individuals that want to learn and conduct independent research.

I think that it would be quite costly and difficult for more experienced AI safety researchers to try to cause more good research to happen by engaging more with amateurs or providing more assistance to independent research. So I think that experienced AI safety researchers are probably going to do more good by spending more time on their own research than by trying to help other people with theirs. This is because I think that experienced and skilled AI safety researchers are much more productive than other people, and because I think that a reasonably large number of very talented math/CS people become interested in AI safety every year, so we can set a pretty high bar for which people to spend a lot of time with.

Also, what would change if you had 10 times the amount of management and mentorship capacity?

If I had ten times as many copies of various top AI safety researchers and I could only use them for management and mentorship capacity, I'd try to get them to talk to many more AI safety researchers, through things like weekly hour-long calls with PhD students, or running more workshops like MSFP.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:16:19.716Z · score: 18 (7 votes) · EA · GW
I’m a fairly good ML student who wants to decide on a research direction for AI Safety.

I'm not actually sure whether I think it's a good idea for ML students to try to work on AI safety. I am pretty skeptical of most of the research done by pretty good ML students who try to make their research relevant to AI safety--it usually feels to me like their work ends up not contributing to one of the core difficulties, and I think that they might have been better off if they'd instead spent their effort trying to become really good at ML, so that they'd be better skilled up for working on AI safety later.

I don't have very much better advice for how to get started on AI safety; I think the "recommend to apply to AIRCS and point at 80K and maybe the Alignment Newsletter" path is pretty reasonable.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:12:48.818Z · score: 11 (5 votes) · EA · GW

It was a good time; I appreciate all the thoughtful questions.

Comment by buck on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-22T05:08:39.326Z · score: 12 (6 votes) · EA · GW

Most of them are related to AI alignment problems, but it's possible that I should work specifically on them rather than other parts of AI alignment.