New series of posts answering one of Holden's "Important, actionable research questions" 2022-05-12T21:22:33.705Z
Action: Help expand funding for AI Safety by coordinating on NSF response 2022-01-20T20:48:24.534Z
People in bunkers, "sardines" and why biorisks may be overrated as a global priority 2021-10-23T00:19:13.392Z
Evan R. Murphy's Shortform 2021-10-22T00:32:33.528Z


Comment by Evan R. Murphy on Important, actionable research questions for the most important century · 2022-05-12T22:37:47.602Z · EA · GW

My first 2 posts for this project went live on the Alignment Forum today:

1. Introduction to the sequence: Interpretability Research for the Most Important Century
2. (main post) Interpretability’s Alignment-Solving Potential: Analysis of 7 Scenarios

Comment by Evan R. Murphy on EA is more than longtermism · 2022-05-05T02:09:39.793Z · EA · GW

I learned a lot from reading this post and some of the top comments, thanks for the useful analysis.

Throughout the post and comments people are tending to classify AI safety as a "longtermist" cause. This isn't wrong, but for anyone less familiar with the topic, I just want to point out that there are many of us who work in the field and consider AI to be a near-to-medium term existential risk.

Just in case "longtermism" gave anyone the wrong impression that AI x-risk is something we definitely won't be confronted with for 100+ years. Many of us think it will be much sooner than that (though there is still considerable uncertainty and disagreement about timelines).

See the related post "Long-Termism" vs. "Existential Risk" by Scott Alexander.

Comment by Evan R. Murphy on aogara's Shortform · 2022-04-08T21:50:10.399Z · EA · GW

You're right, that paragraph was confusing. I just edited it to try to make it clearer.

Comment by Evan R. Murphy on aogara's Shortform · 2022-04-08T20:57:57.450Z · EA · GW

These are thoughtful data points, but consider that they may just be good evidence for hard takeoff rather than soft takeoff.

What I mean is that most of these examples show a failure of narrow AIs to deliver on some economic goals. In soft takeoff, we expect to see things like broad deployment of AIs contributing to massive economic gains and GDP doublings in short periods of time well before we get to anything like AGI.

But in hard takeoff, failure to see massive success from narrow AIs could happen due to regulations and other barriers (or it could just be limitations of the narrow AI). In fact, these limitations could even point more forcefully to the massive benefits of an AI that can generalize. And having the recipe for that AGI discovered and deployed in a lab doesn't depend on the success of prior narrow AIs in the regulated marketplace. AGI is a different breed and may also become powerful enough that it doesn't have to play by the rules of the regulated marketplace and national legal systems.

Machines will need to learn in open-ended play with the world, where today they mostly learn from labeled examples. 

Have you seen DeepMind's Generally capable agents emerge from open-ended play? I think it is a powerful demonstration of learning from open-ended play actually working in a lab (not just a possible future approach). Though it is still in a virtual environment rather than the real physical world.

Comment by Evan R. Murphy on Why don't governments seem to mind that companies are explicitly trying to make AGIs? · 2022-04-08T20:10:21.710Z · EA · GW

I think very, very few people really believe that superintelligence systems will be that influential.


A lot of prominent scientists, technologists and intellectuals outside of EA have warned about advanced artificial intelligence too. Stephen Hawking, Elon Musk, Bill Gates, Sam Harris, everyone on this open letter back in 2015 etc.

I agree that the number of people really concerned about this is strikingly small given the emphasis longtermist EAs put on it. But I think these many counter-examples warn us that it's not just EAs and the AGI labs being overconfident or out of left field. 

Comment by Evan R. Murphy on Important, actionable research questions for the most important century · 2022-03-15T21:46:55.857Z · EA · GW

What relatively well-scoped research activities are particularly likely to be useful for longtermism-oriented AI alignment?
(3)    Activity that is likely to be relevant for the hardest and most important parts of the problem, while also being the sort of thing that researchers can get up to speed on and contribute to relatively straightforwardly (without having to take on an unusual worldview, match other researchers’ unarticulated intuitions to too great a degree, etc.)


I'm planning to spend some time working on this question, or rather part of it. In particular I'm going to explore the argument that interpretability research falls into this category, with some attention to which specific aspects or angles of interpretability research seem most useful.

Since I don't plan to spend much time thoroughly examining other research directions besides interpretability, I don't expect to have a complete comparative answer to the question. But by answering the question for interpretability, I hope to at least put together a fairly comprehensive argument for (or perhaps against, we'll see after I look at the evidence!) interpretability research that could be used by those considering it as a target for their funding or their time. I also hope that then someone trying to answer the larger question could use my work on interpretability as part of a comparative analysis across different research activities.

If someone is already working on this particular question and I'm duplicating effort, please let me know and perhaps we can sync up. Otherwise, I hope to have something to show on this question in a few/several weeks!

Comment by Evan R. Murphy on What are effective ways to help Ukrainians right now? · 2022-02-28T04:28:05.892Z · EA · GW

Why is this being severely downvoted? Is it because it's a scam or something, or because Red Cross is just not considered an effective charity?

Comment by Evan R. Murphy on Some thoughts on vegetarianism and veganism · 2022-02-19T06:41:23.384Z · EA · GW

Fourthly, I am kinda worried about health effects, especially on short-to-medium-term energy levels.

I've been mostly (~98%) vegan since 2013. This concern really surprised me, because the health benefits of eating plant based are clearly extraordinary in my view. It was actually the primary driver for me going from vegetarian to vegan, with concern for the farmed animals growing later.

I would say that the very short term effects (1-2 weeks) can be disruptive as the bacteria in your gut become re-selected from bacteria that feed on meat or dairy to ones that feed on fibrous foods - veggies, fruit, legumes, nuts etc. But the 2-weeks-to-rest-of-life health benefits are massive. They range from higher energy levels to prevention of many kinds of diseases, greatly lowered risk of heart disease, lower cholesterol levels, lower blood pressure, easier weight loss/maintenance... the list goes on. It does require actually eating more cruciferous and other veg, berries and other fruits, legumes, spices etc., though, and not just being a junk-food vegan who eats cookies and crackers all the time. And you do need to supplement with vitamin B12, because this essential nutrient gets stripped from the fruits and veg when we wash them (and it is a good idea to still wash them).

Highly recommend exploring Dr. Michael Greger's work if anyone is curious about health questions around a plant-based/vegan diet. He's an evidence-based/altruistic M.D. nutritionist and does great work separating out the quality nutrition research from studies with methodological errors, bias from funding sources etc. (I am in no way affiliated, and his work as I understand it is not for profit.)

Comment by Evan R. Murphy on Action: Help expand funding for AI Safety by coordinating on NSF response · 2022-01-25T20:31:36.610Z · EA · GW

Habryka, I appreciate you sharing your outputs. Do you have a few minutes to follow up with a little explanation of your models yet? It's ok if it's a rough/incomplete explanation. But it would help to know a bit more about what you've seen with government-funded research etc.  that makes you think this would be net-negative for the world.

Comment by Evan R. Murphy on Action: Help expand funding for AI Safety by coordinating on NSF response · 2022-01-22T01:21:04.248Z · EA · GW

Judging by the voting and comments so far (both here as well as on the LessWrong crosspost), my sense is that many here support this effort, but some definitely have concerns. A few of the concerns are based on hardcore skepticism about academic research, which I'm not sure is compatible with responding to the RfI at all. Many concerns, though, seem to be about this generating vague NSF grants that are in the name of AI safety but don't actually contribute to the field.

For these latter concerns, I wonder if there is a way we could resolve them by limiting the scope of topics in our NSF responses or giving them enough specificity. For example, what if we convinced the NSF that all they should make grants for is mechanistic interpretability projects like the Circuits Thread? This is an area that most researchers in the alignment community seem to agree is useful; we just need a lot more people doing it to make substantial progress. And maybe there is less room to go adrift or mess up this kind of concrete, empirical research compared to some of the more theoretical research directions.

It doesn't have to be just mechanistic interpretability, but my point is, are there ways we could shape or constrain our responses to the NSF like this that would help address your concerns?

Comment by Evan R. Murphy on Action: Help expand funding for AI Safety by coordinating on NSF response · 2022-01-21T20:49:55.556Z · EA · GW

I agree that we need to take care with high-fidelity idea transmission, and there is some risk of diluting the field. But I think the reasonable chance of this spurring some more good research in AI safety is worth it, even if there will also be some wasted money.

One thing that's interesting in the RfI is that it links to something called THE NATIONAL ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT STRATEGIC PLAN: 2019 UPDATE. This PDF outlines a federal committee's strategic plan for dealing with AI. Strategy #4 is Ensure the Safety and Security of AI Systems and they are saying a lot of the "right things". For example, it includes discussion of emergent behavior, goal misspecification, explainability/transparency and long-term AI safety and value-alignment. Whether this will help translate into useful actions isn't certain, but it's heartening to see some acknowledgment of AI concerns from the US government besides just "develop it before China does".

As for the current funding situation in the EA/AI risk community, I have also heard about this issue of there being too much funding for not enough good researchers/applicants right now. I don't think we should get too used to this dynamic, though. The situation could easily reverse in a short time if awareness about AI risk causes a wave of new research interest, or if 80,000 Hours, AGI Safety Fundamentals Curriculum, AI Safety Camp and related programs are able to introduce more people into the field. So just because we have a funding glut now doesn't mean we should assume that will continue through 2023, which is the time period that this NSF RfI pertains to.

Comment by Evan R. Murphy on Why don't governments seem to mind that companies are explicitly trying to make AGIs? · 2021-12-30T19:59:37.793Z · EA · GW

There are two governance-related proposals in the second EA megaprojects thread. One is to create a really large EA-oriented think tank. The other is essentially EA lobbying, i.e. to put major funding behind political parties and candidates who agree to take EA concerns seriously.

Making one of these megaprojects a reality could get officials in governments to take AGI more seriously and/or get it more into the mainstream political discourse.

Comment by Evan R. Murphy on Why don't governments seem to mind that companies are explicitly trying to make AGIs? · 2021-12-30T19:51:37.724Z · EA · GW

Indeed, it doesn't have even a single mention of the term "AGI".

Comment by Evan R. Murphy on Why don't governments seem to mind that companies are explicitly trying to make AGIs? · 2021-12-30T19:28:45.293Z · EA · GW

Andrew Yang made transformative AI a fairly central part of his 2020 presidential campaign. To the OP's point though, I don't recall him raising any alarms about the existential risks of AGI.

Comment by Evan R. Murphy on Disagreeables and Assessors: Two Intellectual Archetypes · 2021-11-12T00:44:44.378Z · EA · GW

Albert Einstein also comes to mind as an agreeable generator. I haven't read his biography or anything, but based on the collage of stories I've heard about him, he never seemed like a very disagreeable person but obviously generated important new ideas.

Comment by Evan R. Murphy on Disagreeables and Assessors: Two Intellectual Archetypes · 2021-11-12T00:41:05.516Z · EA · GW

Dr. Greger also seems like an agreeable generator. Actually, he may be disagreeable in that he's not shy about pointing out flaws in studies and others' conceptions, but he does it in an enthusiastic, silly and not particularly abrasive way.

It's interesting that some people may still disagree often but not be doing it in a disagreeable manner.

Comment by Evan R. Murphy on People in bunkers, "sardines" and why biorisks may be overrated as a global priority · 2021-10-23T21:49:09.865Z · EA · GW

Thanks for sharing your expertise and in-depth reply!

Comment by Evan R. Murphy on Evan R. Murphy's Shortform · 2021-10-23T07:23:31.551Z · EA · GW

Thanks, Linch. I didn't realize I might be treading near information hazards. It's good to know and an interesting point about the pros and cons of having such conversations openly.  

Comment by Evan R. Murphy on Introducing High Impact Professionals · 2021-10-23T00:30:07.721Z · EA · GW

Love to see new efforts like this. One question/thought: how does HIP compare to or fit in with 80,000 Hours? Do you see it as filling a different niche/need, or the same one where more work was just needed?

Comment by Evan R. Murphy on Evan R. Murphy's Shortform · 2021-10-22T00:32:33.744Z · EA · GW

People in bunkers, "sardines" and why biorisks may be overrated as a global priority

I'm going to make the case here that certain problem areas currently prioritized highly in the longtermist EA community are overweighted in their importance/scale. In particular I'll focus on biorisks, but this could also apply to other risks such as non-nuclear global war and perhaps other areas as well.

I'll focus on biorisks because that is currently highly prioritized by both Open Philanthropy and 80,000 Hours and probably other EA groups as well. If I'm right that biotechnology risks should be deprioritized, that would relatively increase the priority of other issues like AI, growing Effective Altruism, global priorities research, nanotechnology risks and others by a significant amount. So it could help allocate more resources to those areas which still pose existential threats to humanity.

I won't be taking issue with the longtermist worldview here. In fact, I'll assume the longtermist worldview is correct. Rather, I'm questioning whether biorisks really pose a significant existential/extinction risk to humanity. I don't doubt that they could lead to major global catastrophes which it would be really good to avert. I just think that it's extremely unlikely for them to lead to total human extinction or permanent civilization collapse.

This started when I was reading about disaster shelters. Nick Beckstead has a paper considering whether they could be a useful avenue for mitigating existential risks [1]. He concludes there could be a couple of special scenarios where they are that need further research, but by and large new refuges don't seem like a great investment because there are already so many existing shelters and other things which could serve to protect people from many global catastrophes. Specifically, the world already has a lot of government bunkers, private shelters, people working on submarines, and 100-200 uncontacted peoples which are likely to produce survivors from certain otherwise devastating events. [1]

A highly lethal engineered pandemic is among the biggest risks considered from biotechnology. This could potentially wipe out billions of people and lead to a collapse of civilization. But it would almost certainly spare at least a few hundred or thousand people among those who have access to existing bunkers or other disaster shelters, people who are working on submarines, and the dozens of tribes and other peoples living in remote isolation. Repopulating the Earth and rebuilding civilization would not be fast or easy, but these survivors could probably do it over many generations.

So are humans immune, then, from all existential risks thanks to preppers, "sardines" [2] and uncontacted peoples? No. There are certain globally catastrophic events which would likely spare no one. A superintelligent malevolent AI could probably hunt everyone down. The feared nanotechnological "gray goo" scenario could wreck all matter on the planet. A nuclear war extreme enough that it contaminated all land on the planet with radioactivity - even though it would likely have immediate survivors - might create such a mess that no humans would last long-term. There are probably others as well.

I've gone out on a bit of a limb here to claim that biorisks aren't an existential risk. I'm not a biotech expert, so there could be some biorisks that I'm not aware of. For example, could there be some kind of engineered virus that contaminates all food sources on the planet? I don't know and would be interested to hear from folks about that. This could be similar to a long-lasting global nuclear fallout in that it would have immediate survivors but not long-term survivors.  However, mostly the biorisks I have seen people focus on seem to be lethal virulent engineered pandemics that target humans. As I've said, it seems unlikely this would kill all the humans in bunkers/shelters, submarines and on remote parts of the planet.

Even if there is some kind of lesser-known biotech risk which could be existential, my bottom-line claim is that there seems to be an important line between real existential risks that would annihilate all humans and near-existential risks that would spare some people in disaster shelters and shelter-like situations. I haven't seen this line discussed much and I think it could help with better prioritizing global problem areas for the EA community.


[1]: "How much could refuges help us recover from a global catastrophe?"

[2]: I just learned that sailors use this term for submariners which is pretty fun.

Comment by Evan R. Murphy on Update on Cause Prioritization at Open Philanthropy · 2021-10-19T00:13:00.399Z · EA · GW
  • We may try to create something similar to what GiveWell uses for its cost-effectiveness analysis: a spreadsheet where different people can fill in their values for key parameters (such as relative credence in different worldviews, and which ones they think should benefit from various fairness agreements), with explanations and links to writeups with more detail and argumentation for each parameter, and basic analytics on the distribution of inputs (for example, what the median allocation is to each worldview, across all staff members).


This would be very helpful. I'm having trouble even finding a sheet that prioritizes causes using a static worldview, i.e. one that lists causes with scores for Importance, Neglectedness and Tractability/Solvability and has notes to explain.

Comment by Evan R. Murphy on The Duplicator: Instant Cloning Would Make the World Economy Explode · 2021-10-16T00:41:49.979Z · EA · GW

I'm not familiar with predictions about this Duplicator technology. Is this a real prospective technology that would duplicate people's minds and bodies, or is it just their minds? Or is the actual predicted technology you're concerned about really digital people or advanced AI and it was just easier to explain the ideas with this similar fictional Duplicator technology?