Posts

What are your main reservations about identifying as an effective altruist? 2021-03-30T09:55:03.249Z
Some thoughts on risks from narrow, non-agentic AI 2021-01-19T00:07:23.075Z
My evaluations of different domains of Effective Altruism 2021-01-15T23:15:17.010Z
Clarifying the core of Effective Altruism 2021-01-15T23:02:27.500Z
Lessons from my time in Effective Altruism 2021-01-15T21:54:54.565Z
Scope-sensitive ethics: capturing the core intuition motivating utilitarianism 2021-01-15T16:22:14.094Z
What foundational science would help produce clean meat? 2020-11-13T13:55:27.566Z
AGI safety from first principles 2020-10-21T17:42:53.460Z
EA reading list: utilitarianism and consciousness 2020-08-07T19:32:02.050Z
EA reading list: other reading lists 2020-08-04T14:56:28.422Z
EA reading list: miscellaneous 2020-08-04T14:42:44.119Z
EA reading list: futurism and transhumanism 2020-08-04T14:29:52.883Z
EA reading list: Paul Christiano 2020-08-04T13:36:51.331Z
EA reading list: global development and mental health 2020-08-03T11:53:10.890Z
EA reading list: Scott Alexander 2020-08-03T11:46:17.315Z
EA reading list: replaceability and discounting 2020-08-03T10:10:54.968Z
EA reading list: longtermism and existential risks 2020-08-03T09:52:41.256Z
EA reading list: suffering-focused ethics 2020-08-03T09:40:38.142Z
EA reading list: EA motivations and psychology 2020-08-03T09:24:07.430Z
EA reading list: cluelessness and epistemic modesty 2020-08-03T09:23:44.124Z
EA reading list: population ethics, infinite ethics, anthropic ethics 2020-08-03T09:22:15.461Z
EA reading list: moral uncertainty, moral cooperation, and values spreading 2020-08-03T09:21:36.288Z
richard_ngo's Shortform 2020-06-13T10:46:26.847Z
What are the key ongoing debates in EA? 2020-03-08T16:12:34.683Z
Characterising utopia 2020-01-02T00:24:23.248Z
Technical AGI safety research outside AI 2019-10-18T15:02:20.718Z
Does any thorough discussion of moral parliaments exist? 2019-09-06T15:33:02.478Z
How much EA analysis of AI safety as a cause area exists? 2019-09-06T11:15:48.665Z
How do most utilitarians feel about "replacement" thought experiments? 2019-09-06T11:14:20.764Z
Why has poverty worldwide fallen so little in recent decades outside China? 2019-08-07T22:24:11.239Z
Which scientific discovery was most ahead of its time? 2019-05-16T12:28:54.437Z
Why doesn't the EA forum have curated posts or sequences? 2019-03-21T13:52:58.807Z
The career and the community 2019-03-21T12:35:23.073Z
Arguments for moral indefinability 2019-02-08T11:09:25.547Z
Disentangling arguments for the importance of AI safety 2019-01-23T14:58:27.881Z
How democracy ends: a review and reevaluation 2018-11-24T17:41:53.594Z
Some cruxes on impactful alternatives to AI policy work 2018-11-22T13:43:40.684Z

Comments

Comment by richard_ngo on AGI safety from first principles · 2021-06-27T23:01:24.628Z · EA · GW

Ah, I like the multiagent example. So to summarise: I agree that we have some intuitive notion of what cognitive processes we think of as intelligent, and it would be useful to have a definition of intelligence phrased in terms of those. I also agree that Legg's behavioural definition might diverge from our implicit cognitive definition in non-trivial ways.

I guess the reason why I've been pushing back on your point is that I think that possible divergences between the two aren't the main thing going on here. Even if it turned out that the behavioural definition and the cognitive definition ranked all possible agents the same, I think the latter would be much more insightful and much more valuable for helping us think about AGI.

But this is probably not an important disagreement.

Comment by richard_ngo on AGI safety from first principles · 2021-06-27T01:12:39.945Z · EA · GW

Ah, I see. I thought you meant "situations" as in "individual environments", but it seems like you meant "situations" as in "possible ways that all environments could be".

In that case, I think you're right, but I don't consider it a problem. Why might it be the case that adding more compute, or more memory, or something like that, would be net negative across all environments? It seems like either we'd have to define the set of environments in a very gerrymandered way, or else there's something about the change we made that lands us in a valley of bad thinking. In the former case, we should use a wider set of environments; in the latter case, it seems easier to bite the bullet and say "Yeah, turns out that adding more of this usually-valuable trait makes agents less intelligent."

Comment by richard_ngo on AGI safety from first principles · 2021-06-26T05:03:26.113Z · EA · GW

One thing I'm confused about is whether Legg's definition (or your rephrasing) allows for situations where it's in principle possible that being smarter is ex ante worse for an agent (obviously ex post it's possible to follow the correct decision procedure and be unlucky).

There definitely are such cases - e.g. Omega penalises all smart agents. Or environments where there are several crucial considerations which you're able to identify at different levels of intelligence, so that your success rises and falls as your intelligence increases.

But in general I agree with your complaint about Legg's definition being phrased in behavioural terms, and how it'd be better to have a good definition of intelligence in terms of the cognitive processes involved (e.g. planning, abstraction, etc). I do think that starting off in behaviourist terms was a good move, back when people were much more allergic to talking about AGI/superintelligence. But now that we're past that point, I think we can do better. (I don't think I've written about this yet in much detail, but it's quite high on my list of priorities.)

Comment by richard_ngo on AGI safety from first principles · 2021-06-25T21:28:40.548Z · EA · GW

I intended mine to be a slight rephrasing of Legg and Hutter's definition to make it more accessible to people without RL backgrounds. One thing that's not obvious from the way they use "environments" is that the goal is actually built into the environment via a reward function, so describing each environment as a "task" seems accurate.
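
(For concreteness, my recollection of their formal measure is roughly

Υ(π) := Σ_{μ ∈ E} 2^(-K(μ)) · V_μ(π)

where E is a set of computable environments, K(μ) is the Kolmogorov complexity of μ - whose reward function encodes the goal - and V_μ(π) is the expected cumulative reward that policy π obtains in μ. I may be misremembering some details.)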

A second non-obvious thing is that the body the agent uses is also defined as part of the environment, so that the agent only performs the abstract task of sending instructions to that body. A naive reading of Legg and Hutter's definition would interpret a stronger agent as being more intelligent. Adding "cognitive" I think rules this out, while also remaining true to the spirit of the original definition.

Curious if you still disagree, and if so why - I don't really see what you're pointing at with the Raven's Matrices example.

Comment by richard_ngo on On the limits of idealized values · 2021-06-22T10:44:24.554Z · EA · GW

Fantastic post. A few scattered thoughts inspired by it:

If you aren’t trying to conform to some standard, then how can you truly, and non-arbitrarily, choose?

Why does our choice need to be non-arbitrary? If we take certain intuitions/desires/instincts as primitives, they may be fundamentally arbitrary, but that's because we are unavoidably arbitrary. Yet this arbitrary initial state is all we have to work from.

What’s needed, here, is a type of choice that is creating, rather than trying to conform — and which hence, in a sense, is “infallible.”

It feels like infallible is the wrong type of description here, for the same reason that it would be odd to say that my taste in food is infallible. At a certain level the predicate "correct" will stop making sense. (Maybe that level isn't the level of choices, though; maybe it's instincts, or desires, or intuitions, or tastes - things that we don't see ourselves as having control over.)

Comment by richard_ngo on richard_ngo's Shortform · 2021-05-20T13:21:43.514Z · EA · GW

There's an old EA forum post called Effective Altruism is a question (not an ideology) by Helen Toner, which I think has been pretty influential.*

But I was recently thinking about how the post rings false for me personally. I know that many people in EA are strongly motivated by the idea of doing the most good. But I was personally first attracted to an underlying worldview composed of stories about humanity's origins, the rapid progress we've made, the potential for the world to be much better, and the power of individuals to contribute to that; from there, given potentially astronomical stakes, altruism is a natural corollary.

I think that leaders in EA organisations are more likely to belong to the former category, of people inspired by EA as a question. But as I discussed in this post, there can be a tradeoff between interest in EA itself versus interest in the things EA deems important. Personally I prioritise making others care about the worldview more than making them care about the question: caring about the question pushes you to do the right thing in the abstract, but caring about the worldview seems better at pushing you towards its most productive frontiers. This seems analogous to how the best scientists are more obsessed with the thing they're studying than the downstream effects of their research.

Anyway, take all this with a grain of salt; it's not a particularly firm opinion, just one personal perspective. But one longstanding EA I was talking to recently found it surprising, so I thought it'd be worth sharing in case others do too. 


* As one datapoint: since the EA forum has been getting more users over time, a given karma score is more impressive the older a post is. Helen's post is twice as old as any other post with comparable or higher karma, making it a strong outlier.

Comment by richard_ngo on Why should we *not* put effort into AI safety research? · 2021-05-16T15:14:18.593Z · EA · GW

Drexler's CAIS framework attacks several of the premises underlying standard AI risk arguments (although iirc he also argues that CAIS-specific safety work would be valuable). Since his original report is rather long, here are two summaries.

Comment by richard_ngo on Should you do a PhD in science? · 2021-05-09T10:02:53.141Z · EA · GW

I suspect 1/3 is a significant overestimate since US universities attract people who did their PhDs all across the world.

Comment by richard_ngo on Why AI is Harder Than We Think - Melanie Mitchell · 2021-04-28T11:15:19.902Z · EA · GW

I was pleasantly surprised by this paper (given how much dross has been written on this topic). My thoughts on the four fallacies Mitchell identifies:

Fallacy 1: Narrow intelligence is on a continuum with general intelligence

This is hard to evaluate, since Mitchell only discusses it very briefly. I do think that people underestimate the gap between solving tasks with near-infinite data (like Starcraft) and solving low-data tasks. But saying that GPT-3 isn't a step towards general intelligence also seems misguided, given the importance of few-shot learning.

Fallacy 2: Easy things are easy and hard things are hard

I agree that Moravec's paradox is important and underrated. But this also cuts the other way: if chess and Go were easy, then we should be open to the possibility that maths and physics are too.

Fallacy 3: The lure of wishful mnemonics

This is true and important. My favourite example is artificial planning. Tree search algorithms are radically different from human planning, which operates over abstractions. Yet this is hard to see because we use the same word for both.

Fallacy 4: Intelligence is all in the brain

This is the one I disagree with most, because "embodied cognition" is a very slippery concept. What does it mean? "The representation of conceptual knowledge is ... multimodal" - okay, but CLIP is multimodal.

"Thoughts are inextricably associated with perception, action, and emotion." Okay, but RL agents have perceptions and actions. And even if the body plays a crucial role in human emotions, it's a big leap to claim that disembodied agents therefore can't develop emotions.

Under this fallacy, Mitchell also discusses AI safety arguments by Bostrom and Russell. I agree that early characterisations of AIs as "purely rational" were misguided. Mitchell argues that AIs will likely also have emotions, cultural biases, a strong sense of selfhood and autonomy, and a commonsense understanding of the world. This seems plausible! But note that none of these directly solves the problem of misaligned goals. Sociopaths have all these traits, but we wouldn't want them to have superhuman intelligence.

This does raise the question: can early arguments for AI risk be reformulated to rely less on this "purely rational" characterisation? I think so - in fact, that's what I tried to do in this report.

Comment by richard_ngo on Some quick notes on "effective altruism" · 2021-03-30T09:48:04.405Z · EA · GW

Well, my default opinion is that we should keep things as they are;  I don't find the arguments against "effective altruism" particularly persuasive, and name changes at this scale are pretty costly.

Insofar as people want to keep their identities small, there are already a bunch of other terms they can use - like longtermist, or environmentalist, or animal rights advocate. So it seems like the point of having a term like EA on top of that is to identify a community. And saying "I'm part of the effective altruism community" softens the term a bit.

around half of the participants (including key figures in EA) said that they don’t self-identify as "effective altruists"

This seems like the most important point to think about; relatedly, I remember being surprised when I interned at FHI and learned how many people there don't identify as effective altruists. It seems indicative of some problem, which seems worth pursuing directly. As a first step, it'd be good to hear more from people who have reservations about identifying as an effective altruist. I've just made a top-level question about it, plus an anonymous version - if that describes you, I'd be interested to see your responses!

Comment by richard_ngo on Some quick notes on "effective altruism" · 2021-03-27T22:54:19.543Z · EA · GW

I think the "global priorities" label fails to escape several of the problems that Jonas argued the EA brand has. In particular, it sounds arrogant for someone to say that they're trying to figure out global priorities. If I heard of a global priorities forum or conference, I'd expect it to have pretty strong links with the people actually responsible for implementing global decisions; if it were actually just organised by a bunch of students, then they'd seem pretty self-aggrandizing.

The "priorities" part may also suggest to others that they're not a priority. I expect "the global priorities movement has decided that X is not a priority" seems just as unpleasant to people pursuing X as "the effective altruism movement has decided that X is not effective".

Lastly, "effective altruism" to me suggests both figuring out what to do, and then doing it. Whereas "global priorities" only has connotations of the former.

Comment by richard_ngo on Proposed Longtermist Flag · 2021-03-24T11:20:06.116Z · EA · GW

What would you think about the same flag with the sun removed?

Might make it look a little unbalanced, but I kinda like that - longtermism is itself unbalanced in its focus on the future.

Comment by richard_ngo on Some preliminaries and a claim · 2021-02-26T14:31:38.841Z · EA · GW

I didn't phrase this as clearly as I should have, but it seems to me that there are two separate issues here: firstly whether group X's views are correct, and secondly whether group X uses a methodology that is tightly coupled to reality (in the sense of having tight feedback loops, or making clear predictions, or drawing on a lot of empirical evidence).

I interpret your critique of EA roughly as the claim that a lack of a tight methodological coupling to reality leads to a lack of correctness. My critique of the posts you linked is also that they lack tight methodological coupling to reality, in particular because they rely on high-level abstractions. I'm not confident about whether this means that they're actually wrong, but it still seems like a problem.

Comment by richard_ngo on Some preliminaries and a claim · 2021-02-25T22:20:33.816Z · EA · GW

I claim that the Effective Altruism and Bay Area Rationality communities have collectively decided that they do not need to participate in tight feedback loops with reality in order to have a huge, positive impact.

I am somewhat sympathetic to this complaint. However, I also think that many of the posts you linked are themselves phrased in terms of very high-level abstractions which aren't closely coupled to reality, and in some ways exacerbate the sort of epistemic problems they discuss. So I'd rather like to see a more careful version of these critiques.

Comment by richard_ngo on Contact with reality · 2021-02-19T07:57:09.988Z · EA · GW

Yes, I think I still have these concerns; if I had extreme cognitive biases all along, then I would want them removed even if it didn't improve my understanding of the world. It feels similar to if you told me that I'd lived my whole life in a (pleasant) dreamlike fog, and I had the opportunity to wake up. Perhaps this is the same instinct that motivates meditation? I'm not sure.

Comment by richard_ngo on Contact with reality · 2021-02-16T10:36:48.590Z · EA · GW

This post is great, and I think it frames the idea very well.

My only disagreement is with the following part of the scenario you give:

Every time you try to think things through, the machine will cause you to make mistakes of reasoning that you won’t notice: indeed, you’ve already been making lots of these. You’re hopelessly confused on a basic level, and you’ll stay that way for the rest of your life.

The inclusion of this seems unhelpful to me, because it makes me wonder about the extent to which a version of me whose internal thought processes are systematically manipulated is really the same person (in the sense that I care about). Insofar as the ways I think and reason are part of my personality and identity, then I might have additional reasons to not want them to be changed (in addition to wanting my beliefs to be accurate).

As you identify, it may still be necessary to interfere with my beliefs for the purposes of maintaining social fictions, but this could plausibly require only minor distortions. Whereas losing control of my mind in the way you describe above seems quite different from just having false beliefs.

Comment by richard_ngo on Scope-sensitive ethics: capturing the core intuition motivating utilitarianism · 2021-01-31T13:28:27.977Z · EA · GW

I'd say scope sensitive ethics is a reinvention of EA.

This doesn't seem quite right, because ethical theories and movements/ideologies are two different types of things. If you mean to say that scope sensitive ethics is a reinvention of the ethical intuitions which inspired EA, then I'm happy to agree; but the whole point of coining the term is to separate the ethical position from other empirical/methodological/community connotations that EA currently possesses, and which to me also seem like "core ideas" of EA.

Comment by richard_ngo on Clarifying the core of Effective Altruism · 2021-01-31T12:45:18.274Z · EA · GW

Thanks for the kind words and feedback! Some responses:

I wonder if there are examples?

The sort of examples which come to mind are things like new religions, or startups, or cults - all of which make heavy demands on early participants, and thereby foster a strong group bond and sense of shared identity which allows them greater long-term success.

since the antecedent "if you want to contribute to the common good" is so minimal, ben's def feels kind of near-normative to me

Consider someone who only cares about the lives of people in their own town. Do they want to contribute to the common good? In one sense yes, because the good of the town is a part of the common good. But in another sense no; they care about something different from the common good, which just happens to partially overlap with it.

Using the first definition, "if you want to contribute to the common good" is too minimal to imply that not pursuing effective altruism is a mistake.

Using the second definition, "if you want to contribute to the common good" is too demanding - because many people care about individual components of the common good (e.g. human flourishing) without being totally on board with "welfare from an impartial perspective".

I think I disagree about the maximising point. Basically I read your proposed definition as near-maximising, because when you iterate on 'contributing much more' over and over again you get a maximum or a near-maximum.

Yeah, I agree that it's tricky to dodge maximalism. I give some more intuitions for what I'm trying to do in this post. On the 2nd worry: I think we're much more radically uncertain about the (ex ante) best option available to us out of the space of all possible actions, than we are radically uncertain about a direct comparison between current options vs a new proposed option which might do "much more" good. On the 3rd worry: we should still encourage people not to let their personal preferences stand in the way of doing much more good. But this is consistent with (for example) people spending 20% of their charity budget in less effective ways. (I'm implicitly thinking of "much more" in relative terms, not absolute - so a 25% increase is not "much more" good.)

Comment by richard_ngo on AMA: Ajeya Cotra, researcher at Open Phil · 2021-01-29T08:54:35.656Z · EA · GW

An extension of Daniel's bonus question:

If I condition on your report being wrong in an important way (either in its numerical predictions, or via conceptual flaws) and think about how we might figure that out today, it seems like two salient possibilities are inside-view arguments and outside-view arguments.

The former are things like "this explicit assumption in your model is wrong". E.g. I count my concern about the infeasibility of building AGI using algorithms available in 2020 as an inside-view argument.

The latter are arguments that, based on the general difficulty of forecasting the future, there's probably some upcoming paradigm shift or crucial consideration which will have a big effect on your conclusions (even if nobody currently knows what it will be).

Are you more worried about the inside-view arguments of current ML researchers, or outside-view arguments?

Comment by richard_ngo on richard_ngo's Shortform · 2021-01-21T12:50:11.568Z · EA · GW

Hmm, I agree that we're talking past each other. I don't intend to focus on ex post evaluations over ex ante evaluations. What I intend to focus on is the question: "when an EA makes the claim that GiveWell charities are the charities with the strongest case for impact in near-term human-centric terms, how justified are they?" Or, relatedly, "How likely is it that somebody who is motivated to find the best near-term human-centric charities possible, but takes a very different approach than EA does (in particular by focusing much more on hard-to-measure political effects), will do better than EA?"

In my previous comment, I used a lot of phrases which you took to indicate the high uncertainty of political interventions. My main point was that it's plausible that a bunch of them exist which will wildly outperform GiveWell charities. I agree I don't know which one, and you don't know which one, and GiveWell doesn't know which one. But for the purposes of my questions above, that's not the relevant factor; the relevant factor is: does someone know, and have they made those arguments publicly, in a way that we could learn from if we were more open to less quantitative analysis? (Alternatively, could someone know if they tried? But let's go with the former for now.)

In other words, consider two possible worlds. In one world GiveWell charities are in fact the most cost-effective, and all the people doing political advocacy are less cost-effective than GiveWell ex ante (given publicly available information). In the other world there's a bunch of people doing political advocacy work which EA hasn't supported even though they have strong, well-justified arguments that their work is very impactful (more impactful than GiveWell's top charities), because that impact is hard to quantitatively estimate. What evidence do we have that we're not in the second world? In both worlds GiveWell would be saying roughly the same thing (because they have a high bar for rigour). Would OpenPhil be saying different things in different worlds? Insofar as their arguments in favour of GiveWell are based on back-of-the-envelope calculations like the ones I just saw, then they'd be saying the same thing in both worlds, because those calculations seem insufficient to capture most of the value of the most cost-effective political advocacy. Insofar as their belief that it's hard to beat GiveWell is based on other evidence which might distinguish between these two worlds, they don't explain this in their blog post - which means I don't think the post is strong evidence in favour of GiveWell top charities for people who don't already trust OpenPhil a lot.

Comment by richard_ngo on richard_ngo's Shortform · 2021-01-20T21:44:50.944Z · EA · GW

it leaves me very baseline skeptical that most 'systemic change' charities people suggest would also outperform, given the amount of time Open Phil has put into this question relative to the average donor. 

I have now read OpenPhil's sample of the back-of-the-envelope calculations on which their conclusion that it's hard to beat GiveWell was based. They were much rougher than I expected. Most of them are literally just an estimate of the direct benefits and costs, with no accounting for second-order benefits or harms, movement-building effects, political effects, etc. For example, the harm of a year of jail time is calculated as 0.5 QALYs plus the financial cost to the government - nothing about long-term effects of spending time in jail, or effects on subsequent crime rates, or community effects. I'm not saying that OpenPhil should have included these effects; they are clear that these are only intended as very rough estimates. But it means that I now don't think it's justified to treat this blog post as strong evidence in favour of GiveWell.
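
To make the shape of these calculations concrete, here's a purely illustrative sketch of a direct-effects-only estimate; apart from the 0.5 QALY figure mentioned above, every number is a made-up placeholder rather than anything OpenPhil actually used:

```python
# Illustrative only: the 0.5 QALY figure comes from the discussion above;
# every other number is a hypothetical placeholder, not OpenPhil's.

QALY_LOSS_PER_JAIL_YEAR = 0.5      # direct wellbeing harm of a year in jail
GOV_COST_PER_JAIL_YEAR = 30_000    # hypothetical: government cost of a jail-year ($)
DOLLARS_PER_QALY = 50_000          # hypothetical: rate for converting QALYs into dollars

GRANT_COST = 1_000_000             # hypothetical: cost of a criminal-justice grant ($)
JAIL_YEARS_AVERTED = 400           # hypothetical: estimated jail-years averted by the grant

# Direct effects only: wellbeing gain plus fiscal savings, converted to dollars.
benefit = JAIL_YEARS_AVERTED * (QALY_LOSS_PER_JAIL_YEAR * DOLLARS_PER_QALY
                                + GOV_COST_PER_JAIL_YEAR)
print(f"Benefit/cost ratio: {benefit / GRANT_COST:.1f}")

# Note what never appears: recidivism, long-run outcomes for the people released,
# community and political effects, movement-building value, and so on.
```

The structure is the point: whatever numbers you plug in, the harder-to-quantify channels simply don't show up.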

Here's just a basic (low-confidence) case for the cost-efficacy of political advocacy: governmental policies can have enormous effects, even when they attract little mainstream attention (e.g. PEPFAR). But actually campaigning for a specific policy is often only the last step in the long chain of getting the cause into the Overton Window, building a movement, nurturing relationships with politicians, identifying tractable targets, and so on, all of which are very hard to measure, and which wouldn't show up at all in these calculations by OpenPhil. Given this, what evidence is there that funding these steps wouldn't outperform GiveWell for many policies?

(See also Scott Alexander's rough calculations on the effects of FDA regulations, which I'm not very confident in, but which have always stuck in my head as an argument that dull-sounding policies might have wildly large impacts.)

Your other points make sense, although I'm now worried that abstaining about near-term human-centric charities will count as implicit endorsement. I don't know very much about quantitatively analysing interventions though, so it's plausible that my claims in this comment are wrong.

Comment by richard_ngo on Lessons from my time in Effective Altruism · 2021-01-18T05:43:54.312Z · EA · GW

I get the impression that you (Richard) think it would've been better if you'd skipped trying out engineering-style roles and gone straight into philosophy-style roles. Do you indeed think that?

I don't think this; learning about technical ideas in AI, and other aspects of working at DeepMind, have been valuable for me; so it's hard to point to things which I should have changed. But as I say in the post, in worlds where I wasn't so lucky, then I expect it would have been useful to weight personal fit more. For example, if I'd had the option of committing to a ML PhD instead of a research engineering role, then I might have done so despite uncertainty about the personal fit; this would probably have gone badly.

Comment by richard_ngo on Clarifying the core of Effective Altruism · 2021-01-17T21:08:34.178Z · EA · GW

For future reference, Linch's comment was in response to a comment of mine which I deleted before Linch replied, in which I used the example of saying "Federer is the best tennis player". Sorry about that! I replaced it with a comment that tried to point at the heart of the objection; but since I disagree with the things in your reply, I'll respond here too.

I think I just disagree with your intuitions here. When someone says Obama is the best person to be president, they are presumably taking into account factors like existing political support and desire to lead, which make it plausible that Obama actually is the best person.

And when people say "X is the best fiction author ever", I think they do mean to make a claim about the literal probability that this person is, out of all the authors who ever wrote fiction, the best one. In that context, the threshold at which I'd call something a "belief" is much lower than in most contexts, but nevertheless I think that when (for example) a Shakespeare fan says it, they are talking about the proposition that nobody was greater than Shakespeare. And this is not an implausible claim, given how much more we study Shakespeare than anyone else.

(By contrast, if they said: nobody had as much latent talent as Shakespeare, then this would be clearly false).

Anyway, it seems to me that judging the best charitable intervention is much harder than judging the best author, because for the latter you only need to look at books that have already been written, whereas in the former you need to evaluate the space of all interventions, including ones that nobody has proposed yet.

Comment by richard_ngo on Clarifying the core of Effective Altruism · 2021-01-17T20:37:39.412Z · EA · GW

I think the main problem with your definition is that it doesn't allow you to be wrong. If you say "X is the best bet", then how can I disagree if you're accurately reporting information about your subjective credences? Of course, I could respond by saying "Y is the best bet", but that's just me reporting my credences back to you. And maybe we'll change our credences, but at no point in time was either of us wrong, because we weren't actually talking about the world itself.

Which seems odd, and out of line with how we use this type of language in other contexts. If I say "Mathematics is the best field to study, ex ante" then it seems like I'm making a claim not just about my own beliefs, but also about what can be reasonably inferred from other knowledge that's available; a claim which might be wrong. In order to use this interpretation, we do need some sort of implicit notion of what knowledge is available, and what can be reasonably inferred from it, but that saves us from making claims that are only about our own beliefs. (In other words, not the local map, nor the territory, but some sort of intermediate "things that we should be able to infer from human knowledge" map.)

Comment by richard_ngo on Clarifying the core of Effective Altruism · 2021-01-16T23:32:15.924Z · EA · GW

No, I'm using the common language meaning. Put it this way: there are seven billion people in the world, and only one of them is the best person to fund (ex ante). If you pick one person, and say "I believe that this is the BEST person to fund, given the information available in 2021", then there's a very high chance that you're wrong, and so this claim isn't justified. Whereas you can justifiably claim that this person is a very good person to fund.

Comment by richard_ngo on Scope-sensitive ethics: capturing the core intuition motivating utilitarianism · 2021-01-16T17:59:33.433Z · EA · GW

I'm pretty suspicious about approaches which rely on personal identity across counterfactual worlds; it seems pretty clear that either there's no fact of the matter here, or else almost everything you can do leads to different people being born (e.g. by changing which sperm leads to their conception).

And secondly, this leads us to the conclusion that unless we quickly reach a utopia where everyone has positive lives forever, then the best thing to do is end the world as soon as possible. Which I don't see a good reason to accept.

Comment by richard_ngo on Scope-sensitive ethics: capturing the core intuition motivating utilitarianism · 2021-01-16T17:49:41.164Z · EA · GW

The problem is that one man's modus ponens is another man's modus tollens. Lots of people take the fact that utilitarianism says that you shouldn't care about your family more than a stranger as a rebuttal to utilitarianism.

Now, we could try to persuade them otherwise, but what's the point? Even amongst utilitarians, almost nobody gets anywhere near valuing a stranger as much as their own spouse. If there's a part of a theory that is of very little practical use, but is still seen as a strong point against the theory, we should try to find a version without it. That's what I intend scope-sensitive ethics to be.

In other words, we go from "my moral theory says you should do X and Y, but everyone agrees that it's okay to ignore X, and Y is much more important" to "my moral theory says you should do Y", which seems better. Here X is "don't give your family special treatment" and Y is "spend your career helping the world".

Comment by richard_ngo on Lessons from my time in Effective Altruism · 2021-01-16T17:39:56.521Z · EA · GW

Yepp, I agree with this. On the other hand, since AI safety is mentorship-constrained, if you have good opportunities to upskill in mainstream ML, then that frees up some resources for other people. And it also involves building up wider networks. So maybe "similar expected value" is a bit too strong, but not that much.

Comment by richard_ngo on Clarifying the core of Effective Altruism · 2021-01-16T12:32:58.875Z · EA · GW

Thanks; done.

Comment by richard_ngo on Lessons from my time in Effective Altruism · 2021-01-16T12:29:24.338Z · EA · GW

+1 for providing a counterpoint. All this sort of advice is inherently overfitted to my experience, so it's good to have additional data.

I'm curious if you disagree with my arguments for these conclusions, or you just think that there are other considerations which outweigh them for you?

Comment by richard_ngo on Lessons from my time in Effective Altruism · 2021-01-16T12:26:28.648Z · EA · GW

I do think that this turned out well for me, and that I would have been significantly worse off if I hadn't started working in safety directly. But this was partly a lucky coincidence, since I didn't intend to become a philosopher three years ago when making this decision. If I hadn't gotten a job at DeepMind, then my underestimate of the usefulness of upskilling might have led me astray.

Comment by richard_ngo on Lessons from my time in Effective Altruism · 2021-01-16T12:22:46.533Z · EA · GW

"Don't have much time for X" is an idiom which roughly means "have a low tolerance for X". I'm not saying that their time actually gets wasted, just that they get a bad impression. Might edit to clarify.

And yes, it's partly about silly questions, partly about negative vibes from being too ideological, partly about general lack of understanding about how organisations work. On balance, I'm happy that EAs are enthusiastic about doing good and open to weird ideas; I'm just noting that this can sometimes play out badly for people without experience of "normal" jobs when interacting in more hierarchical contexts.

Comment by richard_ngo on Alienation and meta-ethics (or: is it possible you should maximize helium?) · 2021-01-16T12:09:14.445Z · EA · GW

It's not truth-apt. It has a truth-apt component (that my moral theory endorses creating flourishing). But it also has a non-truth-apt component, namely "hooray my moral theory". I think this gets you a lot of the benefits of cognitivism, while also distinguishing moral talk from standard truth-apt claims about my or other people's preferences (which seems important, because agreeing that "Clippy was right when it said it should clip" feels very different from agreeing that "Clippy wants to clip").

I can see how this was confusing in the original comment; sorry about that.

I think the intuition that Clippy's position is very different from ours starts to weaken if Clippy has a moral theory. For example, at that point we might be able to reason with Clippy and say things like "well, would you want to be in pain?", etc. It may even (optimistically) be the case that properties like non-specificity and universality are strong enough that any rational agent which strongly subscribes to them will end up with a reasonable moral system. But you're right that it's somewhat non-central, in that the main thrust of my argument doesn't depend on it.

Comment by richard_ngo on Alienation and meta-ethics (or: is it possible you should maximize helium?) · 2021-01-15T21:25:51.380Z · EA · GW

Interesting post. I interpret the considerations you outline as forming a pretty good argument against moral realism. Partly that's because I think that there's a stronger approach to internalist anti-realism than the one you suggest. In particular, I interpret a statement that "X should do Y" as something like: "the moral theory which I endorse would tell X to do Y" (as discussed further here). And by "moral theory" I mean something like: a subset of my preferences which has certain properties, such as being concerned with how to treat others, not being specific to details of my life, being universalisable, etc. (Although the specific properties aren't that important; it's more of a family resemblance.)

So I certainly wouldn't say that Clippy should clip. And even if Clippy says that, I don't need to agree that it's true even from Clippy's perspective. Firstly because Clippy doesn't endorse any moral theory; and secondly because endorsements aren't truth-apt. On this position, the confusion comes from saying "X should do Y" without treating it as a shorthand for "X should do Y according to Z; hooray for Z".

From the perspective of morality-as-coordination-technology, this makes a lot of sense: there's no point having moral discussions with an entity that doesn't endorse any moral theory! (Or proto-theory, or set of inchoate intuitions; these are acceptable substitutes.) But within a community of people who have sufficiently similar moral views, we still have the ability to say "X is wrong" in a way that everyone agrees with.

Comment by richard_ngo on richard_ngo's Shortform · 2021-01-06T19:25:09.926Z · EA · GW

I don't think that the moral differences between longtermists and most people in similar circles (e.g. WEIRD) are that relevant, actually. You don't need to be a longtermist to care about massive technological change happening over the next century. So I think it's straightforward to say things like "We should try to have a large-scale moral impact. One very relevant large-scale harm is humans going extinct; so we should work on things which prevent it".

This is what I plan to use as a default pitch for EA from now on.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-28T12:53:05.513Z · EA · GW

After chatting with Alex Gordon-Brown, I updated significantly towards his position (as laid out in his comments below). Many thanks to him for taking the time to talk; I've done my best to accurately represent the conversation, but there may be mistakes. All of the following are conditional on focusing on near-term, human-centric charities.

Three key things I changed my mind on:

  1. I had mentally characterised EA as starting with Givewell-style reasoning, and then moving on to less quantifiable things. Whereas Alex (who was around at the time) pointed out that there were originally significant disagreements between EAs and Givewell, in particular with EAs arguing for less quantifiable approaches. EA and Givewell then ended up converging more over time, both as EAs found that it was surprisingly hard to beat Givewell charities even allowing for less rigorous analysis, and also as people at Givewell (e.g. the ones now running OpenPhil) became more convinced by less-quantifiable EA methodologies.
    1. Insofar as the wider world has the impression of EA as synonymous with Givewell-style reasoning, a lot of that comes from media reports focusing on it in ways we weren't responsible for.
    2. Alex claims that Doing Good Better, which also leans in this direction, wasn't fully representative of the beliefs of core EAs at the time it was published.
  2. Alex says that OpenPhil has found Givewell charities surprisingly hard to beat, and that this (along with other EA knowledge and arguments, such as the 100x multiplier) is sufficient to make a "compelling case" for them.
    1. Alex acknowledges that not many people who recommend Givewell are doing so because of this evidence; in some sense, it's a "happy coincidence" that the thing people were already recommending has been vindicated. But he thinks that there are enough careful EAs who pay attention to OpenPhil's reports that, if their conclusions had been the opposite, I would have heard people publicly making this case.
  3. Alex argues that I'm overly steelmanning the criticism that EA has received. EA spent a lot of time responding to criticisms that it's impossible to know that any charities are doing a lot of good (e.g. because of potential corruption, and so on), and criticisms that we should care more about people near us, and so on. Even when it came to "systemic change" critiques, these usually weren't principled critiques about the importance of systemic change in general, but rather just "you should focus on my personal cause", in particular highly politicised causes.

Alex also notes that the Givewell headline claim "We search for the charities that save or improve lives the most per dollar" is relatively new (here's an earlier version) and has already received criticism.

Things I haven't changed my mind about:

  1. I still think that most individual EAs should be much more careful in recommending Givewell charities. OpenPhil's conclusions are based primarily off (in their words) "back-of-the-envelope calculations", the details of which we don't know. I think that, even if this is enough to satisfy people who trust OpenPhil's researchers and their methodologies, it's far less legible and rigorous than most people who hear about EA endorsement of Givewell charities would expect. Indeed, they still conclude that (in expectation) their hits-based portfolio will moderately outperform Givewell.
  2. OpenPhil's claims are personally not enough to satisfy me. I think by default I won't endorse Givewell charities. Instead I'll abstain from having an opinion on what the best near-term human-centric charities are, and push for something more longtermist like pandemic prevention as a "default" outreach cause area instead. But I also don't think it's unreasonable for other people to endorse Givewell charities under the EA name.
  3. I still think that the 100x multiplier argument is (roughly) cancelled out by the multiplier going the other way, of wealthy countries having at least 100x more influence over the world. So, while it's still a good argument for trying to help the poorest people, it doesn't seem like a compelling argument for trying to help the poorest people via direct interventions in poor countries.

Overall lessons: I overestimated the extent to which my bubble was representative of EA, and also the extent to which I understood the history of EA accurately.

Alex and I finished by briefly discussing AI safety, where I'm quite concerned about a lack of justification for many of the claims EAs make. I'm hoping to address this more elsewhere.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-16T18:21:41.758Z · EA · GW

Or, to be more concrete, I believe (with relatively low confidence, though) that:

  • Most of the people whose donations have been influenced by EA would, if they were trying to donate to do as much good as possible without any knowledge of EA, give money to mainstream systemic change (e.g. political activism, climate change charities).
  • Most of those people believe that there's a consensus within EA that donations to Givewell's top charities do more good than these systemic change donations, to a greater degree than there actually is.
  • Most of those people would then be surprised to learn how little analysis EA has done on this question, e.g. they'd be surprised at how limited the scope of charities Givewell considers actually is.
  • A significant part of these confusions is due to EA simplifying its message in order to attract more people - for example, by claiming to have identified the charities that "do the most good per dollar", or by comparing our top charities to typical mainstream charities instead of the mainstream charities that people in EA's target audience previously believed did the most good per dollar (before hearing about EA).

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-16T17:17:22.670Z · EA · GW

Hey Alex, thanks for the response! To clarify, I didn't mean to ask whether no case has been made, or imply that they've "never been looked at", but rather ask whether a compelling case has been made - which I interpret as arguments which seem strong enough to justify the claims made about Givewell charities, as understood by the donors influenced by EA.

I think that the 100x multiplier is a powerful intuition, but that there's a similarly powerful intuition going the other way: that wealthy countries are many times more influential than developing countries (e.g. as measured in technological progress), which is reason to think that interventions in wealthy countries can do comparable amounts of good overall.
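
(To spell out the arithmetic behind the 100x multiplier for anyone reading along, with illustrative figures and assuming logarithmic utility in consumption:

u(c) = log(c), so the marginal value of a dollar is u'(c) = 1/c, and u'($500) / u'($50,000) = 50,000 / 500 = 100.

The income figures are just placeholders; it's the ratio of incomes that drives the multiplier.)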

On the specific links you gave: the one on climate change (Global development interventions are generally more effective than climate change interventions) starts as follows:

Previously titled “Climate change interventions are generally more effective than global development interventions”.  Because of an error the conclusions have significantly changed. I have extended the analysis and now provide a more detailed spreadsheet model below. In the comments below, Benjamin_Todd uses a different guesstimate model and found the climate change came out ~80x better than global health (even though the point estimate found that global health is better).

I haven't read the full thing, but based on this, it seems like there's still a lot of uncertainty about the overall conclusion reached, even when the model is focused on direct quantifiable effects, rather than broader effects like movement-building, etc. Meanwhile the 80k article says that "when political campaigns are the best use of someone’s charitable giving is beyond the scope of this article". I appreciate that there's more work on these questions which might make the case much more strongly. But given that Givewell is moving over $100M a year from a wide range of people, and that one of the most common criticisms EA receives is that it doesn't account enough for systemic change, my overall expectation is still that EA's case against donating to mainstream systemic-change interventions is not strong enough to justify the set of claims that people understand us to be making.

I suspect that our disagreement might be less about what research exists,  and more about what standard to apply for justification. Some reasons I think that we should have a pretty high threshold for thinking that claims about Givewell top charities doing the most good are justified:

  1. If we think of EA as an ethical claim (you should care about doing a lot of good) and an empirical claim (if you care about that, then listening to us increases your ability to do so) then the empirical claim should be evaluated against the donations made by people who want to do a lot of good, but aren't familiar with EA. My guess is that climate change and politics are fairly central examples of such donations.
  2. (As mentioned in a reply to Denise): "Doing the most good per dollar" and "doing the most good that can be verified using a certain class of methodologies" can be very different claims. And the more different that class of methodologies is from most people's intuitive conception of how to evaluate things, the more important it is to clarify that point. Yet it seems like the types of evidence that we have for these charities are very different from the types of evidence that most people rely on to form judgements about e.g. how good it would be if a given political party got elected, which often rely on effects that are much harder to quantify.
  3. Givewell charities are still (I think) the main way that most outsiders perceive EA. We're now a sizeable movement with many full-time researchers. So I expect that outsiders overestimate how much research backs up the claims they hear about doing the most good per dollar, especially with respect to the comparisons I mentioned. I expect they also underestimate the level of internal disagreement within EA about how much good these charities do.
  4. EA funds a lot of internal movement-building that is hard to quantify. So when our evaluations of other causes exclude factors that we consider important when funding ourselves, we should be very careful.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-16T15:50:19.072Z · EA · GW

This seems reasonable. On the other hand, it's hard to give references to a broad pattern of discourse.

Maybe the key contention I'm making here is that "doing the most good per dollar" and "doing the most good that can be verified using a certain class of methodologies" are very different claims. And the more different that class of methodologies is from most people's intuitive conception of how to evaluate things, the more important it is to clarify that point.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-15T12:08:36.635Z · EA · GW

Here's Rob Wiblin:

We have been taking on the enormous problem of ‘how to help others do the most good’ and had to start somewhere. The natural place for us, GiveWell and other research groups to ‘cut our teeth’ was by looking at the cause areas and approaches where the empirical evidence was strongest, such as the health improvement from anti-malarial bednets, or determining in which careers people could best ‘earn to give’.
Having learned from that research experience we are in a better position to evaluate approaches to systemic change, which are usually less transparent or experimental, and compare them to non-systemic options.

From my perspective at least, this seems like political spin. If advocacy for anti-malarial bednets was mainly intended as a way to "cut our teeth", rather than a set of literal claims about how to do the most good, then EA has been systematically misleading people for years.

Nor does it seem to me that we're actually in a significantly better position to evaluate approaches to systemic change now, except insofar as we've attracted more people. But if those people were attracted because of our misleading claims, then this is not a defence.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-15T00:47:27.352Z · EA · GW

What is the strongest argument, or the best existing analysis, that Givewell top charities actually do more good per dollar than good mainstream charities focusing on big-picture issues (e.g. a typical climate change charity, or the US Democratic party)?

If the answer is "no compelling case has been made", then does the typical person who hears about and donates to Givewell top charities via EA understand that?

If the case hasn't been made [edit: by which I mean, if the arguments that have been made are not compelling enough to justify the claims being made], and most donors don't understand that, then the way EAs talk about those charities is actively misleading, and we should apologise and try hard to fix that.

Comment by richard_ngo on Careers Questions Open Thread · 2020-12-14T21:01:49.977Z · EA · GW

So there are a few different sources of optionality from a PhD:
- Academic credentials
- Technical skills
- Research skills

Software engineer at a quant firm plausibly builds more general technical skills, but I expect many SWEs there work on infrastructure that has little to do with AI. I also don't have a good sense for how fast quant firms are switching over to deep learning - I assume they're on the leading edge, but maybe not all of them, or maybe they value interpretability too much to switch fully.

But I also think PhDs are pretty valuable for learning how to do innovative research at the frontiers of knowledge, and for the credentials. So it seems like one important question is: what's the optionality for? If it's for potentially switching to a different academic field, then PhD seems better. If it's for leading a research organisation, same.  Going into policy work, same. If it's for founding a startup, harder to tell; depends on whether it's an AI startup I guess.

Whereas I have more trouble picturing how a few years at a quant firm is helpful in switching to a different field, apart from the cash buffer. And I also had the impression that engineers at these places are usually compensated much worse than quants (although your earlier comment implies that this isn't always the case?).

Actually, one other thing is that I was implicitly thinking about UK PhDs. My concern with US PhDs is that they can be so long. Which makes me more optimistic about getting some external work experience first, to get a better perspective from which to make that commitment (which is what I did).

Comment by richard_ngo on Careers Questions Open Thread · 2020-12-14T12:13:36.162Z · EA · GW

AI PhDs tend to be very well-compensated after graduating, so I don't think personal financial constraints should be a big concern on that path.

More generally, skill in AI is going to be upstream of basically everything pretty soon; purely in terms of skill optionality, this seems much more valuable than being a quant.

Comment by richard_ngo on 80k hrs #88 - Response to criticism · 2020-12-12T17:37:04.868Z · EA · GW

I expect that people interpreted the "You are clearly deciding against both of these" as an unkind/uncharitable phrase, since it reads like an accusation of deliberate wrongdoing. I expect that, if you'd instead said something like "Parts of your post seem unnecessarily inflammatory", then it wouldn't have received such a negative response.

I also personally tend to interpret the kindness guidelines as being primarily about how to engage with people who are on the forum, or who are likely to read forum posts. Of course we shouldn't be rude in general, but it seems significantly less bad to critique external literature harshly than to directly critique people harshly.

Comment by richard_ngo on My mistakes on the path to impact · 2020-12-10T19:36:06.568Z · EA · GW

Good point, I agree this weakens my argument.

Comment by richard_ngo on AMA: Jason Crawford, The Roots of Progress · 2020-12-10T14:25:32.336Z · EA · GW

Not a direct response to your question, but I do think progress studies is very complementary to longtermism. In particular, it seems to me that longtermists are often much more interested in big ethical ideas rather than big empirical ideas. Yet, if anything, the latter are more important.

So I expect that most of the high-level research in progress studies (e.g. about the industrial revolution, or about principles for institutional reform) will be useful in informing longtermists' empirical ideas about the future.

This will be less true for research into specific interventions.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-09T15:29:57.325Z · EA · GW

The concept of cluelessness seems like it's pointing at something interesting (radical uncertainty about the future) but has largely been derailed by being interpreted in the context of formal epistemology. Whether or not we can technically "take the expected value" even under radical uncertainty is both a confused question (human cognition doesn't fit any of these formalisms!), and also much less interesting than the question of how to escape from radical uncertainty. In order to address the latter, I'd love to see more work that starts from Bostrom's framing in terms of crucial considerations.

Comment by richard_ngo on My mistakes on the path to impact · 2020-12-09T14:47:57.570Z · EA · GW

I think that in theory Max is right, that there's some optimal way to have the best of both worlds. But in practice I think that there are pretty strong biases towards conformity, such that it's probably worthwhile to shift the community as a whole indiscriminately towards being less epistemic modest.

As one example, people might think "I'll make up my mind on small decisions, and defer on big decisions." But then they'll evaluate what feels big to them, rather than to the EA community overall, and thereby the community as a whole will end up being strongly correlated even on relatively small-scale bets. I think your comment itself actually makes this mistake - there's now enough money in EA that, in my opinion, there should be many $5M grants which aren't strongly correlated with the views of EA as a whole.

In particular, I note that venture capitalists allocate much larger amounts of money explicitly on anti-conformist principles. Maybe that's because startups are a more heavy-tailed domain than altruism, and one where conformity is more harmful, but I'm not confident about that; the hypothesis that we just haven't internalised the "hits-based" mentality as well as venture capitalists have also seems plausible.

Comment by richard_ngo on What actually is the argument for effective altruism? · 2020-12-08T02:00:34.596Z · EA · GW

I define the ‘common good’ in the same way Will MacAskill defines the good in “The definition of effective altruism”, as what most increases welfare from an impartial perspective. This is only intended as a tentative and approximate definition, which might be revised.

I'm a bit confused by this, because "what most increases welfare" is describing an action, which seems like the wrong type of thing for "the common good" to be.  Do you instead mean that the common good is impartial welfare, or similar? This also seems more in line with Will.

One other quibble:

the search for the actions that do the most to contribute to the common good (relative to their cost).

I'm not sure we actually want "relative to their cost" here. On one hand, it might be the case that the actions which are most cost-effective at doing good actually do very little good, but are also very cheap (e.g. see this post by Hanson). Alternatively, maybe the most cost-effective actions are absurdly expensive, so that knowing what they are doesn't help us.

Rather, it seems better to just say "the search for the actions that do the most to contribute to the common good, given limited resources". Or even just leave implicit that there are resource constraints.

Comment by richard_ngo on richard_ngo's Shortform · 2020-12-03T12:23:27.352Z · EA · GW

I see. Yeah, Phil and Rob do discuss it, but focused on movement-building via fundraising/recruitment/advocacy/etc, rather than via publicly doing amazing direct work. Perhaps they were implicitly thinking about the latter as well, though. But I suspect the choice of examples shapes people's impression of the argument pretty significantly.

E.g. when it comes to your individual career, you'll think of "investing in yourself" very differently if the central examples are attending training programs and going to university, versus if the central example is trying to do more excellent and eye-catching work.