Posts

Optimisation-focused introduction to EA podcast episode 2021-01-15T09:59:29.416Z
Retrospective on Teaching Rationality Workshops 2021-01-03T17:15:06.154Z
Local Group Event Idea: EA Community Talks 2020-12-20T17:12:29.251Z
Make a Public Commitment to Writing EA Forum Posts 2020-11-18T18:23:11.468Z
Helping each other become more effective 2020-10-30T21:33:47.382Z
What altruism means to me 2020-08-15T08:25:28.386Z
The world is full of wasted motion 2020-08-05T20:41:23.710Z

Comments

Comment by Neel Nanda on What are some key numbers that (almost) every EA should know? · 2021-06-18T13:05:21.157Z · EA · GW

I'd love to use such an Anki deck!

Comment by Neel Nanda on New skilled volunteering board for effective animal advocacy · 2021-06-18T13:03:15.175Z · EA · GW

Nice initiative! I'd have found the post title more informative if you replaced 'high priority EA cause area' with 'Animal Advocacy'/'Animal Welfare'. Is there a reason you went with the first one?

Comment by Neel Nanda on Event-driven mission hedging and the 2020 US election · 2021-06-15T11:21:22.040Z · EA · GW

Interesting idea, and thought-provoking post, thanks! 

I find it odd to call this mission hedging though. It feels more like mission anti-hedging - I want to be maximally risk seeking and go all in to have more money in the world where my cause is doing better.

Comment by Neel Nanda on How Should Free Will Theories Impact Effective Altruism? · 2021-06-15T11:18:22.106Z · EA · GW

If free will doesn't exist, does that ruin/render void the EA endeavour?

Can you say more about why free will not existing is relevant to morality? 

My personal take is that free will seems like a pretty meaningless and confused concept, and probably doesn't exist (whatever that means). But that I want to do what I can to make the world a better place anyway, in the same way that I clearly want and value things in my normal life, regardless of whether I'm doing this with free will.

Comment by Neel Nanda on Are there any 'maximum egoism' pledges? · 2021-06-15T11:09:27.043Z · EA · GW

Ah, interesting! I didn't know this was the original conception of GWWC. I'm glad that got changed! 

Comment by Neel Nanda on Are there any 'maximum egoism' pledges? · 2021-06-13T19:50:25.099Z · EA · GW

Huh, that is not what I thought you meant by maximal egoism

Giving What We Can have the Further Pledge, to donate everything above a certain threshold.

Comment by Neel Nanda on Which non-EA-funded organisations did well on Covid? · 2021-06-10T23:43:28.738Z · EA · GW

To me the shining example of this is Jacob Falkovich's Seeing the Smoke, which somehow helped convince the UK to lock down in March 2020.

Comment by Neel Nanda on Which non-EA-funded organisations did well on Covid? · 2021-06-10T23:41:32.330Z · EA · GW

even if he has been somewhat (in my opinion unfairly) shunned by the EA community

What's this referring to? I know he consumes a bunch of rationalist content, but wasn't aware of much interaction with EA, or of any action of the community towards him.

Comment by Neel Nanda on EA Infrastructure Fund: Ask us anything! · 2021-06-07T00:14:54.072Z · EA · GW

If I know an organisation is applying to EAIF, and have an inside view that the org is important, how valuable is donating $1000 to the org compared to donating $1000 to EAIF? More generally, how should medium sized but risk-neutral donors coordinate with the fund?

Comment by Neel Nanda on EA Infrastructure Fund: Ask us anything! · 2021-06-06T21:37:48.638Z · EA · GW

What were the most important practices you transferred?

Comment by Neel Nanda on EA Infrastructure Fund: May 2021 grant recommendations · 2021-06-04T07:19:48.787Z · EA · GW

Thanks a lot for the write-up! Seems like there's a bunch of extremely promising grants in here. And I'm really happy to see that the EAIF is scaling up grantmaking so much. I'm particularly excited about the grants to HIA, CLTR and to James Aung & Emma Abele's project.

And thanks for putting so much effort into the write-up; it's really valuable to see the detailed thought process behind grants, and it makes me feel much more comfortable with future donations to EAIF. I particularly appreciated this for the children's book grant: I went from being strongly skeptical to tentatively excited by the write-up.

Comment by Neel Nanda on Long-Term Future Fund: May 2021 grant recommendations · 2021-05-31T18:16:50.347Z · EA · GW

Yes, agreed. My argument is that if cases are sufficiently low in the US, then deploying it now won't get much data, and the app likely won't get much uptake

Comment by Neel Nanda on Introducing Rational Animations · 2021-05-31T11:19:08.488Z · EA · GW

As a single point of anecdata, I got interested in EA via being part of the rationality community, and think it's plausible I would not have gotten involved in EA if there wasn't that link

Comment by Neel Nanda on Long-Term Future Fund: May 2021 grant recommendations · 2021-05-31T10:51:12.258Z · EA · GW

Po-Shen Loh – $100,000, with an expected reimbursement of up to $100,000

Awesome! I think NOVID is a really clever idea, and I'm excited to see it getting funding.

One concern I have about the value proposition, which I didn't see addressed here: It seems that this funding might be coming too late in the pandemic to be useful? It seems that NOVID will only really help in future pandemics if it clearly demonstrates value now. But as far as I'm aware, it's mainly being developed and deployed in the US, which seems to be most of the way to herd immunity. So it seems plausible that there won't be enough transmission for NOVID to really demonstrate value.

Comment by Neel Nanda on Why should we be effective in our altruism? · 2021-05-31T10:36:55.596Z · EA · GW

There is enormous spread in how much good different interventions do. For example, money spent helping the world's poorest people can be around 100x more effective than money spent helping the typical person in the West. 100x differences are a really big deal, but feel unintuitive and hard to think about - they don't often come up in everyday life. And caring about evidence and effectiveness is our main tool for identifying these differences in spread and focusing on the best interventions. So we need to care about effectiveness, because we happen to live in a world where caring about it makes a massive difference in how much good we do.

Comment by Neel Nanda on Concerns with ACE's Recent Behavior · 2021-04-26T08:44:53.690Z · EA · GW

 Evidence-based reasoning, with the understanding that the burden of proof lies with those who deny that the EA movement must make strenuous efforts to eliminate all forms of discrimination in its midst.

I feel somewhat skeptical of this, given that you also say:

This may include in some contexts behaviour consisting in denying that such discrimination exists or that it needs to be addressed.

It feels like 'trying to provide empirical evidence that the EA movement should not make overcoming discrimination an overwhelming priority' can certainly feel like denying discrimination exists, and can feel harmful to people. I'm somewhat skeptical that such a discussion would likely happen in a healthy and constructive way under prevailing social justice discussion norms. Have you ever come across good examples of such discussions?

Comment by Neel Nanda on Concerns with ACE's Recent Behavior · 2021-04-25T20:06:42.927Z · EA · GW

The key part of running feedback by an org isn't to inform the org of the criticism; it's to hear their point of view, and see whether any events have been misrepresented from their perspective. And, ideally, to give them a heads up so they can post a response shortly after the criticism goes up

Comment by Neel Nanda on Cash Transfers as a Simple First Argument · 2021-04-18T22:10:56.832Z · EA · GW

I really like this example! I used it in an interview I gave about EA and thought it went down pretty well. My main concern with using it is that I don't personally fund direct cash transfers (or think they're anywhere near the highest-impact thing), so I both think it can misrepresent the movement, and think that it's disingenuous to imply that EA is about robustly good things like this, when I actually care most about things like AI Safety.

As a result, I frame the example like this (if I can have a high-context conversation):

  • Effectiveness, and identifying the highest-impact interventions, is a cornerstone of EA. I think this is super important, because there's a really big spread in how much good different interventions do, much more than feels intuitive
  • Direct cash transfers are a proof of concept: There's good evidence that doubling your income increases your wellbeing by the same amount, no matter how wealthy you were to start with. We can roughly think of helping someone as just giving them money, and so increasing their income. The average person in the US has an income about 100x that of the world's poorest people, and so with the resources you'd need to double the income of an average American, you could help 100x as many of the world's poorest people! (A rough worked version of this arithmetic is sketched just after this list.)
    • Contextualise, and emphasise just how weird 100x differences are - these don't come up in normal life. It'd be like you were considering buying a laptop for $1000, shopped around for a bit, and found one just as good for $10! (Pick an example that I expect to resonate with big expenses the person faces, eg a laptop, car, rent, etc)
    • Emphasise that this is just a robust example as a proof of concept, and that in practice I think we can do way better - this just makes us confident that spread is out there, and worth looking for. Depending on the audience, maybe explain the idea of hits-based giving, and risk neutrality.
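
To make that 100x arithmetic concrete, here's a rough sketch in Python. It assumes the log relationship between income and wellbeing mentioned in the second bullet, and the incomes are made-up round numbers chosen purely for illustration:

```python
import math

# Rough sketch of the 100x claim. Assumes wellbeing scales with log(income),
# so each doubling of income adds the same amount of wellbeing.
# All incomes are hypothetical round numbers, chosen only for illustration.

US_INCOME = 30_000       # assumed income of an average American ($/year)
POOREST_INCOME = 300     # assumed income of someone among the world's poorest ($/year)
BUDGET = US_INCOME       # enough money to double the American's income once

def wellbeing_gain(income, transfer):
    """Gain in log-income 'wellbeing units' from giving `transfer` to someone on `income`."""
    return math.log((income + transfer) / income)

# Option A: double one average American's income.
gain_us = wellbeing_gain(US_INCOME, BUDGET)

# Option B: use the same budget to double the incomes of the world's poorest instead.
recipients = BUDGET // POOREST_INCOME                      # 100 people
gain_poorest = recipients * wellbeing_gain(POOREST_INCOME, POOREST_INCOME)

print(f"Option A: {gain_us:.2f}, Option B: {gain_poorest:.2f}")  # ~0.69 vs ~69.31, roughly 100x
```
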
Comment by Neel Nanda on Concerns with ACE's Recent Behavior · 2021-04-17T09:24:26.097Z · EA · GW

Thanks for sharing, that part updated me a lot away from Ben's view and towards Hypatia's view. 

An aspect I found particularly interesting was that Anima International seems to do a lot of work in Eastern European countries, which tend to be much more racially homogenous, and I presume have fairly different internal politics around race to the US. And that ACE's review emphasises concerns, not about their ability to do good work in their countries, but about their ability to participate in international spaces with other organisations.

They work in: 

Denmark, Poland, Lithuania, Belarus, Estonia, Norway, Ukraine, the United Kingdom, Russia, and France

It seems even less justifiable to me to judge an organisation according to US views around racial justice, when they operate in such a different context.

EDIT: This point applies less than I thought. Looks like Connor Jackson, the person in question, is a director of their UK branch, which I'd consider much closer to the US on this topic. 

Comment by Neel Nanda on Launching a new resource: 'Effective Altruism: An Introduction' · 2021-04-17T09:09:25.569Z · EA · GW

Thanks for the clarification. I'm glad that's in there, and I'll feel better about this once the 'Top 10 problem areas' feed exists, but I still feel somewhat dissatisfied. I think that 'some EAs prioritise longtermism, some prioritise neartermism or are neutral. 80K personally prioritises longtermism, and does so in this podcast feed, but doesn't claim to speak on behalf of the movement and will point you elsewhere if you're specifically interested in global health or animal welfare' is a complex and nuanced point. I generally think it's bad to try making complex and nuanced points in introductory material like this, and expect that most listeners who are actually new to EA wouldn't pick up on that nuance. 

I would feel better about this if the outro episode covered the same point, I think it's easier to convey at the end of all this when they have some EA context, rather than at the start.

A concrete scenario to sketch out my concern:

Alice is interested in EA, and somewhat involved. Her friend Bob is interested in learning more, and Alice looks for intro materials. Because 80K is so prominent, Alice comes across 'Effective Altruism: An Introduction' first, and recommends this to Bob. Bob listens to the feed, and learns a lot, but because there's so much content and Bob isn't always paying close attention, Bob doesn't remember all of it. Bob only has a vague memory of Episode 0 by the end, and leaves with a vague sense that EA is an interesting movement, but only cares about weird, abstract things rather than suffering happening today, and concludes that the movement has got a bit too caught up in clever arguments. And as a result, Bob decides not to engage further.

Comment by Neel Nanda on Launching a new resource: 'Effective Altruism: An Introduction' · 2021-04-17T08:54:45.001Z · EA · GW

Ah, thanks for the clarification! That makes me feel less strongly about the lack of diversity. I interpreted it as prioritising ALLFED over global health stuff as representative of the work of the EA movement, which felt glaringly wrong

Comment by Neel Nanda on Launching a new resource: 'Effective Altruism: An Introduction' · 2021-04-16T15:45:38.705Z · EA · GW

I strongly second all of this. I think 80K represents quite a lot of EA's public-facing outreach, and that it's important to either be explicit that this is longtermism-focused, or to try to be representative of what happens in the movement as a whole. I think this especially holds for something explicitly framed as an introductory resource, since I expect many people get grabbed by global health/animal welfare angles who don't get grabbed by longtermist angles.

Though I do see the countervailing concern that 80K is strongly longtermism focused, and that it'd be disingenuous for an introduction to 80K to give disproportionate time to neartermist causes, if those are explicitly de-prioritised 

Comment by Neel Nanda on Concerns with ACE's Recent Behavior · 2021-04-16T12:37:35.593Z · EA · GW

Thanks a lot for writing this up and sharing this. I have little context beyond following the story around CARE and reading this post, but based on the information I have, these seem like highly concerning allegations, and ones I would like to see more discussion around. And I think writing up plausible concerns like this clearly is a valuable public service.

Out of all these, I feel most concerned about the aspects that reflect on ACE as an organisation, rather than those which reflect the views of ACE employees. If ACE employees didn't feel comfortable going to CARE, I think it is correct for ACE to let them withdraw. But I feel concerned about ACE as an organisation making a public statement against the conference. And I feel incredibly concerned if ACE really did downgrade the rating of Anima International as a result.

That said, I feel like I have fairly limited information about all this, and have an existing bias towards your position. I'm sad that a draft of this wasn't run by ACE beforehand, and I'd be keen to hear their perspective. Though, given the content and your desire to remain anonymous, I can imagine it being unusually difficult to hear ACE's thoughts before publishing.

Personally, I consider the epistemic culture of EA to be one of its most valuable aspects, and think it's incredibly important to preserve the focus on truth-seeking, people being free to express weird and controversial ideas, etc. I think this is an important part of EA finding neglected ways to improve the world, identifying and fixing its mistakes, and keeping a focus on effectiveness. To the degree that the allegations in this post are true, and that this represents an overall trend in the movement, I find this extremely concerning, and expect this to majorly harm the movement's ability to improve the world.

Comment by Neel Nanda on Concerns with ACE's Recent Behavior · 2021-04-16T12:27:51.299Z · EA · GW

I interpret it as 'the subgroup of the Effective Altruist movement predominantly focused on animal welfare'

Comment by Neel Nanda on "Insider giving" - An unfortunate donation strategy used by corporate insiders to avoid losses · 2021-04-14T12:09:56.042Z · EA · GW

Interesting! It's not that obvious to me that this is bad. Eg, if this gets people donating stock rather than donating nothing at all, this feels like a cash transfer from the government to charities?

Of course, WHICH charities receive the stock matters a lot here

inflates donation figures.

From the article linked:

And what they find is that "large shareholders’ gifts are suspiciously well timed. Stock prices rise abnormally about 6% during the one-year period before the gift date and they fall abnormally by about 4% during the one year after the gift date, meaning that large shareholders tend to find the perfect day on which to give."

A 4% inflation really doesn't seem that bad? Especially since, as Larks says, charities can sell stock themselves much sooner than a year.

Comment by Neel Nanda on Some quick notes on "effective altruism" · 2021-03-27T08:13:25.029Z · EA · GW

I also find that a bit cringy. To me, the issue is saying "I have SUCCEEDED at being effective at altruism", which feels like a high bar and somewhat arrogant to explicitly admit to

Comment by Neel Nanda on Long-Term Future Fund: Ask Us Anything! · 2021-03-25T14:57:33.965Z · EA · GW

Do you mean this as distinct from Jonas's suggestion of:

Nah, I think Jonas' suggestion would be a good implementation of what I'm suggesting. Though as part of this, I'd want the LTFF to be less public facing and obvious - if someone googled 'effective altruism longtermism donate' I'd want them to be pointed to this new fund.

Hmm, I agree that a version of this fund could be implemented pretty easily - eg just make a list of the top 10 longtermist orgs and give 10% to each. My main concern is that it seems easy to do in a fairly disingenuous and manipulative way, if we expect all of its money to just funge against OpenPhil. And I'm not sure how to do it well and ethically.

Comment by Neel Nanda on EA Funds has appointed new fund managers · 2021-03-24T21:33:35.716Z · EA · GW

Huh, I find this surprising. I'd thought the Global Health and Development Fund was already intended to focus on hits-based giving in global health. Can you elaborate a bit more on what the middle ground being hit here is, by the current fund?

Comment by Neel Nanda on AMA: Tom Chivers, science writer, science editor at UnHerd · 2021-03-11T11:23:25.660Z · EA · GW

What would your advice be for talking to the media about EA? (And how to figure out whether to do it at all!)

How would you frame the message of EA to go down well with a large audience? (Eg, in an article in a major newspaper). How would this change with the demographics/political bias of that audience? Do you think it's possible to convey longtermist ideas in such a setting?

Being ahead of the curve on COVID-19/pandemics seems like a major win for EA, but it has also been a major global tragedy. How do you think we can best talk about COVID when selling EA, in a way that is both tactful and reflects well on EA?

Comment by Neel Nanda on Don't Be Bycatch · 2021-03-11T11:21:04.020Z · EA · GW

Thanks a lot for writing this! I think this is a really common trap to fall into, and I both see this a lot in others, and in myself.

To me, this feels pretty related to the trap of guilt-based motivation: taking the goals that I care about, thinking of them as 'I should do this' or as obligations, and feeling bad and guilty when I don't meet them. This combines with unrealistically high standards, based on a warped and perfectionist view of what I 'should' be capable of, hindsight bias, the planning fallacy, and what I think the people around me are capable of. Together, these mean that I set myself standards I can never really meet, feel guilty for failing to meet them, and ultimately build up aversions that stop me caring about whatever I'm working on and make me flinch away from it.

This is particularly insidious, because I find the intention behind it is often pure and important to me. It comes from a place of striving to be better, of caring about things, and wanting to live in consistency with my values. But in practice, this intention, plus those biases and failure modes, leaves me doing far worse than I could.

I find a similar mindset to your first piece of advice useful: I imagine a future version of myself that is doing far better than I am today, and ask how I could have gotten there. And I find that I'd be really surprised and confused if I suddenly got way better one day. But it's plausible to me that each day I do a little bit better than before, and that, on average, this compounds over time. Which means it's important to calibrate my standards so that I expect myself to do a bit better than what I have been realistically capable of before.

If you resonate with that, I wrote a blog post called Your Standards Are Too High on how I (try to) deal with this problem. And the Replacing Guilt series by Nate Soares is phenomenally good, and probably one of the most useful things I've ever read for my own mental health.

Comment by Neel Nanda on Alice Crary's philosophical-institutional critique of EA: "Why one should not be an effective altruist" · 2021-02-27T22:10:32.322Z · EA · GW

I think what you've written is not an argument against consequentialism, it's about trying to put numbers on things in order to rank the consequences?

Regardless, that wasn't how I interpreted her case. It doesn't feel like she cares about the total amount of systemic equality and justice in the world. She fundamentally cares about this from the perspective of the individual doing the act, rather than the state of the world, which seems importantly different. And to me, THIS part breaks consequentialism

Comment by Neel Nanda on Alice Crary's philosophical-institutional critique of EA: "Why one should not be an effective altruist" · 2021-02-27T07:59:54.785Z · EA · GW

Thanks for sharing! One thing I didn't notice in the summary: The talk seemed specifically focused on the impact of EA on the animal advocacy space (which I found mildly surprising and interesting, since these critiques pattern match much more to global health/equity/justice concerns)

This article seems to basically boil down to "take a specific view of morality that the author endorses, which heavily emphasises virtue, justice, systemic change and individual obligations, and is importantly not consequentialist, yet also demanding enough to be hard to satisfice on".

Then, once you have taken this alternate view, observe that this wildly changes your moral conclusions and opinions on how to act, and much of what EA stands for.

You can quibble about "the article claims to be challenging the fundamental idea of EA, yet EA is compatible with any notion of the good and capable of doing this effectively". But I personally think that EA DOES have a bunch of common moral beliefs, eg the importance of consequentialism, impartial views of welfare, the importance of scope and numbers, and to some degree utilitarianism. And that EA beliefs are robust to people not sharing all of these views, and to pluralistic views like others in this thread have argued (eg, put in the effort to be a basically decent person according to common sense morality and then ruthlessly optimise for your notion of the good with your spare resources). But I think you also need to make some decisions about what you do and do not value, especially for a moral view that's demanding rather than just "be a basically decent person", and her view seems fairly demanding?

I'm a bit confused about EXACTLY what the view of morality described here is - it pattern matches onto virtue ethics, plus views on the importance of justice and systemic change? But I definitely think it's quite different from any system that I subscribe to. And it doesn't feel like the article is really trying to convince me to take up this view, just taking it as implicit. And it seems fine to note that most EAs have some specific moral beliefs, and that if you substantially disagree with those then you reach different conclusions? But it's hardly a knock-down critique of EA, it's just a point that tradeoffs are hard and you need to pick your values to make decisions.

The paragraph of the talk that felt most confusing/relevant:

This philosophical critique brings into question effective altruists’ very notion of doing the “most good.” As effective altruists use it, this phrase presupposes that the rightness of a social intervention is a function of its consequences and that the outcome involving the best consequences counts as doing most good. This idea has no place within an ethical stance that underlies the philosophical critique. Adopting this stance is a matter of seeing the real fabric of the world as endowed with values that reveal themselves only to a developed sensibility. To see the world this way is to leave room for an intuitively appealing conception of actions as right insofar as they exhibit just sensitivity to the worldly circumstances at hand. Accepting this appealing conception of action doesn’t commit one to denying that right actions frequently aim at ends. Here acting rightly includes acting in ways that are reflective of virtues such as benevolence, which aims at the well-being of others. With reference to the benevolent pursuit of others’ well-being, it certainly makes sense to talk about good states of affairs. But it is important, as Philippa Foot once put it, “that we have found this end within morality, forming part of it, not standing outside it as a good state of affairs by which moral action in general is to be judged” (Foot 1985, 205). Right action also includes acting, when appropriate, in ways reflective of the broad virtue of justice, which aims at an end—giving people what they are owed—that can conflict with the end of benevolence. If we are responsive to circumstances, sometimes we will act with an eye to others’ well-being, and sometimes with an eye to other ends. In a case in which it is not right to improve others’ well-being, it makes no sense to say that we produce a worse result. To say this would be to pervert our grasp of the matter by importing into it an alien conception of morality. If we keep our heads, we will say that the result we face is, in the only sense that is meaningful, the best one. There is here simply no room for EA-style talk of “most good.”

Comment by Neel Nanda on Some EA Forum Posts I'd like to write · 2021-02-23T13:13:39.822Z · EA · GW

I love the idea of this post! I'd be extremely excited to read the forecasting post and I think making that would be highly valuable. I'm not that interested in the others

Comment by Neel Nanda on Local Group Event Idea: EA Community Talks · 2021-01-23T20:51:53.777Z · EA · GW

Ah, awesome! I'd love to hear how it goes

Comment by Neel Nanda on vaidehi_agarwalla's Shortform · 2021-01-21T08:50:48.382Z · EA · GW
  • Offputting/intimidating to newer members

I want to emphasise this point, since I think it applies to both new and more experienced members. I personally find it quite a high mental load to actively pay attention to communities on a new platform. Some of these are start-up costs (learning a new interface etc), but there are also ongoing costs of needing to check the new site, etc. And it is much easier to add something to an existing place I already check

Comment by Neel Nanda on Trying to help coral reefs survive climate change seems incredibly neglected. · 2021-01-20T21:29:48.146Z · EA · GW

The main thing I found interesting here is the "$9.9tn" claim, which seems super big. The original source seems to be this paper (non-paywalled link for convenience). I'm not sure how much I buy the paper's estimates and would be curious to hear other people's thoughts!

Comment by Neel Nanda on Lessons from my time in Effective Altruism · 2021-01-17T21:41:47.222Z · EA · GW

Thanks a lot for writing this up! I related to a lot of this, especially to #4. I'm curious if you have any concrete advice for orienting more towards being proactive? Or is this just something that slowly happened over time?

Comment by Neel Nanda on Scope-sensitive ethics: capturing the core intuition motivating utilitarianism · 2021-01-15T17:57:44.841Z · EA · GW

Thanks for this post! This helped clarify a fuzzy intuition I had around utilitarianism, roughly: that some moral positions are obvious (eg saving many more people >> saving few), and that utilitarianism is the only reasonable system that gets these important parts right. And that I'm uncertain about all of the messy details, but they don't seem clear or important, so I don't care about what the system says about those, so I should follow utilitarianism for everything important

I much prefer this way of framing it.

Comment by Neel Nanda on evelynciara's Shortform · 2021-01-06T09:10:33.209Z · EA · GW

In the classic naive paperclip maximizer scenario, we assume there's a goal-directed AI system, and its human boss tells it to "maximize paperclips." At this point, it creates a plan to turn all of the iron atoms on Earth's surface into paperclips. The AI knows everything about the world, including the fact that blood hemoglobin and cargo ships contain iron. However, it doesn't know that it's wrong to kill people and destroy cargo ships for the purpose of obtaining iron. So it starts going around killing people and destroying cargo ships to obtain as much iron as possible for paperclip manufacturing.

I don't think this is a good representation of the classic scenario. It's not that the AI "doesn't know it's wrong". It clearly has a good enough model of the world to predict eg "if a human saw me trying to do this, they would try to stop me". The problem is coding an AI that cares about right and wrong. Which is a pretty difficult technical problem. One key part of why it's hard is that the interface for giving an AI goals is not the same interface you'd use to give a human goals.

Note that this is not the same as saying that it's impossible to solve, or that it's obviously much harder than making powerful AI in the first place, just that it's a difficult technical problem and solving it is one significant step towards safe AI. I think this is what Paul Christiano calls intent alignment

I think it's possible that this issue goes away with powerful language models, if they can give us a way to input a goal via a similar interface to instructing a human. And I'm excited about efforts like this one. But I don't think it's at all obvious that this will just happen to work out. For example, GPT-3's true goal is "generate text that is as plausible as possible, based on the text in your training data". And it has a natural language interface, and this goal correlates a bit with "do what humans want", but it is not the same thing.

It's assuming that the system would make a special case for verbal commands that can be interpreted as objective functions and set out to optimize the objective function if possible. At a minimum, the AI system needs to convert each verbal command into a plan to execute it, somewhat like a query plan in relational databases. But not every plan to execute a verbal command would involve maximizing an objective function, and using objective functions in execution plans is probably dangerous for the reason that the classic paperclip argument tries to highlight, as well as overkill for most commands.

This point feels somewhat backwards. Everything AI systems ever do is maximising an objective function, and I'm not aware of any AI Safety suggestions that get around this (just ones which have creative objective functions). It's not that they convert verbal commands to an objective function; they already have an objective function, which might capture 'obey verbal commands in a sensible way' or it might not. And my read on the paperclip maximising scenario is that "tell the AI to maximise paperclips" really means "encode an objective function that tells it to maximise paperclips"

 

Personally I think the paperclip maximiser scenario is somewhat flawed, and not a good representation of AI x-risk. I like it because it illustrates the key point of specification gaming - that it's really, really hard to make an objective function that captures "do the things we want you to do". But this is also going to be pretty obvious to the people making AGI, and they probably won't have an objective function as clearly dumb as maximise paperclips. But whatever objective function they do use might still not be good enough.
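
As a toy illustration of that specification-gaming point (with entirely made-up actions and numbers), the sketch below shows why the problem lives in the objective function itself: harm simply isn't part of what gets scored, so the maximiser picks the action that does what was written rather than what was meant.

```python
# Toy specification-gaming example. The actions, counts and harms are invented
# purely for illustration; the point is that the objective only scores paperclips.
actions = {
    "use scrap metal":       {"paperclips": 10, "humans_harmed": 0},
    "strip iron from ships": {"paperclips": 100, "humans_harmed": 50},
}

def objective(outcome):
    # Nothing about harm appears here, so harm cannot influence the choice.
    return outcome["paperclips"]

best_action = max(actions, key=lambda name: objective(actions[name]))
print(best_action)  # "strip iron from ships": maximises what was written, not what was meant
```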

Comment by Neel Nanda on richard_ngo's Shortform · 2021-01-05T17:00:49.864Z · EA · GW

Thanks for writing this up! I've found this thread super interesting to follow, and it's shifted my view on a few important points.

One lingering thing that seems super important is longtermism vs prioritising currently existing people. It still seems to me that GiveWell charities aren't great from a longtermist perspective, but that the vast majority of people are not longtermists. Which creates a weird tension when doing outreach, since I rarely want to begin by trying to pitch longtermism, but it seems disingenuous to pitch GiveWell charities.

Given that many EAs are not longtermist though, this seems overall fine for the "is the movement massively misleading people" question

Comment by Neel Nanda on Retrospective on Teaching Rationality Workshops · 2021-01-05T07:34:15.700Z · EA · GW

Huh, that completely didn't occur to me. Thanks a lot for pointing it out!

Do you think the easy fix you mentioned of "people engage differently with this kind of thing, and some people struggle with sensory detail, feel free to skip the sensory detail step if that doesn't resonate" would be sufficient? Or does it seem important to replace it with a more substantial alternative?

Also, I'm curious, does aphantasia specifically make it hard to simulate visual stimuli? Or is it anything sensory? Eg, can you imagine sounds or textures?

Comment by Neel Nanda on Retrospective on Teaching Rationality Workshops · 2021-01-05T07:30:57.037Z · EA · GW

Aww, thanks! What kinds of stuff did you do when organising rationality workshops?

Comment by Neel Nanda on Long-Term Future Fund: Ask Us Anything! · 2020-12-29T11:34:19.602Z · EA · GW
  • Open access fees for research publications relevant to longtermism, such that this work is available to anyone on the internet without any obstacles, plausibly increasing readership and citations.

How important is this in the context of eg scihub existing?

Comment by Neel Nanda on Strong Longtermism, Irrefutability, and Moral Progress · 2020-12-29T11:16:52.573Z · EA · GW

I feel a bit confused reading that. I'd thought your case was framed around a values disagreement about the worth of the long-term future. But this feels like a purely empirical disagreement about how dangerous AI is, and how tractable working on it is. And possibly a deeper epistemological disagreement about how to reason under uncertainty.

How do you feel about the case for biosecurity? That might help disentangle whether the core disagreement is about valuing the longterm future/x-risk reduction, vs concerns about epistemology and empirical beliefs, since I think the evidence base is noticeably stronger than for AI.

I think there's a pretty strong evidence base that pandemics can happen and, eg, dangerous pathogens can get developed in labs and released from labs. And I think there's good reason to believe that future biotechnology will be able to make dangerous pathogens, that might be able to cause human extinction, or something close to that. And that human extinction is clearly bad for both the present day, and the longterm future. 

If a strong longtermist looks at this evidence, and concludes that biosecurity is a really important problem because it risks causing human extinction and thus destroying the value of the longterm future, and is thus a really high priority, would you object to that reasoning?

Comment by Neel Nanda on Strong Longtermism, Irrefutability, and Moral Progress · 2020-12-29T10:57:13.392Z · EA · GW

Really? If you're a rationalist (in the broad Popperian sense and the internet-cult sense), and we share common knowledge of each other's beliefs, then shouldn't we be able to argue towards closer agreement? Not if our estimates were totally arbitrary — but clearly they're not. Again, they're just especially uncertain.

I think there is an important point here. One of the assumptions in Aumann's theorem is that both people have the same prior, and I think this is rarely true in the real world. 

I roughly think of Bayesian reasoning as starting with a prior, and then adjusting the prior based on observed evidence. If there's a ton of evidence, and your prior isn't dumb, the prior doesn't really matter. But the more speculative the problem, and the less available evidence, the more the prior starts to matter. And your prior bakes in a lot of your assumptions about the world, and I think it's tricky to resolve disagreements about what your prior should be. At least not in ways that approach being objective. 

I think you can make progress on this. Eg, 'how likely is it that AI could get way better, really fast?' is a difficult question to answer, and could be baked into a prior either way. And things like AI Impacts' study of discontinuous progress in other technologies can be helpful for getting closer to consensus. But I think choosing a good prior is still a really hard and important problem, and near impossible to be objective about.
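
As a small illustration of how the prior dominates when evidence is scarce and washes out when evidence is plentiful, here's a toy sketch with Beta priors and binomial data (all the numbers are invented):

```python
# Two people with different Beta priors over the probability of some event
# observe the same data. With little data their posteriors mostly reflect their
# priors; with lots of data they nearly agree. All numbers are illustrative.

def posterior_mean(prior_a, prior_b, successes, trials):
    """Mean of a Beta(prior_a, prior_b) prior updated on binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

OPTIMIST = (8, 2)   # prior mean 0.8
SKEPTIC = (1, 9)    # prior mean 0.1

for successes, trials in [(1, 2), (40, 100), (400, 1000)]:
    opt = posterior_mean(*OPTIMIST, successes, trials)
    skp = posterior_mean(*SKEPTIC, successes, trials)
    print(f"{trials:4d} observations: optimist {opt:.2f}, skeptic {skp:.2f}")

# Output: with 2 observations the posteriors are ~0.75 vs ~0.17 (prior dominates);
# with 1000 observations they are ~0.40 vs ~0.40 (the data dominates).
```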

Comment by Neel Nanda on Strong Longtermism, Irrefutability, and Moral Progress · 2020-12-29T10:44:42.785Z · EA · GW

To me, the fundamental point isn't probabilities, it's that you need to make a choice about what you do. If I have the option to give a $1mn grant to preventing nuclear war or give the grant to something else, then no matter what I do, I have made a choice. And so, I need to have a decision theory for making a choice here.

And to me, subjective probabilities, and Bayesian epistemology more generally, are by far the best decision theory I've come across for making choices under uncertainty. If there's a 1% chance of nuclear war, the grant is worth making; if there's a 10^-15 chance of nuclear war, the grant is not worth making. I need to make a decision, and so probabilities are fundamental, because they are my tool for making a decision.
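
As a toy version of that calculation (every number below is made up purely for illustration), the same grant flips from clearly worth making to clearly not, depending only on the subjective probability:

```python
# Toy expected-value comparison for a hypothetical $1m nuclear-risk grant.
# All numbers are invented for illustration only.
GRANT_COST = 1e6                  # $1m grant
VALUE_IF_WAR_AVERTED = 1e12       # assumed value of avoiding a nuclear war, in $
RISK_REDUCTION_FROM_GRANT = 1e-3  # assumed fraction of the risk the grant removes

for p_war in (1e-2, 1e-15):
    expected_benefit = p_war * RISK_REDUCTION_FROM_GRANT * VALUE_IF_WAR_AVERTED
    verdict = "worth making" if expected_benefit > GRANT_COST else "not worth making"
    print(f"P(war) = {p_war:g}: expected benefit ${expected_benefit:,.2f} -> {verdict}")
```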

And there are a bunch of important question where we don't have data, and there's no reasonable way to get data (eg, nuclear war!). And any approach which rejects the ability to reason under uncertainty in situations like this, is essentially the decision theory of "never make speculative grants like this". And I think this is a clearly terrible decision theory (though I don't think you're actually arguing for this policy?)

Comment by Neel Nanda on Long-Term Future Fund: Ask Us Anything! · 2020-12-28T20:55:17.729Z · EA · GW

Perhaps EA Funds shouldn’t focus on grantmaking as much: At a higher level, I’m not sure whether EA Funds’ strategy should be to build a grantmaking organization, or to become the #1 website on the internet for giving effectively, or something else

 

I found this point interesting, and have a vague intuition that EA Funds (and especially the LTFF) are really trying to do two different things:

  1. Having a default place for highly engaged EAs to donate, that is willing to take on large risks, fund things that seem weird, and rely heavily on social connections, the community and grantmaker intuitions
  2. Having a default place for risk-neutral donors who feel value-aligned with EA to donate to, who don't necessarily have high trust for the community

Having something doing (1) seems really valuable, and I would feel sad if the LTFF reined back the kinds of things it funded to have a better public image. But I also notice, eg when giving donation advice to friends who broadly agree with EA ideas but aren't really part of the community, that I don't feel comfortable recommending EA Funds. And I think that a bunch of the grants seem weird to anyone with moderately skeptical priors. (This is partially an opinion formed from the April 2019 grants, and I feel this less strongly for more recent grants.)

And it would be great to have a good, default place to recommend my longtermist friends donate to, analogous to being able to point people to GiveWell top charities.

The obvious solution to this is to have two separate institutions, trying to do these two different things? But I'm not sure how workable that is here (and I'm not sure what a 'longtermist fund that tries to be legible and public-facing, but without OpenPhil's scale of money' would actually look like!)

Comment by Neel Nanda on Make a Public Commitment to Writing EA Forum Posts · 2020-12-20T08:03:49.005Z · EA · GW

Congrats on writing a post! I enjoyed reading it. Presumably this is the second post you mentioned?

Comment by Neel Nanda on Ask Rethink Priorities Anything (AMA) · 2020-12-14T16:27:53.845Z · EA · GW

What would you do if Rethink Priorities had significantly more money? (Eg, 2x or 10x your current budget)

Comment by Neel Nanda on Is this a good way to bet on short timelines? · 2020-12-02T09:43:45.325Z · EA · GW

I personally feel skeptical of short AI timelines (though I feel far too confused about this question to have confident views!). I'd definitely be interested in having a call where you try to convince me of this though, if that offer is open to anyone! I expect to find this interesting, so don't care at all about money here.