How to be a good agnostic (and some good reasons to be dogmatic) 2023-02-04T11:08:18.793Z
Tyler Cowen on effective altruism (December 2022) 2023-01-13T09:39:53.181Z
What are the most underrated posts & comments of 2022, according to you? 2023-01-01T22:44:25.254Z
Katja Grace: Let's think about slowing down AI 2022-12-23T00:57:18.917Z
Announcing: EA Forum Podcast – Audio narrations of EA Forum posts 2022-12-05T21:50:14.551Z
Byrne Hobart & Dwarkesh Patel on hardcore believers, monasteries, and effective altruism 2022-12-03T22:00:50.324Z
Biological Anchors external review by Jennifer Lin (linkpost) 2022-11-30T13:06:44.056Z
Derek Parfit’s photographs of Venice 2022-11-18T17:10:42.402Z
Samo Burja: What the collapse of FTX means for effective altruism 2022-11-17T12:16:59.900Z
Bernard Williams: Ethics and the limits of impartiality 2022-09-07T11:26:14.914Z
An Audio Introduction to Nick Bostrom 2022-08-29T08:49:29.424Z
Radio Bostrom: Audio narrations of papers by Nick Bostrom 2022-08-08T15:26:37.973Z
Some core assumptions of effective altruism, according to me 2022-07-29T09:05:54.072Z
How to start a blog in 5 seconds for $0 2022-07-04T08:28:49.666Z
My setup for reading, highlighting and annotation 2022-07-03T10:52:52.886Z
A discussion of Holden Karnofsky's "Most Important Century" series (Thursday 21 October, 19:00 UK) 2021-10-16T20:50:26.359Z
[Link post] Sam Scheffler: Conservatism, Temporal Bias, and Future Generations 2021-09-19T08:44:50.972Z
Nick Bostrom: An Introduction [early draft] 2021-07-31T17:04:20.991Z
The Future of Humanity & The Methods of Ethics: A discussion of Bostrom, Sidgwick and Scheffler (Thursday 22 July, 6:30pm UK) 2021-07-18T18:58:42.912Z
What should an effective altruist be committed to? 2014-12-17T13:21:14.006Z


Comment by peterhartree (Peter_Hartree) on My takes on the FTX situation will (mostly) be cold, not hot · 2023-03-23T10:52:51.022Z · EA · GW

(I made these webpages a couple days after the FTX collapse. Buying domains is cheaper than therapy…)

Comment by peterhartree (Peter_Hartree) on My takes on the FTX situation will (mostly) be cold, not hot · 2023-03-23T10:48:41.815Z · EA · GW

Thoughts on “maximisation is perilous”:

(1) We could put more emphasis on the idea of “two-thirds utilitarianism”.

(2) I expect we could come up with a better name for two-thirds utilitarianism and a snappier way of describing the key thought. Deep pragmatism might work.

Comment by peterhartree (Peter_Hartree) on How much should governments pay to prevent catastrophes? Longtermism’s limited role · 2023-03-20T13:11:51.169Z · EA · GW

Thank you (again) for this.

I think this message should be emphasized much more in many EA and LT contexts, e.g. introductory materials on and

As your paper points out: longtermist axiology probably changes the ranking between x-risk and catastrophic risk interventions in some cases. But there's lots of convergence, and in practice your ranked list of interventions won't change much (even if the diff between them does... after you adjust for cluelessness, Pascal's mugging, etc).

Some worry that if you're a fan of longtermist axiology then this approach to comms is disingenuous. I strongly disagree: it's normal to start your comms by finding common ground, and elaborate on your full reasoning later on.

Andrew Leigh MP seems to agree. Here's the blurb from his recent book, "What's The Worst That Could Happen?":

Did you know that you’re more likely to die from a catastrophe than in a car crash? The odds that a typical US resident will die from a catastrophic event—for example, nuclear war, bioterrorism, or out-of-control artificial intelligence—have been estimated at 1 in 6. That’s fifteen times more likely than a fatal car crash and thirty-one times more likely than being murdered. In What’s the Worst That Could Happen?, Andrew Leigh looks at catastrophic risks and how to mitigate them, arguing provocatively that the rise of populist politics makes catastrophe more likely.

Comment by peterhartree (Peter_Hartree) on Donation offsets for ChatGPT Plus subscriptions · 2023-03-17T00:09:13.610Z · EA · GW

Thanks for the post.

I've decided to donate $240 to both GovAI and MIRI to offset the $480 I plan to spend on ChatGPT Plus over the next two years ($20/month).

These amounts are small.

Let's say the value of your time is $500 / hour.

I'm not sure it was worth taking the time to think this through so carefully.
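A rough sketch of the arithmetic behind this, using the figures above ($480 of spending, time valued at $500 / hour). Treating the full $480 as the amount at stake is an assumption made for illustration, and the function name is hypothetical.

```python
# Figures taken from the comment: $480 total spend, time valued at $500/hour.
# Assumption: the whole $480 is the amount "at stake" in the decision.
offset_total_usd = 480
hourly_value_usd = 500


def break_even_minutes(amount_usd, hourly_rate_usd):
    """Minutes of deliberation whose time cost equals amount_usd."""
    return amount_usd / hourly_rate_usd * 60


minutes = break_even_minutes(offset_total_usd, hourly_value_usd)
print(f"Break-even deliberation time: {minutes:.1f} minutes")
```

On these assumptions, any deliberation that runs past roughly an hour costs more in time than the whole amount being offset.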

To be clear, I think concrete actions aimed at quality alignment research or AI policy aimed at buying more time are much more important than offsets.


By publicly making a commitment to offset a particular harm, you're establishing a basis for coordination - other people can see you really care about the issue because you made a costly signal.


I won't dock anyone points for not donating to offset harm from paying for AI services at a small scale. But I will notice if other people make similar commitments and take it as a signal that people care about risks from commercial incentives.

Honestly, if someone told me they'd done this, my first thought would be "huh, they've taken their eye off the ball". My second would be "uh oh, they think it's a good idea to talk about ethical offsetting".

I think it's worth pricing in the possibility of reactions like this when reflecting on whether to take small actions like this for the purpose of signalling.

Comment by peterhartree (Peter_Hartree) on EA needs Life-Veterans and "Less Smart" people · 2023-03-09T14:49:18.926Z · EA · GW

+1 to Geoffrey here.

I still think of EA as a youth movement, though this label is gradually fading as the "founding cohort" matures.

It's a trope that the youth are sometimes too quick to dismiss the wiser counsel of their elders.

I've witnessed many cases where, to my mind, people were (admirably) looking for good explicit arguments that they can easily understand, but (regrettably) forgetting that things like inferential distance sometimes make it hard to understand the views of people who are wiser or more expert than you are.

I'm sure I've made this mistake too. That said: my intellectual style is fairly slow and conservative compared to many of my peers, and I'm often happy to trust inarticulate holistic judgements over apparently solid explicit arguments. These traits insulate me somewhat from this youthful failure mode, though they expose me to similarly grave errors in other directions :/

Comment by peterhartree (Peter_Hartree) on What are the best examples of object-level work that was done by (or at least inspired by) the longtermist EA community that concretely and legibly reduced existential risk? · 2023-02-13T14:48:06.207Z · EA · GW

The CLTR Future Proof report has influenced UK government policy at the highest levels.

E.g. the UK "National AI Strategy" ends with a section on AGI risk, and says that the Office for AI should pay attention to this.

Comment by peterhartree (Peter_Hartree) on What are the best examples of object-level work that was done by (or at least inspired by) the longtermist EA community that concretely and legibly reduced existential risk? · 2023-02-13T14:40:16.199Z · EA · GW

If you think the UN matters, then this seems good:

On September 10th 2021, the Secretary General of the United Nations released a report called “Our Common Agenda”. This report seems highly relevant for those working on longtermism and existential risk, and appears to signal unexpectedly strong interest from the UN. It explicitly uses longtermist language and concepts, and suggests concrete proposals for institutions to represent future generations and manage catastrophic and existential risks.

Comment by peterhartree (Peter_Hartree) on 'Evolutionary debunking arguments' about human moral intuitions · 2023-01-29T14:51:26.079Z · EA · GW

What matters is just whether there is a good justification to be found or not, which is a matter completely independent of us and how we originally came by the belief.

This is a good expression of the crux.

For many people—including many philosophers—it seems odd to think that questions of justification have nothing to do with us and our origins.

This is why the question of "what are we doing, when we do philosophy?" is so important.

The pragmatist-naturalist perspective says something like:

We are clever beasts on an unremarkable planet orbiting an unremarkable star, etc. Over the long run, the patterns of thought we call justified are those which are adaptive (or are spandrels along for the ride).

To be clear: this perspective is compatible with having fruitful conversations about the norms of morality, scientific enquiry, and all the rest.

Comment by peterhartree (Peter_Hartree) on Doing EA Better · 2023-01-29T14:29:37.917Z · EA · GW

I took the Gell-Mann amnesia interpretation and just concluded that he's probably being daft more often in areas I don't know so much about.

This is what Cowen was doing with his original remark.

Comment by peterhartree (Peter_Hartree) on 'Evolutionary debunking arguments' about human moral intuitions · 2023-01-28T13:47:34.003Z · EA · GW

From Moral Tribes:

Deep pragmatism seeks common ground. Not where we think it ought to be, but where it actually is.

With a little perspective, we can reflect and reach agreements with our heads, despite the irreconcilable differences in our hearts.

We all want to be happy. None of us wants to suffer. And our concern for happiness and suffering lies behind nearly everything else that we value, though to see this requires some reflection.

We can take this kernel of personal value and turn it into a moral value by adding the essence of the Golden Rule: your happiness and your suffering matter no more, and no less, than anyone else’s.

Finally, we can turn this moral value into a moral system by running it through the outcome-optimizing apparatus of the human prefrontal cortex. This yields a moral philosophy that no one loves but that everyone “gets”—a second moral language that members of all tribes can speak.

Deep pragmatism is utilitarianism in the spirit of Jeremy Bentham. Bentham is often misread as a narrow-minded moral realist. But he is best read as a political pragmatist: not seeking a metaphysical principle, but rather a practical principle—something most of us can agree on—upon which to build a stable polity.

Comment by peterhartree (Peter_Hartree) on 'Evolutionary debunking arguments' about human moral intuitions · 2023-01-28T13:45:08.692Z · EA · GW

Joshua Greene's book, Moral Tribes, presents a compelling EDA. He doesn't bother directly arguing against the philosophical objections to EDA.

In general I think Moral Tribes is a must-read for those who are interested in evolutionary psychology, moral philosophy and especially utilitarianism.

Among other things, Greene argues that utilitarianism needs a rebrand. His suggestion: deep pragmatism.

Comment by peterhartree (Peter_Hartree) on 'Evolutionary debunking arguments' about human moral intuitions · 2023-01-28T13:40:05.672Z · EA · GW

EDAs are a problem for non-naturalistic moral realists in the British tradition (e.g. Sidgwick, Parfit). Some people think they're a problem for naturalistic moral realists too.

I've read ~10 philosophy papers that try to defend non-naturalistic moral realism against EDAs.

More than half of these defences have the following structure:

(P1) Metaethical claim about moral truth.

(P2) EDAs are incompatible with (P1).

(P3) Conclusion: EDAs are false.

A typical metaethical claim for (P1):

(P1*) The normative and the descriptive are fundamentally different (bangs table).

According to me, we should just accept EDAs and reject dubious versions of (P1).

Comment by peterhartree (Peter_Hartree) on How to Reform Effective Altruism after SBF Vox interview with Holden Karnofsky 1/23/2023 · 2023-01-27T08:46:57.676Z · EA · GW

The conversation touches on a couple of blog posts by Holden, especially:

Comment by peterhartree (Peter_Hartree) on How to Reform Effective Altruism after SBF Vox interview with Holden Karnofsky 1/23/2023 · 2023-01-27T08:38:19.573Z · EA · GW

Thanks for sharing.

Tyler Cowen is another figure who takes utilitarianism and longtermism seriously, but not too seriously. See:

Comment by peterhartree (Peter_Hartree) on EA should help Tyler Cowen publish his drafted book in China · 2023-01-21T10:06:54.105Z · EA · GW

Good idea. Are you planning to make this happen? What steps will you take next?

My quick thought is:

  1. Email Tyler. List a few people you think are worth reaching out to.

  2. If Tyler is keen, help him do the work.

It'd probably be worth releasing it in English as well, for an Anglosphere audience.

Tyler released Stubborn Attachments on Medium a year or two before it was published by Stripe Press. He could do the same for this book, with some big caveats at the start, along the lines that he made in the podcast.

If you don't plan to do something like (1) and (2) please DM me on Twitter. I'd probably be up for it, but I'm not sure I'd have time to product manage / significantly help Tyler after the initial setup.

I could fairly easily create an audiobook version. The TYPE III AUDIO AI narration pipeline is coming together nicely.

Comment by peterhartree (Peter_Hartree) on FLI FAQ on the rejected grant proposal controversy · 2023-01-20T21:18:33.161Z · EA · GW

I see "clearly expressing anger" and "posting when angry" as quite different things.

I endorse the former, but I rarely endorse the latter, especially in contexts like the EA Forum.

Let's distinguish different stages of anger:

The "hot" kind—when one is not really thinking straight, prone to exaggeration and uncharitable interpretations, etc.

The "cool" kind—where one can think roughly as clearly about the topic as any other.

We could think of "hot" and "cool" anger as two ends of a spectrum.

Most people experience hot anger from time to time. But I think EA figures—especially senior figures—should model a norm of only posting on the EA Forum when fairly cool.

My impression is that, during the Bostrom and FLI incidents, several people posted with considerably more hot anger than I would endorse. In these cases, I think the mistake has been quite harmful, and may warrant public and private apologies.

As a positive example: Peter Hurford's blog post, which he described as "angry", showed a level of reasonableness and clarity that made it, in my mind, "above the bar" to publish. The text suggests a relatively cool anger. I disagree with some parts of the post, but I am glad he published it. At the meta-level, my impression is that Peter was well within the range of "appropriate states of mind" for a leadership figure to publish a message like that in public.

Comment by peterhartree (Peter_Hartree) on On Living Without Idols · 2023-01-19T09:23:10.895Z · EA · GW

Thank you for this.

Strong contender for "top 10 EA Forum posts of 2023, according to Peter Hartree".

Comment by peterhartree (Peter_Hartree) on The EA community does not own its donors' money · 2023-01-18T20:06:52.698Z · EA · GW

A number of recent proposals have detailed EA reforms. I have generally been unimpressed with these - they feel highly reactive and too tied to attractive sounding concepts (democratic, transparent, accountable) without well thought through mechanisms.


Why more democratic decision making would be better has gone largely unargued. To the extent it has been, "conflicts of interest" and "insularity" seem like marginal problems compared to basically having a deep understanding of the most important questions for the future/global health and wellbeing.

Agree. Mood affiliation is not a reliable path to impact. 

Comment by peterhartree (Peter_Hartree) on Doing EA Better · 2023-01-18T13:49:59.285Z · EA · GW

This post is much too long and we're all going to have trouble following the comments.

It would be much better to split this up and post as a series. Maybe do that, and replace this post with links to the series?

Comment by peterhartree (Peter_Hartree) on Doing EA Better · 2023-01-18T13:37:53.744Z · EA · GW

Context: I've worked in various roles at 80,000 Hours since 2014, and continue to support the team in a fairly minimal advisory role.

Views my own.

I agree that the heavy use of a poorly defined concept of "value alignment" has some major costs.

I've been moderately on the receiving end of this one. I think it's due to some combination of:

  1. I take Nietzsche seriously (as Derek Parfit did).
  2. I have a strong intellectual immune system. This means it took me several years to get enthusiastically on board with utilitarianism, longtermism and AI safety as focus areas. There's quite some variance on the speed with which key figures decide to take an argument at face value and deeply integrate it into their decision-making. I think variance on this dimension is good—as in any complex ecosystem, pace layers are important.
  3. I insisted on working mostly remotely.
  4. I've made a big effort to maintain an "FU money" relationship to the EA community, including a mostly non-EA friendship group.
  5. I am more interested in "deep" criticism of EA than some of my peers. E.g. I tweet about Peter Thiel on death with dignity, Nietzsche on EA, and I think Derek Parfit made valuable contributions but was not one of the Greats.
  6. Some of my object-level views have been quite different to those of my peers over the years. E.g. I’ve had reservations about the maximisation meme ever since I got involved, along the lines of those recently expressed by Holden.
  7. I mostly quit 80,000 Hours in autumn 2015, mainly due to concerns about strategy, messaging and fit with my colleagues. I took on a 90% time role again in 2017, for roughly 4 years.
  8. I have some beliefs and traits that some people find suggestive of a lack of moral seriousness (e.g. I'm into two-thirds utilitarianism; I don't lose sleep over EA/LT concerns; I'm fairly normie by EA standards).

There are some advantages to the status quo, and I don't have a positive proposal for improving this. 

If someone at 80K or CEA wants to pay me to think about it for a day, I'd be up for that. Maybe I'll do it anyway. I hesitate because I'm not sure how tractable this is, and I would not be surprised if, on further reflection, I came to think the status quo is roughly optimal at current margins.

Comment by peterhartree (Peter_Hartree) on Thread for discussing Bostrom's email and apology · 2023-01-16T13:56:59.854Z · EA · GW

In the original emails and the latest apology, he has done less to distance himself from racism than to endorse it.

In what ways do you think the 2023 message endorses racism? Is there a particular quote or feature of it that stands out to you?

The apology contains an emphatic condemnation of the use of a racist slur:

I completely repudiate this disgusting email from 26 years ago. It does not accurately represent my views, then or now. The invocation of a racial slur was repulsive. I immediately apologized for writing it at the time, within 24 hours; and I apologize again unreservedly today. I recoil when I read it and reject it utterly.

The 1996 email was part of a discussion of offensive communication styles. It included a heavily contested and controversial claim about group intelligence, which I will not repeat here. [1] Claims like these have been made by racist groups in the past, and an interest in such claims correlates with racist views. But there is not a strict correlation here: expressing or studying such claims does not entail you have racist values or motivations.

In general I see genetic disparity as one of the biggest underlying causes of inequality and injustice. I've no informed views or particular interests in averages between groups of different skin colour. But I do feel terrible for people who find themselves born with a difficult hand in the genetic lottery (e.g. a tendency to severe depression or dementia). And so I endorse research on genetic causes of chronic disadvantage, with the hope that we can improve things.

[1] This comment by Geoffrey Miller provides a bit more context on why Bostrom may have chosen this particular example.

Comment by peterhartree (Peter_Hartree) on Does EA understand how to apologize for things? · 2023-01-16T07:54:04.949Z · EA · GW

When someone's actions are criticised, they are often criticised for several different things. They may wish to apologise for some of these things, while explaining and defending others.

Comment by peterhartree (Peter_Hartree) on Do better, please ... · 2023-01-15T18:42:04.467Z · EA · GW

I think Bostrom 1996 deserves criticism for (ii).

He may deserve criticism for (i) as well.

Comment by peterhartree (Peter_Hartree) on Do better, please ... · 2023-01-15T18:31:38.592Z · EA · GW

While I appreciate posts like this, which speak about the importance of epistemic integrity, it seems to miss the fact that applauding someone for not lying is great but not if the belief they're holding is bad.

It is suggestive that you describe the belief as "bad" rather than "wrong", "incorrect" or "false".

It's fine to criticise people for (i) holding beliefs that are wrong, or for (ii) expressing beliefs that are probably best not expressed in a given context (whether they are true or false).

But it's important to separate, as best we can, claims about (a) whether a particular belief is true from claims about (b) whether holding that belief has good consequences, or (c) correlates with good moral character.

This post would be better if it made this distinction more clearly.

Comment by peterhartree (Peter_Hartree) on CEA statement on Nick Bostrom's email · 2023-01-15T02:10:44.613Z · EA · GW

Agree. At a meta-level, I was disappointed by the seemingly panicked and reactive nature of the statement. The statement is bad, and so, it seems, is the process that produced it.

Comment by peterhartree (Peter_Hartree) on Quick PSA: (not is a malicious scam site · 2023-01-13T20:56:01.273Z · EA · GW

(If you want to visit the domain but not have it saved in your Chrome omnibar, just open an Incognito window.)

Comment by peterhartree (Peter_Hartree) on Quick PSA: (not is a malicious scam site · 2023-01-13T20:54:04.320Z · EA · GW

Thanks for the heads up.

From memory: I tried to register this domain 4-5 years ago, but it was already taken. 80,000 Hours does own,, and several other variations.

Comment by peterhartree (Peter_Hartree) on Thread for discussing Bostrom's email and apology · 2023-01-13T14:00:59.088Z · EA · GW

Please also feel free to give us feedback on this thread and approach. This is the first time we’ve tried this in response to events that dominate Forum discussion. You can give feedback by commenting on the thread.

Seems good to me, especially as a "quick experiment".

I shared some related thoughts on Chris Leong's recent thread: Should the forum be structured such that the drama of the day doesn't occur on the front page?

Comment by peterhartree (Peter_Hartree) on Should the forum be structured such that the drama of the day doesn't occur on the front page? · 2023-01-13T12:32:43.041Z · EA · GW

In spirit I think the answer is "yes". 

That said, a bad implementation could easily be worse than the status quo. Some obvious risks are:

  1. It could seem like (and/or actually be) an unhelpful form of suppressing debate. 
  2. External critics will probably characterise it as such, no matter how well the relevant tradeoffs are struck.
  3. There may be various other valuable functions (e.g. a read on community mood, a venue to let off steam) that significantly offset the cost of distraction.

I don't have quick thoughts on what a good implementation would look like, beyond:

(a) The general view that it should exploit "the power of defaults".

(b) Maybe look at what Hacker News does. From memory, their algorithm attempts to detect politically charged topics and emerging flame wars.

(c) On the posting side, perhaps there could be a "want to take a break before posting?" prompt, or even an enforced delay. This might increase the quality of debate without much downside risk. I have noticed social media platforms experimenting with UI features along these lines.

Comment by peterhartree (Peter_Hartree) on My Thoughts on Bostrom's "Apology for an Old Email" · 2023-01-13T07:25:38.359Z · EA · GW

Thank you for posting this.

I roughly share the views you expressed here.

As you intended, the post is a valuable counterbalance to some of the other discussion.

Comment by peterhartree (Peter_Hartree) on CEA statement on Nick Bostrom's email · 2023-01-13T01:01:18.463Z · EA · GW

Some context:

  1. Bostrom's problematic email was written in 1996.

  2. Bostrom claims to have apologised for the email back in 1996, within 24 hours after sending it. If that's right, then the 2023 message is his second apology.

I am disappointed that the CEA statement does not include these details.

Comment by peterhartree (Peter_Hartree) on A personal response to Nick Bostrom's "Apology for an Old Email" · 2023-01-13T00:43:48.041Z · EA · GW

In my view, Bostrom's email would have been offensive in the 90s and it is offensive now, for good reason.


Bostrom’s apology is defensively couched - emphasising the age of the email, what others wrote on the listserv, that it would be best forgotten, the fear that people might smear him. I think that is cowardly and shows a disappointing lack of ownership of his actions.

I think these details are important context. I disagree with the final sentence.

When you are willfully disengaged from the empathy that underlies common decency

I don't see grounds for describing Bostrom in such harsh terms.

Comment by peterhartree (Peter_Hartree) on Reflections on Wytham Abbey · 2023-01-11T13:13:05.307Z · EA · GW

Personal view: the Wytham project is good. The lack of announcement and other proactive comms about it was a significant unforced error.

Comment by peterhartree (Peter_Hartree) on What are the most underrated posts & comments of 2022, according to you? · 2023-01-02T08:54:12.612Z · EA · GW

P.S. If you don't like the Bernard Williams stuff, I'd love to hear your quick thoughts on why.

He is a divisive figure, especially in Oxford philosophy circles. But Parfit was correct to take him seriously.

His book "Ethics & The Limits of Philosophy" is often recommended as the place to start.

Comment by peterhartree (Peter_Hartree) on What are the most underrated posts & comments of 2022, according to you? · 2023-01-01T23:48:23.811Z · EA · GW

My take: perhaps the more principled among us should make room for more messy fudges in our thought. Cluster thinking, bounded commensurability and two-thirds utilitarianism for the win.

Comment by peterhartree (Peter_Hartree) on What are the most underrated posts & comments of 2022, according to you? · 2023-01-01T23:21:56.609Z · EA · GW


Bernard Williams: Ethics and the limits of impartiality




Why it's good:

Derek Parfit saw Bernard Williams as his most important antagonist. Parfit was obsessed with Williams’ “Internal & External Reasons” paper for several decades.

My post introduces some of Bernard Williams’ views on metaphilosophy, metaethics and reasons.

What are we doing when we do moral philosophy? How should this self-understanding inform our practice of philosophy, and what we might hope to gain from it?

According to Williams:

Moral philosophy is about making sense of the situation in which we find ourselves, and deciding what to do about it.

Williams wants to push back against a “scientistic” trend in moral philosophy, and against philosophers who exhibit “a Platonic contempt for the human and the contingent in the face of the universal”. Such philosophers believe that:

if there were an absolute conception of the world, a representation of it which was maximally independent of perspective, that would be better than more perspectival or locally conditioned representations of the world.

And, relatedly:

that offering an absolute conception is the real thing, what really matters in the direction of intellectual authority

Williams thinks there’s another way. It may not give us everything we want, but perhaps it’s all we can have. 

If the post leaves you wanting more, I got into related themes on Twitter last night, in conversation with The Ghost of Jeremy Bentham after some earlier exegetical mischief. Scroll down and click “Show replies”.

Comment by peterhartree (Peter_Hartree) on What are the most underrated posts & comments of 2022, according to you? · 2023-01-01T23:00:16.010Z · EA · GW


EA’s brain-over-body bias, and the embodied value problem in AI alignment


Geoffrey Miller


Why it's good:

Embodied cognition is a hot topic in cognitive science. Are AI safety people overlooking this?

From Geoffrey’s introduction:

Evolutionary biology and evolutionary medicine routinely analyze our bodies’ biological goals, fitness interests, and homeostatic mechanisms in terms of how they promote survival and reproduction. However the EA movement includes some ‘brain-over-body biases’ that often make our brains’ values more salient than our bodies’ values. This can lead to some distortions, blind spots, and failure modes in thinking about AI alignment. In this essay I’ll explore how AI alignment might benefit from thinking more explicitly and carefully about how to model our embodied values.

Big, if there’s something to it. But the piece received one three-word comment...

Comment by peterhartree (Peter_Hartree) on What are the most underrated posts & comments of 2022, according to you? · 2023-01-01T22:48:36.631Z · EA · GW


Getting on a different train: can Effective Altruism avoid collapsing into absurdity?


Peter McLaughlin


Why it's good:

McLaughlin highlights a problem for people who want to say that scale matters, and also avoid the train to crazy town.

It's not clear how anyone actually gets off the train to crazy town. Once you allow even a little bit of utilitarianism in, the unpalatable consequences follow immediately. The train might be an express service: once the doors close behind you, you can’t get off until the end of the line.

As Richard Y. Chappell has put it, EAs want ‘utilitarianism minus the controversial bits’. Yet it’s not immediately clear how the models and decision-procedures used by Effective Altruists can consistently avoid any of the problems for utilitarianism: as examples above illustrate, it’s entirely possible that even the simplest utilitarian premises can lead to seriously difficult conclusions.

Tyler Cowen wrote a paper on this problem in 1996, called ‘What Do We Learn from the Repugnant Conclusion?’. McLaughlin's post opens with an excellent summary.

The upshot:

For any moral theory with universal domain where utility matters at all, either the marginal value of utility diminishes rapidly (asymptotically) towards zero, or considerations of utility come to swamp all other values.

Uh oh!

Comment by peterhartree (Peter_Hartree) on Announcing the EA Merch Store! · 2022-12-30T07:54:25.994Z · EA · GW

Can you say why?

I've just edited my comment to replace "appear" with "come across as" because maybe the original phrasing makes the point sound more focused on physical appearance than I intended.

Comment by peterhartree (Peter_Hartree) on Announcing the EA Merch Store! · 2022-12-30T01:40:05.804Z · EA · GW

Anecdotally, in November I noticed that "being someone who regularly wears EA t-shirts" was somewhat predictive of "being very upset by the FTX blow up". Obviously the merch won't be causing the upset, but the decision to wear merch may be part of a pattern of relating that leaves some people in a less than ideal position.

Comment by peterhartree (Peter_Hartree) on Announcing the EA Merch Store! · 2022-12-30T01:18:19.527Z · EA · GW

Merch helps EAs express themselves and fosters a sense of community. It can also help EAs recognize each other within the same setting or act as a conversation-starter at a function. But most importantly, it’s fun!

How confident are you that encouraging members of the EA community to wear EA merch is a good idea?

I've not thought about this much but my inside view is somewhat against EA merch.

Some ways that merch might backfire:

  1. Merch may encourage some individuals towards excessive identification with EA as a body of ideas or as a community.

  2. The merch, or the people who wear the merch, may come across as low status, off-putting or otherwise unattractive.

  3. Merch may contribute to unhelpful perceptions of EA (e.g. as a youth movement).

  4. People may unwittingly wear merch in contexts where it undermines their goals (e.g. damages their credibility).

  5. SBF might wear your merch.

Personally, I have a strong aversion to the idea of people wearing EA merch (I'm somewhat negative about startup merch and somewhat positive about band merch). I'm not sure where this comes from or how much to weigh it.

I've briefly discussed this with 5-10 people over the years. On a personal level, several shared my strong aversion to EA merch, while several others were neutral or strongly positive. I don't recall anyone having a confident overall view about whether merch should be encouraged.

I'd be interested to hear your thoughts on this, and especially your own steelman of the case that merch is, in fact, best discouraged.

If I thought about this more I guess that I'd end up moderately opposed to merch that uses the EA logo but neutral and relaxed about merch that is more oblique. Several of the items on your store are in the latter category and seem nicely done to me.

Comment by peterhartree (Peter_Hartree) on Working with the Beef Industry for Chicken Welfare · 2022-12-22T13:16:59.442Z · EA · GW

Brian Tomasik (2018) claims that pork, beef and milk cause far less animal suffering per meal than farmed fish, eggs, and chicken.

If he’s right, then even if we hold the total consumption of animal protein constant, it’d be good to shift consumption towards beef (or pork), and away from chicken (or farmed fish).

Working with beef producers to improve regulations around chicken welfare (and presumably raise the price of chicken) would probably lead people to replace some chicken meals with beef meals (e.g. “this chicken burger is expensive… I’ll have a beef burger instead”). This could be another important mechanism for reducing total harm, in addition to reducing average suffering per chicken meal and somewhat reducing total meat consumption.

(I’m only considering animal welfare here. An all-things-considered take would factor in environmental issues, public health, and so on…)

Comment by peterhartree (Peter_Hartree) on High-level hopes for AI alignment · 2022-12-22T11:50:45.543Z · EA · GW

Error: the "See here for audio version" link points to a different Cold Takes post.

The correct link is:

Comment by peterhartree (Peter_Hartree) on New blog: Some doubts about effective altruism · 2022-12-22T11:43:27.038Z · EA · GW

Cool. I'm interested to read some of these.

Hot take: I think you should change the name.

Current name has several issues:

(1) Confusing: based on the name alone, I'd expect the blog to contain very fundamental criticism of EA ideas or community, rather than criticism that is pretty well in the spirit of the enterprise. I'd also expect it to be more hostile than is justified by values or epistemic norms I share.

(2) Bad associations: I've seen the phrase "ineffective altruism" a few times before. All the examples I can remember were in the context of low-quality criticism and vibe-based sniping on Twitter.

(3) Hostile to a key audience: one of your most important audiences probably won't like the name much. If you're trying to have a good discussion, it's usually a bad idea to open by suggesting your interlocutor is misguided or "ineffective". [1]

The combination of (1)-(3) probably explains why this post doesn't already have 100 karma (83 at the time of writing). I'd guess the name will reduce engagement and sharing going forwards by at least 10% compared to a neutral name like "Thorstad's blog".

Thanks again for starting this. I will follow along with interest.

[1] Unfortunately the name "effective altruism" also does this, because it's dunking on the foil (regular altruism).

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-14T16:01:08.498Z · EA · GW

I’m not going to respond to the “show me the evidence” requests for now because I’m short on time and it’s hard to do this well. Also: I think you and most readers can probably identify a bunch of evidence in favour of these takes if you take a while to look.

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-14T15:57:50.326Z · EA · GW

I’m sorry to hear you’re finding this frustrating. Personally I’m enjoying our exchange because it’s giving me a reason to clarify and write down a bunch of things I’ve been thinking about for a long time, and I’m interested to hear what you and others make of them.

On Twitter I suggested we arrange a time to call. Would you be up for this? If yes, send me a DM.

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-14T15:54:03.701Z · EA · GW

it wasn't until philosophers raised the stakes to salience that x-risk started to be taken even close to sufficiently seriously.

I agree that philosophers, especially Derek Parfit, Nick Bostrom and Tyler Cowen*, have helped get this up the agenda. So too have many economists, astronomers, futurists, etc. Philosophers don’t have a monopoly on identifying what matters in practice—in fact they’re usually pretty bad at this.

The same goes if we look at social movements instead of individuals: the anti-nuclear bomb and environmental folks may have done more for getting catastrophic risk up the agenda than effective altruism has so far, especially in terms of generating a widespread culture of concern and sense of unease, which certainly warmed up the audience for Bostrom, Parfit, and so on.

The effective altruism movement is only just getting started (hopefully), and it has achieved remarkable successes already. So I do think we’re on track to play a critical role, and we have Bostrom and Parfit and Ord and Sidgwick and Cowen to thank for that, along with many, many others.

*Those who don’t see Tyler Cowen as fundamentally a philosopher—perhaps one of the greats, certainly better than Parfit (with whom he collaborated early on)—are not following carefully.

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-14T15:48:20.220Z · EA · GW

There is no sense to the idea that philosophically-informed decision-making is inherently more risky than philosophically ignorant decision-making. [Quite the opposite: it wasn't until philosophers raised the stakes to salience that x-risk started to be taken even close to sufficiently seriously.]

I strongly disagree with this. The key reason is: most of the time, norms that have been exposed to evolutionary selection pressures beat explicit “rational reflection” by individual humans. One of the major mistakes of Enlightenment philosophers was to think it is usually the other way around. These mistakes were plausibly a necessary condition for some of the horrific violence that’s taken place since they started trending.

I often run into philosophy graduates who tell me that relying on intuitive moral judgements about particular cases is “arrogant”. I reply by asking “where do these intuitions come from?” The metaphysical realists say “they are truths of reason, underwritten by the non-natural essence of rationality itself”. The naturalists say: “these intuitions were transmitted to you via culture and genetics, themselves subject to aeons of evolutionary pressure”. I side with the naturalists, despite all the best arguments for non-naturalism (to my mind, they’re mostly bad!).

One way to think about the 21st century predicament is that we usually learn via trial and error and selection pressures, but this dynamic in a world with modern technology seems unlikely to go well.

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-14T09:21:54.238Z · EA · GW

I also don't see any evidence for the claim of EA philosophers having "eroded the boundary between this kind of philosophizing and real-world decision-making".

Have you visited the 80,000 Hours website recently?

I think that effective altruism centrally involves taking the ideas of philosophers and using them to inform real-world decision-making. I am very glad we’re attempting this, but we must recognise that this is an extraordinarily risky business. Even the wisest humans are unqualified for this role. Many of our attempts are 51:49 bets at best—sometimes worth trying, rarely without grave downside risk, never without an accompanying imperative to listen carefully for feedback from the world. And yes—diverse, hedged experiments in overconfidence also make sense. And no, SBF was not hedged anything like enough to take his 51:49 bets—to the point of blameworthy, perhaps criminal negligence.

A notable exception to the “we’re mostly clueless” situation is: catastrophes are bad. This view passes the “common sense” test, and the “nearly all the reasonable takes on moral philosophy” test too (negative utilitarianism is the notable exception). But our global resource allocation mechanisms are not taking “catastrophes are bad” seriously enough. So, EA—along with other groups and individuals—has a role to play in pushing sensible measures to reduce catastrophic risks up the agenda (as well as the sensible disaster mitigation prep).

(Derek Parfit’s “extinction is much worse than 99.9% wipeout” claim is far more questionable—I put some of my chips on this, but not the majority.)

As you suggest, the transform function from “abstract philosophical idea” to “what do” is complicated and messy, and involves a lot of deference to existing norms and customs. Sadly, I think that many people with a “physics and philosophy” sensibility underrate just how complicated and messy the transform function really has to be. So they sometimes make bad decisions on principle instead of good decisions grounded in messy common sense.

I’m glad you shared the J.S. Mill quote.

…the beliefs which have thus come down are the rules of morality for the multitude, and for the philosopher until he has succeeded in finding better

EAs should not be encouraged to grant themselves practical exception from “the rules of morality for the multitude” if they think of themselves as philosophers. Genius, wise philosophers are extremely rare (cold take: Parfit wasn’t one of them).

To be clear: I am strongly in favour of attempts to act on important insights from philosophy. I just think that this is hard to do well. One reason is that there is a notable minority of “physics and philosophy” folks who should not be made kings, because their “need for systematisation” is so dominant as to be a disastrous impediment for that role.

In my other comment, I shared links to Karnofsky, Beckstead and Cowen expressing views in the spirit of the above. From memory, Carl Shulman is in a similar place, and so are Alexander Berger and Ajeya Cotra.

My impression is that more than half of the most influential people in effective altruism are roughly where they should be on these topics, but some of the top “influencers”, and many of the “second tier”, are not.

(Views my own. Sword meme credit: the artist currently known as John Stewart Chill.)

Comment by peterhartree (Peter_Hartree) on Reflections on Vox's "How effective altruism let SBF happen" · 2022-12-13T20:55:48.305Z · EA · GW

There's also Nick Beckstead disavowing his earlier "hardcore utilitarianism" in favour of something like Tyler Cowen's two-thirds utilitarianism.