How much does performance differ between people? 2021-03-25T22:56:32.660Z
Giving and receiving feedback 2020-09-07T07:24:33.941Z
What are novel major insights from longtermist macrostrategy or global priorities research found since 2015? 2020-08-13T09:15:39.622Z
Max_Daniel's Shortform 2019-12-13T11:17:10.883Z
When should EAs allocate funding randomly? An inconclusive literature review. 2018-11-17T14:53:38.803Z


Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-22T20:21:52.362Z · EA · GW

I think that people who are really enthusiastic about EA are pretty likely to stick around even when they're infuriated by things EAs are saying.


If you know someone (eg yourself) who you think is a counterargument to this claim of mine, feel free to message me.

I would guess it depends quite a bit on these people's total exposure to EA at the time when they encounter something they find infuriating (or even just somewhat off / getting a vibe that this community probably is "not for them").

If we're imagining people who've already had 10 or even 100 hours of total EA exposure, then I'm inclined to agree with your claim and sentiment. (Though I think there would still be exceptions, and I suspect I'm at least a bit more into "try hard to avoid people bouncing for reasons unrelated to actual goal misalignment" than you.) 

I'm less sure for people who are super new to EA as a school of thought or community.

We don't need to look at hypothetical cases to establish this. My memory of events 10 years ago is obviously hazy but I'm fairly sure that I had encountered both GiveWell's website and Overcoming Bias years before I actually got into EA. At that time I didn't understand what they were really about, and from skimming they didn't clear my bar of "this seems worth engaging with". I think Overcoming Bias seemed like some generic libertarian blog to me, and at the time I thought libertarians were deluded and callous; and for GiveWell I had landed on some in-the-weeds page on some specific intervention and I was like "whatever I'm not that interested in malaria [or whatever the page was about]". Just two of the many links you open, glance at for a few seconds, and then never (well, in this case luckily not quite) come back to.

This case is obviously very different from what we're discussing here. But I think it serves to reframe the discussion by illustrating that there are a number of different reasons why someone might bounce from EA depending on a number of that person's properties, with the amount of prior exposure being a key one. I'm skeptical that any blanket statement of the type "it's OK if people bounce for reason X" will do a good job of describing a good strategy for dealing with this issue.

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-22T14:14:00.098Z · EA · GW

The comment it is replying to doesn't seem at all hostile to me

(I mostly agree with your comment, but note that from the wording of ACE's comment it isn't clear to me if (a) they think that Jakub's comment is hostile or (b) that Hypatia's OP is hostile, or (c) that the whole discussion is hostile or whatever. To be clear, I think that kind of ambiguity is also a strike against that comment.)

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-22T14:03:26.807Z · EA · GW

I'm confused about why this comment was heavily downvoted. I'd be curious if people think (a) the norms ("assign greater credence to the beliefs of members of discriminated groups" etc.) described by Eze are bad, or (b) they don't accurately describe actual "social justice norms" or potential norms at ACE or whatever actual norms may be relevant for this discussion, and therefore the comment is beside the point, or (c) something else.

Comment by Max_Daniel on Wild Animal Initiative featured in Vox · 2021-04-22T12:39:54.631Z · EA · GW

FWIW, I can see where you're coming from but I think for me personally these kinds of posts are overall net good. The key thing is making me aware that some new piece has been published, which may be interesting for various use cases. (E.g. currently I'm designing an EA seminar series.) I don't care much about how the post is phrased; I get all the relevant info from the title + link.

Tbc, it seems very possible to me that there would be another way for me to get such information that doesn't take up Forum space or come with other costs.

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-21T21:27:19.826Z · EA · GW

Thank you, I thought this was a very thoughtful and helpful comment.

I added some thoughts to my previous comments here and here based on this.

(I also agree with the sentiment that, as you alluded to, in such situations it can be quite delicate to decide which information to make vs. not make public. FWIW my sense is that, given that substantial public discussion had already started, you navigated this well in this comment, but I'm also aware that this is something that is incredibly hard to assess "from the outside", and so I don't feel like I could reasonably be very confident about my assessment.)

Comment by Max_Daniel on If Bill Gates believes all lives are equal, why is he impeding vaccine distribution? · 2021-04-21T09:54:23.427Z · EA · GW

(Thanks, I thought that was a really useful comment for me to read. I had suspected that the OP was off in various ways - e.g. the "shortages suggest prices are too low" point seemed clear - but haven't followed the situation closely enough to know for sure what to think about some of the details, and this comment helped with that.)

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-20T22:06:53.433Z · EA · GW

Inferring from the list you wrote, you seem to be under the impression that the speaker in question was going to deliver a talk at the conference, but according to Eric Herboso's top-level comment, "the facebook commenter in question would be on a panel talking about BLM".

Yes, I had been under that impression (based on my vague memory of having heard about this situation when Buck had posted about it on Facebook). Given what Eric wrote, it sounds like you're probably right that the "baseline plan" was a panel rather than a talk, so obviously my list of potential compromises would need to be modified (change topic of the panel, move the person from the panel to a talk on another topic, make the panel "informal" etc.). I don't think this by itself matters much for the key points I was trying to make in my comment.

Separately, I agree that the second quote at least suggests that maybe what in fact happened was that ACE asked CARE to ban this person from even attending the conference. I haven't followed this situation enough to have an object-level view of whether I think that would have been a reasonable/good demand. I also didn't mean to say that the hypothetical compromises I suggested in my earlier comment would have in fact been good for the world overall. 

I still think that (i) negotiation by demanding/suggesting particular outcomes (as opposed to first getting on the same page about both parties' interests) is usually instrumentally bad even for one's own interests [so no matter the content of the demand, if ACE's negotiation strategy had been of that type, I'd think they probably made a mistake], (ii) confidentiality norms for the specifics of negotiations are often good, and would often be undermined even by disclosing just what one said oneself (i.e. without directly revealing anything the other party did or said), and (iii) while somewhat uncertain, I would tentatively push back pretty strongly against outside demands to break confidentiality, or against outside signals that discourage an org from entering into confidential negotiations in the future.

Indeed, (iii) was the main reason why I commented at all. I'm not that interested in what happens in the animal advocacy world, but I tentatively don't want incentives that punish EA-ish orgs for utilizing dispute resolution mechanisms that involve confidentiality because I think eroding the norm that confidentiality is OK in such situations could be pretty bad.

[ETA: I wasn't trying to comment on the object level, but for the record based on this comment by Anima International leadership staff it does sound like ACE may in fact not have approached the dispute in the way I outlined here. I'm specifically referring to claims that (1) ACE started the conversation by freezing funds and (2) them to some extent having violated strict confidentiality themselves by having disclosed that some of the 'negotiations' were about racism, attitudes toward DEI, etc. - I think that (2) may actually be a dynamic of the problem I pointed to, i.e., that by disclosing partial information you can create pressures on the involved parties to disclose even more information or otherwise justify their behavior. However, note that ACE may be disputing these claims.

To be clear, I also think that accurately assessing such situations "from the outside" is very tricky, and I don't think I can reasonably have a strong view on whether or not any detail such as disclosing the topic of some conversation was or wasn't a mistake. I also think that ACE is in a particularly tricky situation when navigating such matters because it's not some "random" actor who has a dispute with the conference organizers but also, by virtue of their mission, is committed to evaluating the conference organizers' performance. I think this raises some interesting, more general questions about good practices for navigating disputes between charity evaluators and charities.]

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-20T18:36:47.376Z · EA · GW

[Note that I have no idea whatsoever about what actually happened here. This is purely hypothetical.]

FWIW if I was in a position similar to ACE's here are a few potential "compromises" I would have explored. (Of course, which of these would be acceptable to me would depend on exactly why I'm concerned and how strongly, etc.) I think some of them wouldn't typically be considered deplatforming, though I would imagine that people who are against deplatforming would find many if not all of them at least somewhat objectionable (I would also guess that some who are pro maximal deplatforming in this case would find many if not all of these objectionable):

  • Changing the topic of the relevant speaker's talk
  • Adding a talk with a different perspective on BLM etc. to the conference program
  • Replacing the talk with a panel that also includes different perspectives
  • Removing the talk from the 'official' program but explicitly or implicitly allowing it to take part 'informally'
  • Adding some sort of disclaimer to the talk/program saying the conference organizers are aware this is a sensitive topic and they disagree with the speaker

I could probably generate a bunch of other ideas if I spent more time on this.

Perhaps more fundamentally, in the spirit of the 'Harvard method' for negotiations, I wouldn't have sent a list of demands or acceptable outcomes to CARE but would have stated my interests, i.e. roughly which properties of possible outcomes I care about. E.g. these might be things like "my staff members feel comfortable attending", "avoiding a perception by attendees or the public that CARE, or the animal advocacy movement more broadly, are racist" (or even "sending a signal that CARE/the animal advocacy movement support anti-racism/BLM"), etc.

I would then encourage CARE to state their interests, and then try to make sure they and I consider the full space of possible outcomes (which includes the potential compromises listed above and many more), and work with them to see if there is any outcome in there that is acceptable to both of us.

If this went well (even if there was no deal in the end) I would imagine such a conversation would involve a lot of back-and-forth, hinge on details, and that it would be quite hard to give outsiders an accurate picture of what the parties wanted. Certainly I wouldn't expect that "here is a list of compromises we wanted" would do a good job at this.

There is a separate question of whether, even if it was possible, it would be a good idea. I'm not sure. I think that on the one hand various stakeholders have an interest in understanding at least roughly what interests the parties were pursuing. E.g. if I was a donor to ACE b/c I want ACE to generally maximize the effectiveness of the animal advocacy movement, then I would want to know if the interests ACE was pursuing in this and similar negotiation situations were consistent with that mission. However, this may be covered by fairly general statements (e.g. the general goals and strategy of ACE) and accountability/governance mechanisms (e.g. ACE's board holding org leadership accountable to pursuing those goals and not others in all their dealings). On the other hand, too much transparency about the specifics of the conversation can easily jeopardize the parties' ability to reach a Pareto-preferred outcome, in particular if outside observers are irrational or prefer to 'punish' the other party. I also think it's generally conducive to a constructive conversation if the parties don't have to worry that in the case of no deal the other party will try to paint them in a bad light (since such worries increase the expected cost of bargaining); and unfortunately even disclosing just what one said oneself might have this effect, or at least create pressure on the other party to reveal more about their bargaining strategy. (E.g. if party A is like "we proposed X but B wouldn't accept", and B seems at first glance reasonable, this will create pressure on party B to explain why they thought X wasn't acceptable, which might rely on confidential or costly-to-disclose information, etc.)
It just seems like there are a bunch of tricky principal-agent and other problems to navigate here, and my impression is that certain norms of confidentiality have evolved around negotiations that often seem reasonable to me (though I don't feel highly confident, and I also don't have a great picture of what exactly the common norms look like). I'm definitely not convinced that it would be good to create public pressure on organizations that would prevent them from entering confidential conversations with other parties in the future. If I know that you have supporters who will pressure you into disclosing what we talk about, I might prefer not to negotiate with you in the first place, and we might both be worse off as a result.

Again, I have absolutely no idea how close any of my hypotheticals are to what actually went down here.

[ETA: I wasn't trying to comment on the object level, but for the record based on this comment by Anima International leadership staff it does sound like ACE may in fact not have approached the dispute in the way I outlined here. I'm specifically referring to claims that (1) ACE started the conversation by freezing funds and (2) them to some extent having violated strict confidentiality themselves by having disclosed that some of the 'negotiations' were about racism, attitudes toward DEI, etc. - I think that (2) may actually be a dynamic of the problem I pointed to, i.e., that by disclosing partial information you can create pressures on the involved parties to disclose even more information or otherwise justify their behavior. However, note that ACE may dispute these claims.

To be clear, I also think that accurately assessing such situations "from the outside" is very tricky, and I don't think I can reasonably have a strong view on whether or not any detail such as disclosing the topic of some conversation was or wasn't a mistake. I also think that ACE is in a particularly tricky situation when navigating such matters because it's not some "random" actor who has a dispute with the conference organizers but also, by virtue of their mission, is committed to evaluating the conference organizers' performance. I think this raises some interesting, more general questions about good practices for navigating disputes between charity evaluators and charities.]

Comment by Max_Daniel on Avoiding the Repugnant Conclusion is not necessary for population ethics: new many-author collaboration. · 2021-04-18T20:18:09.208Z · EA · GW

In general, I don't see how papers which say (little more than) "We agree with X" merit publication. What would be the point of a paper which said, e.g. "We, some utilitarian philosophers, do not think the usual objections to utilitarianism succeed because of the usual counter-objections"? We already know that philosophers believe a variety of things.

I have some sympathy to your general point. However, I think this case is relevantly different from utilitarian philosophers stating they agree with utilitarianism, for the following reasons:

  • Many philosophers working in (non-applied) ethics seem to have an attitude of extreme reverence toward Derek Parfit. Parfit rejected the Repugnant Conclusion [1], and essentially founded [2] the field of population ethics on the premise that its task was to find some 'Theory X' that would avoid the Repugnant Conclusion and other problems.
  • My impression is that most academic work in population ethics has in fact sought to avoid the Repugnant Conclusion, with papers such as Huemer's In Defence of Repugnance being the exception.
  • The name 'Repugnant Conclusion' suggests that it is obviously unacceptable.

None of these claims have true analogs for utilitarianism. It's not the case that the field of normative ethics was conceived as a project to defeat utilitarianism; there is plenty of work arguing for utilitarianism; etc.

More broadly, I think analytic philosophy has a tendency to spawn 'industries' that produce ever more refined attempts and rebuttals of formal theories that try to provide a solution to some problem, the framing of which is usually taken for granted. Perhaps the most infamous examples are countless attempts to find some definition or 'analysis' of the concept of knowledge in terms of more primitive concepts such as justification, truth, and belief, in response to Edmund Gettier's examples allegedly showing that knowledge can't simply be justified true belief. (Indeed, philosophers have discussed the 'Gettier Problem problem', i.e. the philosophical problem of explaining why solving the original Gettier Problem is pointless or otherwise problematic.) Other examples might be the logical positivist project to reduce meaning to predictions of sense data, attempts at defusing van Inwagen's Consequence Argument for the incompatibility of free will and determinism by providing counterexamples to one of its premises, or the ever-growing zoo of Frankfurt-style examples aimed at showing that moral responsibility does not require a 'could have done otherwise' property.

To the extent that there is progress in philosophy, I think it often consists in disrupting such industries by reframing the problem they were built on or forcefully arguing against some desideratum that was thought to be necessary for a 'solution'. (A more cynical view would be that such work merely replaces one flawed industry with the next.) At the very least, such work has often become famous, e.g. Quine's Two Dogmas of Empiricism, Strawson's Freedom and Resentment, Frankfurt's Alternate possibilities and moral responsibility and Freedom of the Will and the Concept of a Person, Kripke's Naming and Necessity, etc.

However, a comparison with such contributions also brings me back to where I agree with you: I think these philosophers provided value because they didn't merely state that they disagreed with, or disliked something about, some 'industry'. And, crucially, they even went beyond arguing for or explaining their view. They also made a positive contribution by showing what different and more fruitful work could look like. So e.g., roughly speaking, Quine said "you can't 'reduce' the meaning of an individual proposition to anything, you need to look at the full web of beliefs", Strawson said "the basis for moral responsibility lies not in questions of whether or not anyone could have done otherwise but in people's 'reactive attitudes' toward each other's behavior", Frankfurt regarding the same issue instead pointed to the internal structure of a moral agent's preferences, etc. Then other philosophers could and did make positive contributions by describing how meaning is a holistic property, what it is about the structure of an agent's internal preferences that makes them morally responsible for their actions, etc.

At least at first glance I couldn't find such a positive contribution in the paper we're discussing here. It's all well and good to say that one doesn't like the existing population ethics 'industry' - and I agree, in my view the field has been stale for a long time and has consisted mostly of footnotes to Parfit - but then what else do you want people to do? Quine wouldn't have been nearly as influential had he said "perhaps one day the correct approach to meaning will be uncovered, but I don't know whether Carnap would agree with it". And I suspect the lack of a clear positive recommendation or other 'way out' may prevent this paper from having the effect it tries to have. Though perhaps only time will tell. (E.g. arguably semantic holism wasn't exactly well developed in Two Dogmas itself.)



[1] At least Parfit clearly rejected the Repugnant Conclusion in Reasons and Persons, Part IV of which seems close to a 'founding document' for population ethics. As the authors of the paper discussed here mention, Parfit seems to later have somewhat changed his stance, though my memory from one of his last papers was still that he was hoping to advance some view avoiding the Repugnant Conclusion (through some combination of lexicality and incomparability or vagueness - indeed the paper was titled Can We Avoid the Repugnant Conclusion?). However, I'm no expert on Parfit's late work and could easily be wrong; e.g. I don't know what if anything he says on population ethics in On What Matters.

[2] There are papers on what we'd today call population ethics that precede Parfit's work, and Reasons and Persons in particular. However, my impression is that Parfit's work, and Reasons and Persons in particular, has had a dominant influence over subsequent work in what became known as population ethics, at least among analytic philosophers in a broadly consequentialist tradition. Again, I'm no expert on the history of population ethics, and would welcome corrections.

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-17T10:04:23.921Z · EA · GW

Are these correlations actually measuring something which could plausibly be linearly related (e.g. Z score for both IQ and income)?

I haven't looked at the papers to check and don't remember but my guess would be that it's something like this. Plus maybe some papers looking at things other than correlation.

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-16T14:44:01.650Z · EA · GW

I think that's a very fair point. I do think it would be possible to run the post by an org in an anonymous way (create a new email account & send clean copy of doc), but as e.g. Larks points out it's easy to accidentally break anonymity.

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-16T14:40:07.198Z · EA · GW

I'd be surprised if fewer than say 80% of the people who would say they find this very concerning will end up also reading the response from ACE. I'd be more worried if this were a case where most people would just form a quick negative association and won't follow up later when this all turns out to be more or less benign.

Yeah, I agree with this. I don't think the time delay is that big of a deal by itself, more like something that might be slightly annoying / slightly time-costly to a medium number of people.

Comment by Max_Daniel on Concerns with ACE's Recent Behavior · 2021-04-16T13:01:56.698Z · EA · GW

But I think there are reasons to not contact an org before, besides urgency, e.g. lacking time, or predicting that private communication will not be productive enough to spend the little time we have at our disposal. So I currently think we should approve if people bring up the energy to voice honest concerns even if they don’t completely follow the ideal playbook. What do you, or others think?

I agree with the spirit of "I currently think we should approve if people bring up the energy to voice honest concerns even if they don’t completely follow the ideal playbook".

However, at first glance I don't find the specific "reasons to not contact an org before" that you state convincing:

  • "Lacking time" - I think there are ways that require minimal time commitment. For instance, committing to not (or not substantially) revise the post based on an org's response. I struggle to imagine a situation where someone is able to spend several hours writing a post but then absolutely can't find the 10 minutes required to send an email to the org the post is about.
  • "Predicting that private communication will not be productive enough to spend the little time we have at our disposal" - I think this misunderstands one key reason for running a post about an org/person by that org/person before publishing. In my view, the key reason for this norm is not that private communication can be better but to improve a public conversation that's going to happen anyway by delaying it a bit so each involved party can 'prepare' for it.

Basically, I think in many cases the "ideal" process would be:

  • <author> runs post by <org> saying "fyi I'm going to publish this by <date>. I'm giving you a heads up b/c I think this is good practice and allows you to think about if/how to reply. If there are clear misunderstandings or factual mistakes in my post, let me know and I might be able to correct them. However, I'd prefer substantive discussion to take place publicly."
  • <org> deliberates internally if/how to reply publicly, and sends minor suggestions for corrections to <author>.
  • <author> posts, potentially after correcting some clear mistakes/misunderstandings (I expect this usually takes 5-30 minutes if the post was well done).
  • <org> posts their reply shortly after <author>'s post (if they want to post one).


Concretely, as a "public spectator" in such cases I tend to find it quite useful to "hear both sides". If I see something by just one side, then unless I see a reason for urgency, my reaction will tend to be "OK, I guess I'll wait to engage with this until I can hear both sides".

Comment by Max_Daniel on Possible misconceptions about (strong) longtermism · 2021-04-13T22:11:10.020Z · EA · GW

I agree that CL may or may not follow from AL depending on one's other ethical and empirical views.

However, I'm not sure I understand if and why you think this is a problem for longtermism specifically, as opposed to effective altruism more broadly. For instance, consider the typical EA argument for donating to more rather than less effective global health charities. I think that argument essentially is that donating to a more effective charity has better ex-ante effects. 

Put differently, I think many EAs donate to AMF because they believe that GiveWell has established that marginal donations to AMF have pretty good ex-ante effects compared to other donation options (at least if we only look at a certain type of effect, namely short-term effects on human beneficiaries). But I haven't seen many people arguing on the EA Forum that, actually, it is a misconception that someone has made a thorough case for donating to AMF because maybe making decisions solely by evaluating ex-ante effects is not a useful way of interacting with the world. [1]

So you directing a parallel criticism at longtermism specifically leaves me a little confused. Perhaps I'm misunderstanding you?

(I'm setting aside your potential empirical defeater '1.' since I largely agree with the discussion on it in the other responses to your comment. I.e. I think it is countered strongly, though not absolutely decisively, by the 'beware suspicious convergence' argument.)


[1] People have claimed that there isn't actually a strong case for donating to AMF; but usually such arguments are based on types of effects (e.g. on nonhuman animals or on far-future outcomes) that the standard pro-AMF case allegedly doesn't sufficiently consider rather than on claims that, actually, ex-ante effects are the wrong kind of thing to pay attention to in the first place.

Comment by Max_Daniel on Possible misconceptions about (strong) longtermism · 2021-04-13T21:51:29.446Z · EA · GW

I don't want to start a pointless industry of alternatively 'shooting down' & refining purported cases of simple cluelessness, but just for fun here is another reason for why our cluelessness regarding "conceiving a child on Tuesday vs. Wednesday" really is complex:

Shifting the time of conception by one day (ignoring the empirical complication pointed out by Denise below) also shifts the probability distribution of birth date by weekday, e.g. whether the baby's birth occurs on a Tuesday or Wednesday. However, for all we know the weekday of birth has a systematic effect on birth-related health outcomes of mother or child. For instance, consider some medical complication occurring during labor with weekday-independent probability, which needs to be treated in a hospital. We might then worry that on a Wednesday healthcare workers will tend to be more overworked, and so slightly more likely to make mistakes, than on a Tuesday (because many of them will have had the weekend off and so on Wednesday they've been through a larger period of workdays without significant time off). On the other hand, we might think that people are reluctant to go to a hospital on a weekend such that there'll be a "rush" on hospitals on Mondays, which takes until Wednesday to "clear" - making in fact Monday or Tuesday more stressful for healthcare workers. And so on and so on ...

(This is all made up, but if I google for relevant terms I pretty quickly find studies such as Weekday of Surgery Affects Postoperative Complications and Long-Term Survival of Chinese Gastric Cancer Patients after Curative Gastrectomy, or Outcomes are Worse in US Patients Undergoing Surgery on Weekends Compared With Weekdays, or Influence of weekday of surgery on operative complications. An analysis of 25.000 surgical procedures, or ...

I'm sure many of these studies are terrible but their existence illustrates that it might be pretty hard to justify an epistemic state that is committed to the effect of different weekdays exactly canceling out.)

((It doesn't help if we could work out the net effect on all health outcomes at birth, say b/c we can look at empirical data from hospitals. Presumably some non-zero net effect on e.g. whether or not we increase the total human population by 1 at an earlier time would remain, and then we're caught in the 'standard' complex cluelessness problem of working out whether the long-term effects of this are net positive or net negative etc.))

I'm wondering if a better definition of simple cluelessness would be something like: "While the effects don't 'cancel out', we are justified in believing that their net effect will be small compared to differences in short-term effects."

Comment by Max_Daniel on Possible misconceptions about (strong) longtermism · 2021-04-13T21:34:23.195Z · EA · GW

I'm also inclined to agree with this. I actually only very recently realized that a similar point had also been made in the literature: in this 2019 'discussion note' by Lok Lam Yim, which is a reply to Greaves's cluelessness paper:

This distinction between ‘simple’ and ‘complex’ cases of cluelessness, though an ingenious one, ultimately fails. Upon heightened scrutiny, a so-called ‘simple’ case often collapses into a ‘complex’ case. Let us consider Greaves’s example of a ‘simple’ case: helping an old lady cross the road. It is possible that this minor act of kindness has some impacts of systematic tendencies of a ‘complex’ nature. For instance, future social science research may show that old ladies often tell their grandchildren benevolent stories they have encountered to encourage their grandchildren to help others. Future psychological research may show that small children who are encouraged to help others are usually more charitable, and these children, upon reaching adulthood, are generally more sympathetic to the effective altruism movement, which Greaves considers a ‘complex’ case. This shows that a so-called ‘simple’ decision (such as whether to help an old lady to cross the road) can systematically lead to consequences of a ‘complex’ nature (such as an increase in the possibility of their grandchildren joining the effective altruism movement), thereby suffering from the same problem of genuine cluelessness as a ‘complex’ case.

Morally important actions are often, if not always, others-affecting. With the advancement of social science and psychological research, we are likely to discover that most others-concerning actions have some systematic impacts on others. These systematic impacts may lead to another chain of systematic impacts, and so on. Along the chain of systematic impacts, it is likely that at least one of them is of a ‘complex’ nature.

Comment by Max_Daniel on My personal cruxes for focusing on existential risks / longtermism / anything other than just video games · 2021-04-13T18:11:06.061Z · EA · GW

Hmm - good question if that would be true for one of my 'cruxes' as well. FWIW my immediate intuition is that it wouldn't, i.e. that I'd have >1% credence in all relevant assumptions. Or at least that counterexamples would feel 'pathological' to me, i.e. like weird edge cases I'd want to discount. But I haven't carefully thought about it, and my view on this doesn't feel that stable.

I also think the 'foundational' property you gestured at does some work for why my intuitive reaction is "this seems wild".

Thinking about this, I also realized that there may be a relevant distinction between "how it feels if I just look at my intuition" and "what my all-things-considered belief/'betting odds' would be after I take into account outside views, peer disagreement, etc.". The example that made me think of this was startup founders, or other people embarking on ambitious projects that, based on their reference class, are very likely to fail. [Though idk if 99% is the right quantitative threshold for cases that appear in practice.] I would guess that some people with that profile might say something like "sure, in one sense I agree that the chance of me succeeding must be very small - but it just does feel to me like I will succeed, and if I felt otherwise I wouldn't be doing what I'm doing".

Comment by Max_Daniel on My personal cruxes for focusing on existential risks / longtermism / anything other than just video games · 2021-04-13T11:46:29.925Z · EA · GW

Thank you for sharing! I generally feel pretty good about people sharing their personal cruxes underlying their practical life/career plans (and it's toward the top of my implicit "posts I'd love to write myself if I can find the time" list).

I must confess it seems pretty wild to me to have a chain of cruxes like this start with one in which one has a credence of 1%. (In particular one that affects one's whole life focus in a massive way.) To be clear, I don't think I have an argument that this epistemic state must be unjustified or anything like that. I'm just reporting that it seems very different from my epistemic and motivational state, and that I have a hard time imagining 'inhabiting' such a perspective. E.g. to be honest I think that if I had that epistemic state I would probably be like "uhm I guess if I was a consistent rational agent I would do whatever the beliefs I have 1% credence in imply, but alas I'm not, and so even if I don't endorse this on a meta level I know that I'll mostly just ~ignore this set of 1% likely views and do whatever I want instead".

Like, I feel like I could understand why people might be relatively confident in moral realism, even though I disagree with them. But this "moral realism wager" kind of view/life I find really hard to imagine :)

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-03T16:31:22.668Z · EA · GW

I'm not sure how extreme your general take on communication is, and I think at least I have a fairly similar view.

I agree that the kind of practical experiences you mention can be a good reason to be more careful with the use of some mathematical concepts but not others. I think I've seen fewer instances of people making fallacious inferences based on something being log-normal, but if I had I think I might have arrived at similar aspirations as you regarding how to frame things.

(An invalid type of argument I have seen frequently is actually the "things multiply, so we get a log-normal" part. But as you have pointed out in your top-level comment, if we multiply a small number of thin-tailed and low-variance factors we'll get something that's not exactly a 'paradigmatic example' of a log-normal distribution even though we could reasonably approximate it with one. On the other hand, if the conditions of the 'multiplicative CLT' aren't fulfilled we can easily get something with heavier tails than a log-normal. See also fn26 in our doc:

We’ve sometimes encountered the misconception that products of light-tailed factors always converge to a log-normal distribution. However, in fact, depending on the details the limit can also be another type of heavy-tailed distribution, such as a power law (see, e.g., Mitzenmacher 2004, sc. 5-7 for an accessible discussion and examples). Relevant details include whether there is a strictly positive minimum value beyond which products can’t fall (ibid., sc. 5.1), random variation in the number of factors (ibid., sc. 7), and correlations between factors.
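To illustrate the point in that footnote, here is a quick simulation sketch (my own, not from the doc; the uniform factors and sample sizes are arbitrary illustrative choices): products of a few thin-tailed, low-variance factors stay fairly 'equal', while products of many such factors become noticeably heavy-tailed in the everyday, top-1%-share sense.

```python
import math
import random

random.seed(0)

def top_share(values, pct=0.01):
    """Share of the total sum held by the top `pct` fraction of values."""
    vals = sorted(values, reverse=True)
    k = max(1, int(len(vals) * pct))
    return sum(vals[:k]) / sum(vals)

def product_sample(n_factors, n_samples=50_000):
    """Products of i.i.d. uniform factors on [0.5, 1.5] (each thin-tailed)."""
    return [math.prod(random.uniform(0.5, 1.5) for _ in range(n_factors))
            for _ in range(n_samples)]

few = top_share(product_sample(3))    # few factors: nearly symmetric, thin tails
many = top_share(product_sample(30))  # many factors: visibly right-skewed

print(f"top-1% share, 3 factors:  {few:.3f}")
print(f"top-1% share, 30 factors: {many:.3f}")
```

With only three factors the top-1% share stays close to the 1% 'perfect equality' baseline; with thirty factors it is an order of magnitude larger, even though every individual factor is as tame as can be.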


Comment by Max_Daniel on How much does performance differ between people? · 2021-04-03T16:20:48.336Z · EA · GW

I think the main takeaway here is that you find that section confusing. That's not something one can "argue away", and it does point to room for improvement in my writing. :)

With that being said, note that we in fact don't say anywhere that anything 'is thin-tailed'. We just say that some paper 'reports' a thin-tailed distribution, which seems uncontroversially true. (OTOH I can totally see that the "by contrast" is confusing on some readings. And I also agree that it basically doesn't matter what we say literally - if people read what we say as claiming that something is thin-tailed, then that's a problem.)

FWIW, from my perspective the key observations (which I apparently failed to convey in a clear way at least for you) here are:

  • The top 1% share of ex-post "performance" [though see elsewhere that maybe that's not the ideal term] data reported in the literature varies a lot, at least between 3% and 80%. So usually you'll want to know roughly where on the spectrum you are for the job/task/situation relevant to you rather than just whether or not some binary property holds.
  • The range of top 1% shares is almost as large for data for which the sources used a mathematically 'heavy-tailed' type of distribution as model. In particular, there are some cases where some source reports a mathematically 'heavy-tailed' distribution but where the top 1% share is barely larger than for other data based on a mathematically 'thin-tailed' distribution.
    • (As discussed elsewhere, it's of course mathematically possible to have a mathematically 'thin-tailed' distribution with a larger top 1% share than a mathematically 'heavy-tailed' distribution. But the above observation is about what we in fact find in the literature rather than about what's mathematically possible. I think the key point here is not so much that we haven't found a 'thin-tailed' distribution with a larger top 1% share than some 'heavy-tailed' distribution, but that the mathematical 'heavy-tailed' property doesn't cleanly distinguish data/distributions by their top 1% share even in practice.)
  • So don't look at whether the type of distribution used is 'thin-tailed' or 'heavy-tailed' in the mathematical sense, ask how heavy-tailed in the everyday sense (as operationalized by top 1% share or whatever you care about) your data/distribution is.
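A quick simulation can make these observations concrete (my own sketch; the distributions and parameters are illustrative, not taken from the sources we reviewed): a mathematically 'heavy-tailed' log-normal with small sigma has a top-1% share barely above that of a mathematically 'thin-tailed' normal, while the same log-normal family with large sigma is extremely concentrated.

```python
import random

random.seed(1)
N = 100_000

def top_share(values, pct=0.01):
    """Share of the total sum held by the top `pct` fraction of values."""
    vals = sorted(values, reverse=True)
    k = max(1, int(len(vals) * pct))
    return sum(vals[:k]) / sum(vals)

# Mathematically thin-tailed: a normal distribution located far from zero.
normal = [max(0.0, random.gauss(100, 15)) for _ in range(N)]

# Mathematically heavy-tailed (log-normal), but with a small sigma...
lognorm_mild = [random.lognormvariate(0, 0.25) for _ in range(N)]

# ...and the very same family with a large sigma.
lognorm_wild = [random.lognormvariate(0, 2.5) for _ in range(N)]

s_normal = top_share(normal)
s_mild = top_share(lognorm_mild)
s_wild = top_share(lognorm_wild)

print(f"normal(100, 15):        top-1% share = {s_normal:.3f}")
print(f"log-normal(sigma=0.25): top-1% share = {s_mild:.3f}")
print(f"log-normal(sigma=2.5):  top-1% share = {s_wild:.3f}")
```

The binary 'heavy-tailed vs. thin-tailed' label puts the two log-normals in the same bucket and the normal in another, but by the everyday top-1%-share metric the mild log-normal sits right next to the normal, far away from its heavy sibling.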

So basically what I tried to do is mentioning that we find both mathematically thin-tailed and mathematically heavy-tailed distributions reported in the literature in order to point out that this arguably isn't the key thing to pay attention to. (But yeah I can totally see that this is not coming across in the summary as currently worded.)

As I tried to explain in my previous comment, I think the question whether performance in some domain is actually 'thin-tailed' or 'heavy-tailed' in the mathematical sense is closer to ill-posed or meaningless than true or false. Hence why I set aside the issue of whether a normal distribution or similar-looking log-normal distribution is the better model.

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-03T10:51:35.402Z · EA · GW

As an aside, for a good and philosophically rigorous criticism of cavalier assumptions of normality or (arguably) pseudo-explanations that involve the central limit theorem, I'd recommend Lyon (2014), Why are Normal Distributions Normal?

Basically I think that whenever we are in the business of understanding how things actually work/"why" we're seeing the data distributions we're seeing, often-invoked explanations like the CLT or "multiplicative" CLT are kind of the tip of the iceberg that provides the "actual" explanation (rather than being literally correct by themselves), this iceberg having to do with the principle of maximum entropy / the tendency for entropy to increase / 'universality' and the fact that certain types of distributions are 'attractors' for a wide range of generating processes. I'm too much of an 'abstract algebra person' to have a clear sense of what's going on, but I think it's fairly clear that the folk story of "a lot of things 'are' normally distributed because of 'the' central limit theorem" is at best an 'approximation' and at worst misleading.

(One 'mathematical' way to see this is that it's fishy that there are so many different versions of the CLT rather than one clear 'canonical' or 'maximally general' one. I guess stuff like this also is why I tend to find common introductions to statistics horribly unaesthetic and have had a hard time engaging with them.) 

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-03T10:38:35.670Z · EA · GW

I haven't read the meta-analysis, but I'd tentatively bet that much like biological properties these jobs actually follow log-normal distributions and they just couldn't tell (and weren't trying to tell) the difference. 

I kind of agree with this (and this is why I deliberately said that "they report a Gaussian distribution" rather than e.g. "performance is normally distributed"). In particular, yes, they just assumed a normal distribution and then ran with this in all cases in which it didn't lead to obvious problems/bad fits no matter the parameters. They did not compare Gaussian with other models.

I still think it's accurate and useful to say that they were using (and didn't reject) a normal distribution as model for low- and medium-complexity jobs, as this does tell you something about what the data looks like. (Since there is a lot of possible data for which no normal distribution is a reasonable fit.)

I also agree that probably a log-normal model is "closer to the truth" than a normal one. But on the other hand I think it's pretty clear that actually neither a normal nor a log-normal model is fully correct. Indeed, what would it mean that "jobs actually follow a certain type of distribution"?

If we're just talking about fitting a distribution to data, we will never get a perfect fit, and all we can do is provide goodness-of-fit statistics for different models (which usually won't conclusively identify any single one). This kind of brute/naive empiricism just won't and can't get us to "how things actually work".

On the other hand, if we try to build a model of the causal generating mechanism of job performance, it seems clear that the 'truth' will be much more complex and messy - we will only have finitely many contributing factors (and a log-normal distribution is something we'd get at best "in the limit"), the contributing factors won't all be independent, etc. Indeed, "probability distribution" to me basically seems like the wrong type to talk about when we're in the business of understanding "how things actually work" - what we want then is really a richer and more complex model (in the sense that we could have several different models that would yield the same approximate data distribution but that would paint a fairly different picture of "how things actually work"; basically I'm saying that things like 'quantum mechanics' or 'the Solow growth model' or whatever have much more structure and are not a single probability distribution).

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-03T10:22:01.174Z · EA · GW

Yeah, I think we agree on the maths, and I'm quite sympathetic to your recommendations regarding framing based on this. In fact, emphasizing "top x% share" as metric and avoiding any suggestion that it's practically useful to treat "heavy-tailed" as a binary property were my key goals for the last round of revisions I made to the summary - but it seems like I didn't fully succeed.

FWIW, I maybe wouldn't go quite as far as you suggest in some places. I think the issue of "mathematically 'heavy-tailed' distributions may not be heavy-tailed in practice in the everyday sense" is an instance of a broader issue that crops up whenever one uses mathematical concepts that are defined in asymptotic terms in applied contexts. 

To give just one example, consider that we often talk of "linear growth", "exponential growth", etc. I think this is quite useful, and that it would overall be bad to 'taboo' these terms and always replace them with some 'model-agnostic' metric that can be calculated for finitely many data points. But there we face the analogous issue that, depending on the parameters, an exponential function can for practical purposes look very much like a linear function over the relevant finite range of data.
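A toy check of this (my own sketch with arbitrary numbers): over a window that is short relative to 1/rate, an exponential is nearly indistinguishable from its tangent-line approximation, while over a long window the two diverge completely.

```python
import math

def max_rel_gap(rate, xs):
    """Largest relative gap between exp(rate*x) and its tangent-line
    approximation 1 + rate*x over the points xs."""
    return max(abs(math.exp(rate * x) - (1 + rate * x)) / math.exp(rate * x)
               for x in xs)

xs_short = range(0, 11)     # window short compared to 1/rate = 100
xs_long = range(0, 1001)    # window spanning many multiples of 1/rate

near = max_rel_gap(0.01, xs_short)  # exponential is ~linear here
far = max_rel_gap(0.01, xs_long)    # linearity breaks down badly here

print(f"max relative gap over [0, 10]:   {near:.4f}")
print(f"max relative gap over [0, 1000]: {far:.4f}")
```

Over the short window no finite data set would let you tell the two models apart in practice; over the long window the linear approximation misses almost the entire value.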

Another example would be computational complexity, e.g. when we talk about algorithms being "polynomial" or "exponential" regarding how many steps they require as function of the size of their inputs.

Yet another example would be attractors in dynamical systems.

In these and many other cases we encounter the same phenomenon that we often talk in terms of mathematical concepts that by definition only tell us that some property holds "eventually", i.e. in the limit of arbitrarily long amounts of time, arbitrarily much data, or similar.

Of course, being aware of this really is important. In practice it often is crucial to have an intuition or more precise quantitative bounds on e.g. whether we have enough data points to be able to use some computational method that's only guaranteed to work in the limit of infinite data. And sometimes we are better off using some algorithm that for sufficiently large inputs would be slower than some alternative, etc.

But on the other hand, talk in terms of 'asymptotic' concepts often is useful as well. I think one reason for why is that in practice when e.g. we say that something "looks like a heavy-tailed distribution" or that something "looks like exponential growth" we tend to mean "the top 1% share is relatively large / it would be hard to fit e.g. a normal distribution" or "it would be hard to fit a straight line to this data" etc., as opposed to just e.g. "there is a mathematically heavy-tailed distribution that with the right parameters provides a reasonable fit" or "there is an exponential function that with the right parameters provides a reasonable fit". That is, the conventions for the use of these terms are significantly influenced by "practical" considerations (and things like Grice's communication maxims) rather than just their mathematical definition.

So e.g. concretely when in practice we say that something is "log-normally distributed" we often do mean that it looks more heavy-tailed in the everyday sense than a normal distribution (even though it is a mathematical fact that there are log-normal distributions that are relatively thin-tailed in the everyday sense - indeed we can make most types of distributions arbitrarily thin-tailed or heavy-tailed in this sense!).
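This last point has a clean closed form worth writing down (standard log-normal algebra, not something from our doc): the top-p share of a log-normal depends only on the log-scale sd sigma, via share = Phi(sigma - Phi^{-1}(1 - p)), so as sigma shrinks toward 0 the top-1% share approaches 1% itself, i.e. the distribution becomes as 'thin-tailed' in the everyday sense as you like.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal, provides Phi and its inverse

def lognormal_top_share(sigma, pct=0.01):
    """Exact share of the total held by the top `pct` fraction of a
    log-normal distribution with log-scale standard deviation `sigma`:
    share = Phi(sigma - Phi^{-1}(1 - pct))."""
    return std.cdf(sigma - std.inv_cdf(1 - pct))

for sigma in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(f"sigma = {sigma:>3}: top-1% share = {lognormal_top_share(sigma):.3f}")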

Comment by Max_Daniel on Announcing "Naming What We Can"! · 2021-04-03T09:46:21.072Z · EA · GW

Hmm, how about EYCTLEWTU (Estimates You Cannot Take Literally Even When They're Unbiased)?

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-02T13:00:20.608Z · EA · GW

I'm really interested in the relation between the increasing number of AI researchers and the associated rate of new ideas in AI.

Yeah, that's an interesting question. 

One type of relevant data that's different from looking at the output distribution across scientists is the evolution of total researcher hours on one hand and measures of total output on the other. Bloom and colleagues' Are ideas getting harder to find? collects such data and finds that research productivity, i.e. roughly output per researcher hour, has been falling everywhere they look:

The number of researchers required today to achieve the famous doubling of computer chip density is more than 18 times larger than the number required in the early 1970s. More generally, everywhere we look we find that ideas, and the exponential growth they imply, are getting harder to find.

Comment by Max_Daniel on Announcing "Naming What We Can"! · 2021-04-02T08:11:07.046Z · EA · GW

I think this would be a step in the right direction but wouldn't go far enough. People will still confuse us because of our shared first name.

I'm thinking that maybe I should go for Min Cat Astrophe instead. I'm hoping that the middle name would also help make me more popular on the Internet.

Comment by Max_Daniel on [New org] Canning What We Give · 2021-04-01T14:53:36.014Z · EA · GW

May I kindly suggest "Yes we can!" as a promotional slogan for your new organization? It seems to have a good track record.

Comment by Max_Daniel on How much does performance differ between people? · 2021-04-01T07:45:22.700Z · EA · GW

That's very interesting, thanks for sharing!

ETA: I've added this to our doc acknowledging your comment.

Comment by Max_Daniel on What are your main reservations about identifying as an effective altruist? · 2021-03-31T17:04:40.934Z · EA · GW

That definitely resonates, and is one reason why I tend to not heavily emphasize EA as label or community when I interface with the "non-EA world".

Comment by Max_Daniel on What are your main reservations about identifying as an effective altruist? · 2021-03-31T08:54:02.195Z · EA · GW

I feel fine about referring to myself as "an EA" in contexts where this is convenient and doesn't imply major "identity" or "ideological" commitments. And indeed I sometimes do so.

In many ways the label does seem quite descriptive of my views and career goals.

I don't feel like I identify as an EA in any strong sense, or like I would want to describe myself as such no matter the context. For me, this partly has to do with how I think about EA relative to other life goals. The way I roughly see it now, maximizing impartial goodness is one of the most important goals I'm pursuing; but there are also other, more personal goals. I feel like tying my identity to just one of them would "privilege" that one goal at the expense of others, in a way that messes with my way of internally resolving conflicts between them (and of course such conflicts do sometimes come up). This feels true to me even if it turned out that in some sense, maximizing impartial goodness was my most important goal, or the one I cared most about, or similar.

For a couple of months during the first year after I had encountered EA I was more in a mindset of "EA is the most important/only goal, and I can pursue other goals only insofar as they're instrumentally useful or it would be psychologically impossible for me to not pursue them". Partly this was due to bad social influences. This isn't exactly the same as "identifying as an EA", but I now think my mindset at the time was both unhealthy and instrumentally harmful for my long-term ability to do good, and so it's one key reason for why I'm skeptical about, in some sense, "emphasizing EA too much".

[I wasn't at the Leaders Forum 2019.]

Comment by Max_Daniel on Max_Daniel's Shortform · 2021-03-30T09:46:26.077Z · EA · GW

Thanks Tony, I appreciate you engaging here.

It sounds to me like we're largely on the same page, and that my original post may have somewhat overstated the differences between at least some 'human progress' proponents and the longtermist EA perspective. 

On the other hand, looking at just the priorities revealed by what these communities focus on in practice, it does seem like there must be some disagreements.

FWIW, I would guess that one of the main ways in which what you say would lead to pushback from EAs is that at times it sounds somewhat anthropocentric - i.e. considering the well-being of humans, but not non-human animals. 

  • Many if not most EAs believe that nonhuman animals have the capacity to suffer in a morally relevant way, and so consider factory farming to be a moral catastrophe not just because of its adverse effects on the human population or the climate, but for the sake of farmed animals having bad lives under cruel conditions (whether these animals are raised and slaughtered in the US or in China - less intensive animal farming is much more widespread outside of the US, but even globally the vast majority of farmed animals live in factory farms because these have so much larger animal populations).
  • On the other hand, I do think there may be a fair amount of convergence between EA and 'human progress proponents' on how to address this problem. In particular, I think it's fair to say that EAs tend to be less ideologically committed to particular strategies such as vegan outreach and instead, as they try to do in any cause, adopt an open-minded approach in order to identify whatever works best. E.g. they're at least open to and have funded welfare reforms, tend to be interested in clean meat and other animal product alternatives - e.g. the Good Food Institute is one of the very few top charities recommended by the EA-aligned Animal Charity Evaluators.
  • EAs have also pushed the envelope on which cause areas may merit consideration if one cares about the suffering of non-human animals. For instance, they're aware that many more marine than land animals are directly killed for human consumption, and have helped launch new organizations in this area such as the Fish Welfare Initiative. In terms of even more "out there" topics, EAs have considered the well-being of wild animals (taking them seriously as individuals we care about rather than at the species level for the sake of biodiversity), including whether insects and other invertebrates may have the capacity to suffer (which is relevant both because most animals alive are invertebrates and for evaluating insect farming as another reaction to the issues of factory farming).

To be clear, I think we may well agree on most of this. And it's not directly relevant to the future-related issues we've been discussing (though see e.g. here and here). I'm partly mentioning this because the ideal communication strategy for engaging EAs in this particular area probably looks a bit different since EA has such an unusually large fraction of people who are unusually sympathetic to, and open about, farmed and wild animal welfare being globally important considerations.

Comment by Max_Daniel on Some quick notes on "effective altruism" · 2021-03-29T17:28:58.790Z · EA · GW

I think "spreading the ideas widely" is different from "making the community huge"

Yeah, I think that's an important insight I also agree with.

In an ideal world the best thing to do would be to expose everyone to some kind of "screening device" (e.g. a pitch or piece of content with a call to action at the end) which draws them into the EA community if and only if they'd make a net valuable contribution. In the actual world there is no such screening device, but I suspect we could still do more to expand the reach of "exposure to the initial ideas / basic framework of EA" while relying on self-selection and existing gatekeeping mechanisms for reducing the risk of dilution etc.

My main concern with such a strategy would actually not be that it risks dilution but that it would be more valuable once we have more of a "task Y", i.e. something a lot of people can do. (Or some other change that would allow us to better utilize more talent.)

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-29T15:11:15.415Z · EA · GW

Related: I remember a comment (can't find it anymore) somewhere by Liv Boeree or some other poker player familiar with EA. The commenter explained that monetary results aren't the greatest metric for assessing the skill of top poker players. Instead, it's best to go with assessments by expert peers. (I think this holds mostly for large-field tournaments, not online cash games.)

If I remember correctly, Linchuan Zhang made or referred to that comment somewhere on the Forum when saying that it was similar with assessing forecaster skill. (Or maybe it was you? :P)

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-29T15:09:44.807Z · EA · GW

I'm not trying to be obtuse, it wasn't super clear to me on a quick-ish skim; maybe if I'd paid more attention I've have clocked it.

FWIW I think it's the authors' job to anticipate how their audience is going to engage with their writing, where they're coming from, etc. - you were not the only one who reacted by pushing back against our framing, as is evident e.g. from Khorton's much-upvoted comment.

So no matter what we tried to convey, and what info is in the post or document if one reads closely enough, I think this primarily means that I (as main author of the wording in the post) could have done a better job, not that you or anyone else is being obtuse.

Comment by Max_Daniel on Some quick notes on "effective altruism" · 2021-03-29T12:09:24.451Z · EA · GW

I think EA is unlikely to be able to attract >1% of the (Western and non-Western) population primarily because understanding EA ideas (and being into them) typically requires a scientific and prosocial/altruistic mindset, advanced education, and the right age (no younger than ~16, not old enough to be too busy with lots of other life goals). Trying to attract >1% of the population would in my view likely lead to a harmful dilution of the EA community.

Thanks for stating your view on this as I would guess this will be a crux for some.

FWIW, I'm not sure if I agree with this. I certainly agree that there is a real risk from 'dilution' and other risks from both too rapid growth and a too large total community size.

However, I'm most concerned about these risks if I imagine a community that's kind of "one big blob" without much structure. But that's not the only strategy on the table. There could also be a strategy where the total community is quite large but there is structure and diversity within the community regarding what exactly 'being an EA' means for people, who interacts with whom, who commands how many resources, etc.

I feel like many other professional, academic, or political communities are both quite large overall and, at least to some extent, maintain spaces that aren't harmed by "dilution". Perhaps most notably, consider that almost any academic discipline is huge, and yet there is formal and informal structure that to some extent separates the wheat from the chaff. There is the majority of people who drop out of academia after their PhDs and the tiny minority who become professors; there is the majority of papers that will never be cited or are of poor quality, and then there are the very few top journals; there is the majority of colleges and universities where faculty are mostly busy teaching and from where we don't expect much innovation, and the tiny fraction of research-focused top universities, etc.

I'm not saying this is clearly the way to go, or even feasible at all, for EA. But I do feel quite strongly that "we need to protect spaces for really high-quality interactions and intellectual progress" or similar - even if we buy it as an assumption - does not imply it's best to keep the total size of the community small.

Perhaps as an intuition pump, consider what the life of Ramanujan might have looked like if there hadn't existed a maths book, a "non-elite" university, and other education accessible to someone in his situation.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-29T11:51:28.399Z · EA · GW

it's quite surprising to me that publishing a highly cited paper early in one's career isn't correlated with a larger total number of citations, at the high-performing tail (did I understand that right? Were they considering the right tail?)

No, they considered the full distribution of scientists with long careers and sustained publication activity (which themselves form the tail of the larger population of everyone with a PhD). 

That is, their analysis includes the right tail but wasn't exclusively focused on it. Since by its very nature there will only be few data points in the right tail, it won't have a lot of weight when fitting their model. So it could in principle be the case that if we looked only at the right tail specifically this would suggest a different model.

It is certainly possible that early successes may play a larger causal role in the extreme right tail - we often find distributions that are mostly log-normal, but with a power-law tail, suggesting that the extreme tail may follow different dynamics.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-29T11:44:19.539Z · EA · GW

Ed Boyden at MIT has this idea of "hidden gems" in the literature, which are extremely undercited papers with great ideas: I believe the original idea for PCR, a molecular bio technique, had been languishing for at least 5 years with very little attention before its later rediscovery.

A related phenomenon has been studied in the scientometrics literature under the label 'sleeping beauties'.

Here is what Clauset et al. (2017, pp. 478f.) say in their review of the scientometrics/'science of science' field:

However, some discoveries do not follow these rules, and the exceptions demonstrate that there can be more to scientific impact than visibility, luck, and positive feedback. For instance, some papers far exceed the predictions made by simple preferential attachment (5, 6). And then there are the "sleeping beauties" in science: discoveries that lay dormant and largely unnoticed for long periods of time before suddenly attracting great attention (7–9). A systematic analysis of nearly 25 million publications in the natural and social sciences over the past 100 years found that sleeping beauties occur in all fields of study (9).

Examples include a now famous 1935 paper by Einstein, Podolsky, and Rosen on quantum mechanics; a 1936 paper by Wenzel on waterproofing materials; and a 1958 paper by Rosenblatt on artificial neural networks. The awakening of slumbering papers may be fundamentally unpredictable in part because science itself must advance before the implications of the discovery can unfold.

[See doc linked in the OP for full reference.]

Comment by Max_Daniel on Max_Daniel's Shortform · 2021-03-29T11:29:12.390Z · EA · GW

[Longtermist EA vs. human progress/progress studies.

I'm posting a quick summary of my current understanding, which I needed to write anyway for an email conversation. I'm not that familiar with the human progress/progress studies communities and would be grateful if people pointed out where my impression of them seems off, as well as for takes on whether I seem correct about what the key points of agreement and disagreement are.]


Here's a quick summary of my understanding of the 'longtermist EA' and 'progress studies' perspectives, in a somewhat cartoonish way to gesture at points of agreement and disagreement. 

EA and progress studies mostly agree about the past. In particular, they agree that the Industrial Revolution was a really big deal for human well-being, and that this is often overlooked/undervalued. E.g. here's a blog post by someone somewhat influential in EA:

Looking to the future, the progress studies community is most worried about the Great Stagnation. They are nervous that science seems to be slowing down, that ideas are getting harder to find, and that economic growth may soon be over. Industrial-Revolution-level progress was by far the best thing that ever happened to humanity, but we're at risk of losing it. That seems really bad. We need a new science of progress to understand how to keep it going. Probably this will eventually require a number of technological and institutional innovations since our current academic and economic systems are what's led us into the current slowdown.

If we were making a list of the most globally consequential developments from the past, EAs would in addition to the Industrial Revolution point to the Manhattan Project and the hydrogen bomb: the point in time when humanity first developed the means to destroy itself. (They might also think of factory farming as an example for how progress might be great for some but horrible for others, at least on some moral views.) So while they agree that the world has been getting a lot better thanks to progress, they're also concerned that progress exposes us to new nuclear-bomb-style risks. Regarding the future, they're most worried about existential risk -- the prospect of permanently forfeiting our potential of a future that's much better than the status quo. Permanent stagnation would be an existential risk, but EAs tend to be even more worried about catastrophes from emerging technologies such as misaligned artificial intelligence or engineered pandemics. They might also be worried about a potential war between the US and China, or about extreme climate change. So in a sense they aren't as worried about progress stopping as they are about progress being mismanaged and having catastrophic unintended consequences. They therefore aim for 'differential progress' -- accelerating those kinds of technological or societal change that would safeguard us against these catastrophic risks, and slowing down whatever would expose us to greater risk. So concretely they are into things like "AI safety" or "biosecurity" -- e.g. making machine learning systems more transparent so we could tell if they were trying to deceive their users, or implementing better norms around the publication of dual-use bio research.

The single best book on this EA perspective is probably The Precipice by my FHI colleague Toby Ord.

Overall, EA and the progress studies perspective agree on a lot -- they're probably closer than either would be to any other popular 'worldview'. But overall EAs probably tend to think that human progress proponents are too indiscriminately optimistic about further progress, and too generically focused on keeping progress going. (Both because it might be risky and because EAs probably tend to be more "optimistic" that progress will accelerate anyway, most notably due to advances in AI.) Conversely, human progress proponents tend to think that EA is insufficiently focused on ensuring a future of significant economic growth, and that the risks imagined by EAs either aren't real or can't be prevented except by encouraging innovation in general.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T17:49:10.326Z · EA · GW

Relatedly, you might be interested in these two footnotes discussing how impressive it is that Sinatra et al. (2016) - the main paper we discuss in the doc - can predict the evolution of the Hirsch index (a citation measure) over a full career based on the Hirsch index after the first 20 or 50 papers:

Note that the evolution of the Hirsch index depends on two things: (i) citations to future papers and (ii) the evolution of citations to past papers. It seems easier to predict (ii) than (i), but we care more about (i). This raises the worry that predictions of the Hirsch index are a poor proxy of what we care about – predicting citations to future work – because successful predictions of the Hirsch index may work largely by predicting (ii) but not (i). This does make Sinatra and colleagues’ ability to predict the Hirsch index less impressive and useful, but the worry is attenuated by two observations: first, the internal validity of their model for predicting successful scientific careers is independently supported by its ability to predict Nobel prizes and other awards; second, they can predict the Hirsch index over a very long period, when it is increasingly dominated by future work rather than accumulating citations to past work.  

Acuna, Allesina, & Kording (2012) had previously proposed a simple linear model for predicting scientists’ Hirsch index. However, the validity of their model for the purpose of predicting the quality of future work is undermined more strongly by the worry explained in the previous footnote; in addition, the reported validity of their model is inflated by their heterogeneous sample that, unlike the sample analyzed by Sinatra et al. (2016), contains both early- and late-career scientists. (Both points were observed by Penner et al. 2013.)

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T17:44:22.966Z · EA · GW

Thanks! I agree with a lot of this.

I think the case of citations / scientific success is a bit subtle:

  • My guess is that the preferential attachment story applies most straightforwardly at the level of papers rather than scientists. E.g. I would expect that scientists who want to cite something on topic X will cite the most-cited paper on X rather than first looking for papers on X and then looking up the total citations of their authors.
  • I think the Sinatra et al. (2016) findings which we discuss in our relevant section push at least slightly against a story that says it's all just about "who was first in some niche". In particular, if preferential attachment at the level of scientists was a key driver, then I would expect authors who get lucky early in their career - i.e. publish a much-cited paper early - to get more total citations. In particular, citations to future papers by a fixed scientist should depend on citations to past papers by the same scientist. But that is not what Sinatra et al. find - they instead find that within the career of a fixed scientist the per-paper citations seem entirely random. 
    • Instead their model uses citations to estimate an 'intrinsic characteristic' that differs between scientists - what they call Q. 
      • (I don't think this is very strong evidence that such an intrinsic quality 'exists' because this is just how they choose the class of models they fit. Their model fits the data reasonably well, but we don't know if a different model with different bells and whistles wouldn't fit the data just as well or better. But note that, at least in my view, the idea that there are ability differences between scientists that correlate with citations looks likely on priors anyway, e.g. because of what we know about GMA/the 'positive manifold' of cognitive tasks or garden-variety impressions that some scientists just seem smarter than others.)
  • The International Maths Olympiad (IMO) paper seems like a clear example of our ability to measure an 'intrinsic characteristic' before we've seen the analog of citation counts. IMO participants are high school students, and the paper finds that even among people who participated in the IMO in the same year and got their PhD from the same department, IMO scores correlate with citations, awards, etc. Now, we might think that maybe maths is extreme in that success there depends unusually much on fluid intelligence or something like that, and I'm somewhat sympathetic to that point / think it's partly correct. But on priors I would find it very surprising if this phenomenon was completely idiosyncratic to maths. Like, I'd be willing to bet that scores at the International Physics Olympiad, International Biology Olympiad, etc., as well as simply GMA or high school grades or whatever, correlate with future citations in the respective fields.
    • The IMO example is particularly remarkable because it's in the extreme tail of performance. If we're not particularly interested in the tail, then I think some of the studies on more garden-variety predictors such as GMA or personality we cite in the relevant section give similar examples.
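To make the Q-model story a bit more concrete, here's a toy simulation in the spirit of Sinatra et al.'s model (my own sketch, not their actual estimation procedure; the parameter values are purely illustrative): each paper's citations are a scientist-specific factor Q times a random 'luck' draw, so total citations track Q across scientists, while within any one career the position of the best paper is random.

```python
import random
import statistics

random.seed(0)

def simulate_career(Q, n_papers, luck_sigma=1.0):
    """Per-paper citations c = Q * p: a scientist-specific Q times random 'luck' p."""
    return [Q * random.lognormvariate(0, luck_sigma) for _ in range(n_papers)]

# Two scientists with different intrinsic Q but the same luck distribution.
low_q = simulate_career(Q=1.0, n_papers=200)
high_q = simulate_career(Q=5.0, n_papers=200)

# Between scientists, Q drives total citations (ratio is roughly 5 in expectation).
print(sum(high_q) / sum(low_q))

# Within a scientist's career, the position of the best paper is uniform:
# once Q is fixed, per-paper success is pure luck.
best_positions = []
for _ in range(2000):
    papers = simulate_career(Q=1.0, n_papers=50)
    best_positions.append(papers.index(max(papers)))
print(statistics.mean(best_positions))  # near the career midpoint (~24.5) in expectation
```

Note how this reproduces both findings at once: early luck doesn't propagate to later papers (no preferential attachment at the scientist level), yet citation totals still differ systematically between scientists.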
Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T17:21:44.825Z · EA · GW

And yes, I totally agree that how well we can predict (rather than just the question whether predictability is zero or nonzero) is relevant in practice.

If the ex-post distribution is heavy-tailed, there are a bunch of subtle considerations here I'd love someone to tease out. For example, if you have a prediction method that is very good for the bottom 90% but biased toward 'typical' outcomes, i.e. the median, then you might be better off in expectation allocating by a lottery over the full population (because this gets you the mean, which for heavy-tailed distributions will be much higher than the median).
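Here's a minimal simulation of that mean-vs-median gap (a sketch under the assumption of a lognormal ex-post distribution; the sigma parameter is illustrative, not estimated from anything):

```python
import random
import statistics

random.seed(0)

# Hypothetical heavy-tailed ex-post impact distribution (lognormal, sigma=2).
population = [random.lognormvariate(0, 2) for _ in range(100_000)]

# A uniform lottery gets you the population mean per pick...
lottery_value = statistics.mean(population)

# ...while a predictor biased toward 'typical' outcomes gets you roughly the median.
typical_value = statistics.median(population)

# For a lognormal the mean/median ratio is e^(sigma^2 / 2) = e^2 ~ 7.4 in theory.
print(lottery_value / typical_value)
```

The more heavy-tailed the distribution (larger sigma here), the bigger the penalty for a median-biased selection method relative to the lottery.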

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T17:17:05.130Z · EA · GW

I think it's plausible that VCs aren't better than chance when choosing between a suitably restricted "population", i.e. investment opportunities that have passed some bar of "plausibility". 

I don't think it's plausible that they are no better than chance simpliciter. In that case I would expect to see a lot of VCs who cut costs by investing literally zero time into assessing investment opportunities, funding instead on a first-come-first-served or lottery basis.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T13:41:30.713Z · EA · GW

I'm suspicious you can do a good job of predicting ex ante outcomes. After all, that's what VCs would want to do and they have enormous resources. Their strategy is basically to pick as many plausible winners as they can fund.

I agree that looking at e.g. VC practices is relevant evidence. However, it seems to me that if VCs thought they couldn't predict anything, they would allocate their capital by a uniform lottery among all applicants, or something like that. I'm not aware of a VC adopting such a strategy (though it's possible I just haven't heard of it); to the extent that they can distinguish "plausible" from "implausible" winners, this does suggest some amount of ex-ante predictability. Similarly, my vague impression is that VCs and other investors often specialize by domain/sector, which suggests they think they can utilize their knowledge and network when making decisions ex ante.

Sure, predictability may be "low" in some sense, but I'm not sure we're saying anything that would commit us to denying this.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T13:34:54.757Z · EA · GW

I'm not sure exactly what follows from this. I'm a bit worried you're concentrating on the wrong metric - success - when it's outputs that are more important. Can you explain why you focus on outcomes?

I'm not sure I agree that outputs are more important. I think it depends a lot on the question or decision we're considering, which is why I highlighted a careful choice of metric as one of the key pieces of advice.

So e.g. if our goal is to set performance incentives (e.g. salaries), then it may be best to reward people for things that are under their control. E.g. pay people more if they work longer hours (inputs), or if there are fewer spelling mistakes in their report (cardinal output metric) or whatever. At other times, paying more attention to inputs or outputs rather than outcomes or things beyond the individual performer's control may be justified by considerations around e.g. fairness or equality.

All of these things are of course really important to get right within the EA community as well, whether we care about them instrumentally or intrinsically. There are a lot of tricky and messy questions here.

But if we can say anything general, then I think that especially in EA contexts we care more, or more often, about outcomes/success/impact on the world, and less about inputs and outputs, than usual. We want to maximize well-being, and from 'the point of view of the universe' it doesn't ultimately matter if someone is happy because someone else produced more outputs or because the same outputs had greater effects. Nor does it ultimately matter if impact differences are due to differences in talent, resource endowments, motivation, luck, or ...

Another way to see this is that often actors that care more about inputs or outputs do so because they don't internalize all the benefits from outcomes. But if a decision is motivated by impartial altruism, there is a sense in which there are no externalities.

Of course, we need to make all the usual caveats against 'naive consequentialism'. But I do think there is something important in this observation.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T13:13:27.966Z · EA · GW

Thanks for this comment!

I'm sympathetic to the point that we're lumping together quite different things under the vague label "performance", perhaps stretching it beyond its common use. That's why I said in bold that we're using a loose notion of performance. But it's possible it would have been better if I had spent more time coming up with better terminology.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T11:54:19.603Z · EA · GW

Thanks! Fixed in post and doc.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T11:25:45.745Z · EA · GW

Great, thank you! I do believe work by Woolley was what I had in mind.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T09:41:20.817Z · EA · GW

Thanks for clarifying!

FWIW I think I see the distinction between popularity and other qualities as less clear-cut than you seem to. For instance, I would expect that book sales and startup returns are also affected by how "good" in whatever other sense the book or startup product is. Conversely, I would guess that realistically Nobel Prizes and other scientific awards are also about popularity and not just about the quality of the scientific work by other standards. I'm happy to concede that, in some sense, book sales seem more affected by popularity than Nobel Prizes, but it seems a somewhat important insight to me that neither is "just about popularity" nor "just about achievement/talent/quality/whatever".

It's also not that clear to me whether there is an obviously more adequate standard of overall "goodness" here: how much joy the book brings readers? What literary critics would say about the book? I think the ultimate lesson here is that the choice of metric is really important, and depends a lot on what you want to know or decide, which is why "Carefully choose the underlying population and the metric for performance" is one of our key points of advice. I can see that saying something vague and general like "some people achieve more" and then giving examples of specific metrics pushes against this insight by suggesting that these are the metrics we should generally most care about. FWIW I still feel OK about our wording here since I feel like in an opening paragraph we need to balance nuance/detail and conciseness / getting the reader interested.

As an aside, my vague impression is that it's somewhat controversial to what extent successful teams have different qualities to successful individuals. In some sense this is of course true since there are team properties that don't even make sense for individuals. However, my memory is that for a while there was some more specific work in psychology that was allegedly identifying properties that predicted team success better than the individual abilities of its members, which then largely didn't replicate.

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T08:53:44.508Z · EA · GW

Thanks for these points!

My super quick take is that 1. definitely sounds right and important to me, and I think it would have been good if we had discussed this more in the doc.

I think 2. points to the super important question (which I think we've mentioned somewhere under Further research) how typical performance/output metrics relate to what we ultimately care about in EA contexts, i.e. positive impact on well-being. At first glance I'd guess that sometimes these metrics 'overstate' heavy-tailedness of EA impact (for e.g. the reasons you mentioned), but sometimes they might also 'understate' them. For instance, the metrics might not 'internalize' all the effects on the world (e.g. 'field building' effects from early-stage efforts), or for some EA situations the 'market' may be even more winner-takes-most than usual (e.g. for some AI alignment efforts it only matters if you can influence DeepMind), or the 'production function' might have higher returns to talent than usual (e.g. perhaps founding a nonprofit or contributing valuable research to preparadigmatic fields is "extra hard" in a way not captured by standard metrics when compared to easier cases).

Comment by Max_Daniel on How much does performance differ between people? · 2021-03-26T08:39:33.885Z · EA · GW

I think you're right that complexity at the very least isn't the only cause/explanation for these differences. 

E.g. Aguinis et al. (2016) find that, based on an analysis of a very large number of productivity data sets, the following properties make a heavy-tailed output distribution more likely:

  • Multiplicity of productivity,
  • Monopolistic productivity,
  • Job autonomy,
  • Job complexity,
  • No productivity ceiling (I guess your point is a special case of this: if the marginal cost of increasing output becomes too high too soon, there will effectively be a ceiling; but there can also e.g. be ceilings imposed by the output metric we use, such as when a manager gives a productivity rating on a 1-10 scale)

As we explain in the doc, I have some open questions about the statistical approach in that paper. So I currently don't take their analysis to be that much evidence that this is in fact right. However, these properties also sound right to me just based on priors and on theoretical considerations (such as the ones in our section on why we expect heavy-tailed ex-ante performance to be widespread).
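As an illustration of the 'multiplicity of productivity' point, here's a toy simulation (my own sketch, not from Aguinis et al.; the factor distribution is arbitrary): when total output is the product rather than the sum of many independent components, the output distribution becomes heavy-tailed even though each component is tame.

```python
import math
import random
import statistics

random.seed(0)

def additive_output(n_factors=20):
    # Output as a *sum* of independent components -> approximately Gaussian (CLT).
    return sum(random.uniform(0.5, 1.5) for _ in range(n_factors))

def multiplicative_output(n_factors=20):
    # Output as a *product* of independent components (the 'multiplicity'
    # property) -> approximately lognormal, i.e. heavy-tailed.
    return math.prod(random.uniform(0.5, 1.5) for _ in range(n_factors))

additive = sorted(additive_output() for _ in range(100_000))
multiplicative = sorted(multiplicative_output() for _ in range(100_000))

def top1_share(xs):
    """Share of total output produced by the top 1% of performers (xs sorted)."""
    k = len(xs) // 100
    return sum(xs[-k:]) / sum(xs)

print(top1_share(additive))        # close to 1% -- thin-tailed
print(top1_share(multiplicative))  # far above 1% -- heavy-tailed
```

The same uniform factors produce a near-Gaussian distribution when summed and a heavy-tailed one when multiplied, which is one theoretical reason to expect heavy tails wherever success requires many things to go right simultaneously.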

In the part you quoted, I wrote "less complex jobs" because the data I'm reporting is from a paper that explicitly distinguishes low-, medium-, and high-complexity jobs, and finds that only the first two types of job potentially have a Gaussian output distribution (this is Hunter et al. 1990). [TBC, I understand that the reader won't know this, and I do think my current wording is a bit sloppy/bad/will predictably lead to the valid pushback you made.]

[References in the doc linked in the OP.]