Max_Daniel's Shortform

post by Max_Daniel · 2019-12-13T11:17:10.883Z · EA · GW · 61 comments


Comments sorted by top scores.

comment by Max_Daniel · 2019-12-17T11:56:08.786Z · EA(p) · GW(p)

[Some of my high-level views on AI risk.]

[I wrote this for an application a couple of weeks ago, but thought I might as well dump it here in case someone was interested in my views. / It might sometimes be useful to be able to link to this.]

[In this post I generally state what I think before updating on other people’s views – i.e., what’s sometimes known as ‘impressions’ as opposed to ‘beliefs.’ [EA · GW]]


  • Transformative AI (TAI) – the prospect of AI having impacts at least as consequential as the Industrial Revolution – would plausibly (~40%) be our best lever for influencing the long-term future if it happened this century, which I consider to be unlikely (~20%) but worth betting on.
  • The value of TAI depends not just on the technological options available to individual actors, but also on the incentives governing the strategic interdependence between actors. Policy could affect both the amount and quality of technical safety research and the ‘rules of the game’ under which interactions between actors will play out.

Why I'm interested in TAI as a lever to improve the long-run future

I expect my perspective to be typical of someone who has become interested in TAI through their engagement with the effective altruism (EA) community. In particular,

  • My overarching interest is to make the lives of as many moral patients as possible go as well as possible, no matter where or when they live; and
  • I think that in the world we find ourselves in – it could have been otherwise –, this goal entails strong longtermism [EA · GW],​ i.e. the claim that “the primary determinant of the value of our actions today is how those actions affect the very long-term future.”

Less standard but not highly unusual (within EA) high-level views I hold more tentatively:

  • The indirect long-run impacts of our actions are extremely hard to predict and don’t ‘cancel out’ in expectation. In other words, I think that what ​Greaves (2016)​ calls ​complex cluelessness​ is a pervasive problem. In particular, evidence that an action will have desirable effects in the short term generally is ​not​ a decisive reason to believe that this action would be net positive overall, and neither will we be able to establish the latter through any other means.
  • Increasing the relative influence of longtermist actors is one of the very few strategies we have good reasons to consider net positive. Shaping TAI is a particularly high-leverage instance of this strategy, where the main mechanism is reaping an ‘epistemic rent’ from having anticipated TAI earlier than other actors. I take this line of support to be significantly more robust than any particular story on how TAI might pose a global catastrophic risk, including even broad operationalizations of the ‘value alignment problem.’

My empirical views on TAI

I think the strongest reasons to expect TAI this century are relatively outside-view-based (I talk about this century just because I expect that later developments are harder to predictably influence, not because I think a century is a particularly meaningful time horizon or because I think TAI would be less important later):

  • We’ve been able to automate an increasing number of tasks (with increasing performance and falling cost), and I’m not aware of a convincing argument for why we should be highly confident that this trend will stop short of full automation – i.e., AI systems being able to do all tasks more cost-effectively than humans – despite moderate scientific and economic incentives to find and publish one.
  • Independent types of weak evidence such as trend extrapolation and expert surveys suggest we might achieve full automation this century.
  • Incorporating full automation into macroeconomic growth models predicts – at least under some assumptions – a sustained higher rate of economic growth (e.g. Hanson 2001, Nordhaus 2015, Aghion et al. 2017), which arguably was the main driver of the welfare-relevant effects of the Industrial Revolution.
  • Accelerating growth this century is consistent with extrapolating historic growth rates, e.g. Hanson (2000[1998])​.

I think there are several reasons to be skeptical, but that the above succeeds in establishing a somewhat robust case for TAI this century not being wildly implausible.

My impression is that I’m less confident than the typical longtermist EA in various claims around TAI, such as:

  • uninterrupted technological progress would eventually result in TAI;
  • TAI will happen this century;
  • we can currently anticipate any specific way of positively shaping the impacts of TAI;
  • if the above three points were true, then shaping TAI would be the most cost-effective way of improving the long-term future.

My guess is this is due to different priors, and due to frequently having found extant specific arguments for TAI-related claims (including by staff at FHI and Open Phil) less convincing than I would have predicted. I still think that work on TAI is among the few best shots for current longtermists.

comment by Stefan_Schubert · 2019-12-17T13:30:04.599Z · EA(p) · GW(p)

Awesome post, Max, many thanks for this. I think it would be good if these difficult questions were discussed more on the forum by leading researchers like yourself.

I think you should post this as a normal post; it's far too good and important to be hidden away on the shortform.

comment by Jonas Vollmer · 2020-06-14T16:05:35.540Z · EA(p) · GW(p)

I second Stefan's suggestion to share this as a normal post – I realize I should have read your shortform much sooner.

comment by meerpirat · 2020-10-14T14:49:03.958Z · EA(p) · GW(p)

Thanks for putting your thoughts together, I only accidentally stumbled on this and I think it would be a great post, too.

I was really surprised about you giving ~20% for TAI this century, and am still curious about your reasoning, because it seems to diverge strongly from your peers'. Why do you find inside-view arguments less convincing? I've updated pretty strongly on the deep (reinforcement) learning successes of the last years, and on our growing computational- and algorithmic-level understanding of the human mind. I've found AI Impacts' collection of inside- and outside-view arguments against current AI leading to AGI fairly unconvincing; e.g., the listed "lacking capacities" seem to me (as someone following CogSci, ML and AI Safety related blogs) to be getting a lot of productive research attention.

comment by Pablo_Stafforini · 2019-12-17T14:32:20.197Z · EA(p) · GW(p)

[deleted because the question I asked turned out to be answered in the comment, upon careful reading]

comment by Max_Daniel · 2019-12-13T11:17:11.201Z · EA(p) · GW(p)

What's the right narrative about global poverty and progress? Link dump of a recent debate.

The two opposing views are:

(a) "New optimism:" [1] This is broadly the view that, over the last couple of hundred years, the world has been getting significantly better, and that's great. [2] In particular, extreme poverty has declined dramatically, and most other welfare-relevant indicators have improved a lot. Often, these effects are largely attributed to economic growth.

  • Proponents in this debate were originally Bill Gates, Steven Pinker, and Max Roser. But my loose impression is that the view is shared much more widely.
  • In particular, it seems to be the orthodox view in EA; cf. e.g. Muehlhauser listing one of Pinker's books in his My worldview in 5 books post, saying that "Almost everything has gotten dramatically better for humans over the past few centuries, likely substantially due to the spread and application of reason, science, and humanism."

(b) Hickel's critique: Anthropologist Jason Hickel has criticized new optimism on two grounds:

  • 1. Hickel has questioned the validity of some of the core data used by new optimists, claiming e.g. that "real data on poverty has only been collected since 1981. Anything before that is extremely sketchy, and to go back as far as 1820 is meaningless."
  • 2. Hickel prefers to look at different indicators than the new optimists. For example, he has argued for different operationalizations of extreme poverty or inequality.

Link dump (not necessarily comprehensive)

If you only read two things, I'd recommend (1) Hasell's and Roser's article explaining where the data on historic poverty comes from and (2) the take by economic historian Branko Milanovic.

By Hickel (i.e. against "new optimism"):

By "new optimists":

Commentary by others:

My view

  • I'm largely unpersuaded by Hickel's charge that historic poverty data is invalid. Sure, it's way less good than contemporary data. But based on Hasell's and Roser's article, my impression is that the data is better than I would have thought, and its orthodox analysis and interpretation more sophisticated than I would have thought. I would be surprised if access to better data would qualitatively change the "new optimist" conclusion.
  • I think there is room for debate over which indicators to use, and that Hickel makes some interesting points here. I find it regrettable that the debate around this seems so adversarial.
  • Still, my sense is that there is an important, true, and widely underappreciated (particularly by people on the left, including my past self) core of the "new optimist" story. I'd expect looking at other indicators could qualify that story, or make it less simplistic, point to important exceptions etc. - but I'd probably consider a choice of indicators that painted an overall pessimistic picture as quite misleading and missing something important.
  • On the other hand, I would quite strongly want to resist the conclusion that everything in this debate is totally settled, and that the new optimists are clearly right about everything, in the same way in which orthodox climate science is right about climate change being anthropogenic, or orthodox medicine is right about homeopathy not being better than placebo. But I think the key uncertainties are not in historic poverty data, but in our understanding of wellbeing and its relationship to environmental factors. Some examples of why I think it's more complicated:
    • The Easterlin paradox
    • The unintuitive relationship between (i) subjective well-being in the sense of the momentary affective valence of our experience on one hand and (ii) reported life satisfaction. See e.g. Kahneman's work on the "experiencing self" vs. "remembering self".
    • On many views, the total value of the world is very sensitive to population ethics, which is notoriously counterintuitive. In particular, on many plausible views, the development of the total welfare of the world's human population is dominated by its increasing population size.
  • Another key uncertainty is the implications of some of the discussed historic trends for the value of the world going forward, about which I think we're largely clueless. For example, what are the effects of changing inequality on the long-term future?

[1] It's not clear to me if "new optimism" is actually new. I'm using Hickel's label just because it's short and it's being used in this debate anyway, not to endorse Hickel's views or make any other claim.

[2] There is an obvious problem with new optimism, which is that it's anthropocentric. In fact, on many plausible views, the total axiological value of the world at any time in the recent past may be dominated by the aggregate wellbeing of nonhuman animals; even more counterintuitively, it may well be dominated by things like the change in the total population size of invertebrates. But this debate is about human wellbeing, so I'll ignore this problem.

comment by Jonas Vollmer · 2020-06-14T16:15:56.467Z · EA(p) · GW(p)

In addition to the examples you mention, the world has become much more unequal over the past centuries, and I wonder how that impacts welfare. Relatedly, I wonder to what degree there is more loneliness and less purpose and belonging than in previous times, and how that impacts welfare (and whether it relates to the Easterlin paradox). EAs don't seem to discuss these aspects of welfare often. (Somewhat related books: Angus Deaton's The Great Escape and Junger's Tribe.)

comment by Denise_Melchin · 2020-06-14T18:00:55.192Z · EA(p) · GW(p)

(Have not read through Max' link dump yet, which seems very interesting, I also feel some skepticism of the 'new optimism' worldview.)

One major disappointment in Pinker's book as well as in related writings for me has been that they do little to acknowledge that how much progress you think the world has seen depends a lot on your values. To name some examples, not everyone views the legalization of gay marriage and easier access to abortion as progress, and not everyone thinks that having plentiful access to consumer goods is a good thing.

I would be very interested in an analysis of 'progress' in light of the different moral foundations discussed by Haidt. I have the impression that Pinker exclusively focuses on the 'care/harm' foundation, while completely ignoring others like Sanctity/purity or Authority/respect and this might be where some part of the disconnect between the 'New optimists' and opponents is coming from.

comment by Jonas Vollmer · 2020-06-15T08:23:27.892Z · EA(p) · GW(p)

Your point reminds me of the "history is written by the winners" adage – presumably, most civilizations would look back and think of their history as one of progress, because they view their current values most favorably.

Perhaps this is one of the paths that would eventually contribute to a "desired dystopia" outcome, as outlined in Ord's book: we fail to realize that our social structure is flawed and leads to suffering in a systematic manner that's difficult to change.

(Also related: )

comment by willbradshaw · 2020-07-14T13:06:09.605Z · EA(p) · GW(p)

I have relatively little exposure to Hickel, save for reading his Guardian piece and a small part of the dialogue that followed from it, but I don't get the impression he's coming from a position of putting more weight on Sanctity/purity or Authority/respect; in general, I'd guess that few people in left-wing social-science academia are big on those sorts of moral foundations, except indirectly via moral/cultural relativism.

Taking Haidt's moral foundations theory as read for the moment, I'd guess that the Fairness foundation is doing a lot of the work in this disagreement. In general, leftists and liberals seem to differ a lot in what they consider culpable harm, and Fairness/exploitation seems like a big part of that.

comment by Aidan O'Gara · 2020-07-14T11:00:04.825Z · EA(p) · GW(p)

Very interesting writeup, I wasn't aware of Hickel's critique but it seems reasonable.

Do you think it matters who's right? I suppose it's important to know whether poverty is increasing or decreasing if you want to evaluate the consequences of historical policies or events, and even for general interest. But does it have any specific bearing on what we should do going forwards?

comment by willbradshaw · 2020-07-14T13:12:16.640Z · EA(p) · GW(p)

Do you think it matters who's right?

I think it matters quite a lot when it comes to assessing where to go from here: in particular, how cautious and conservative to be, and how favourable towards untested radical change.

If things have gotten way better and are likely to continue to get way better in the foreseeable future, then we should probably broadly stick with what we're doing – some tinkering around the edges to fix obvious abuses, but no root-and-branch restructuring unless something goes obviously and profoundly wrong.

Whereas if things are failing to get better, or are actively getting worse, then it might be worth taking big risks in order to get out of the hole.

I've often had conversations with people to my left where they seem way too willing to smash stuff in the process of getting to deep systemic change, which is potentially sensible if you think we're in a very bad place and getting worse but madness if you think we're in an extremely unusually good place and getting better.

comment by Max_Daniel · 2020-07-14T12:46:29.251Z · EA(p) · GW(p)

Thanks, this is a good question. I don't think it has specific bearing on future actions, but does have some broader relevance. For example, longtermists have sometimes discussed the total value of the long-term future [EA · GW]: in this context, we may be interested in whether things have been getting better or worse in order to extrapolate this trend forward.

(Though this is not why I wrote this post. - That was more because I happened to find it interesting personally.)

Of course, this trend extrapolation would only be one among many considerations. In addition, ideally we'd want a trend on the world's total value, not a trend on just poverty. So e.g. the anthropocentrism would be a problem here.

comment by lucy.ea8 · 2019-12-13T22:43:51.588Z · EA(p) · GW(p)

I agree that the world has gotten much better than it was.

There are two important reasons for this, the other improvements that we see mostly follow from them.

  1. Energy consumption (is wealth): Energy consumption per person has increased over the last 500 years, and that increased consumption translates into welfare.
  2. Education (knowledge): The amount of knowledge that we as humanity possess has increased dramatically, and that knowledge is widely accessible. 75% of kids finish 9th grade, 12.5% finish only 6th grade, and 4.65% finish less than 6th grade; unfortunately, around 7–8% of kids have never gone to school. Increases in education translate into increases in health and wealth (actually energy consumption), more so in countries with market economies than in non-market economies.

The various -isms (capitalism, socialism, communism, neoliberalism, colonialism, fascism) have very little to do with human development, and in fact have been very negative for human development. (I am skipping theory about how the -isms are supposed to work, and jumping to the actual effects).

comment by lucy.ea8 · 2019-12-14T23:24:32.784Z · EA(p) · GW(p)

"Almost everything has gotten dramatically better for humans over the past few centuries, likely substantially due to the spread and application of reason, science, and humanism."

Pinker has his critics, a sample at

The improvements in knowledge are secondary to the tapping of fossil fuels and the resulting energy consumption, which eventually caused the demographic transition.

Once the demographic transition happened, there were no longer young men willing to fight foreign wars, and violence declined; i.e., outright occupation (colonialism) gave way to neocolonialism, and that is the world we find ourselves in today.

I find it hard to take any claims of "reason" and "humanism" seriously while the world warms and per capita consumption of fossil fuels is 10 times higher in the USA than in "developing" countries. Countries of the global south still have easily solvable problems, like basic education and health, that are underfunded.

Richard A. Easterlin has a good understanding when he asks "Why Isn't the Whole World Developed?"

comment by lucy.ea8 · 2019-12-15T16:31:38.427Z · EA(p) · GW(p)

When downvoting please explain why

comment by Aaron Gertler (aarongertler) · 2020-01-15T00:29:02.285Z · EA(p) · GW(p)

I just now saw this post, but I would guess that some readers wanted more justification for the use of the term "secondary", which implies that you're assigning value to both of (improvements in knowledge) and (tapping of fossil fuels) and saying that the negative value of the latter outweighs the value of the former. I'd guess that readers were curious how you weighed these things against each other.

I'll also note that Pinker makes no claim that the world is perfect or has no problems, and that claiming that "reason" or "humanism" has made the world better does not entail that they've solved all the world's problems or even that the world is improving in all important ways. You seem to be making different claims than Pinker does about the meaning of those terms, but you don't explain how you define them differently. (I could be wrong about this, of course; that's just what I picked up from a quick reading of the comment.)

comment by lucy.ea8 · 2020-01-15T19:40:03.141Z · EA(p) · GW(p)

Thanks Aaron for your response. I am assigning positive value to both improvements in knowledge and increased energy use (via the tapping of fossil fuel energy). I am not weighing them one against the other. I am saying that without the increased energy from fossil fuels we would still be agricultural societies, with repeated rises and falls of empires. The Indus Valley civilization, the ancient Greeks, and the Maya all repeatedly crashed. At the peak of those civilizations I am sure art, culture and knowledge flourished. Eventually humans outran their resources and crashed; the crash simplified art forms and culture, and knowledge was also lost.

The driver is energy, and the result is increased art, culture, knowledge and peace too. Reason and humanism have very little to do with why our world is peaceful today (in the sense that outright murder, slavery, colonialism are no longer accepted).

I read the book by Pinker and his emphasis on Western thought and enlightenment was off putting. We are all human, there are no Western values or Eastern values.

Hans Rosling puts it beautifully “There is no such thing as Swedish values. Those are modern values”

comment by Max_Daniel · 2020-06-26T13:41:01.297Z · EA(p) · GW(p)

[See this research proposal [EA(p) · GW(p)] for context. I'd appreciate pointers to other material.]

[WIP, not comprehensive] Collection of existing material on 'impact being heavy-tailed'

Conceptual foundations

  • Newman (2005) provides a good introduction to power laws, and reviews several mechanisms generating them, including: combinations of exponentials; inverses of quantities; random walks; the Yule process [also known as preferential attachment]; phase transitions and critical phenomena; self-organized criticality.
  • Terence Tao, in Benford’s law, Zipf’s law, and the Pareto distribution, offers a partial explanation for why heavy-tailed distributions are so common empirically.
  • Clauset et al. (2009[2007]) explain why it is very difficult to empirically distinguish power laws from other heavy-tailed distributions (e.g. log-normal). In particular, seeing a roughly straight line in a log-log plot is not sufficient to identify a power law, despite such inferences being popular in the literature. Referring to power-law claims by others, they find that “the distributions for birds, books, cities, religions, wars, citations, papers, proteins, and terrorism are plausible power laws, but they are also plausible log-normals and stretched exponentials.” (p. 26) [NB on citations Golosovsky & Solomon, 2012, claim to settle the question in favor of a power law - and in fact an extreme tail even heavier than that -, and they are clearly aware of the issues pointed out by Clauset et al. On the other hand, Brzezinski, 2014, using a larger data set, seems to confirm the results of Clauset et al., finding that a pure power law is the single best fit only for physics & astronomy papers, while in all other disciplines we either can't empirically distinguish between several heavy-tailed distributions or a power law can be statistically rejected.]
  • Lyon (2014) argues that, contrary to what is commonly believed, the Central Limit Theorem cannot explain why normal distributions are so common. (The critique also applies to the analogous explanation for the log-normal distribution, an example of a heavy-tailed distribution.) Instead, an appeal to the principle of maximum entropy is suggested.
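To see why Clauset et al.'s warning bites, here is a small numpy sketch (my own illustration, not from any of the papers above): it draws samples from a log-normal distribution – which is not a power law – and shows that the empirical CCDF nonetheless traces a nearly straight line on a log-log plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples from a log-normal distribution -- NOT a power law.
samples = np.sort(rng.lognormal(mean=0.0, sigma=2.0, size=100_000))

# Empirical CCDF: P[X > x_i] for each sorted sample.
n = len(samples)
ccdf = 1.0 - np.arange(1, n + 1) / n

# Restrict to the upper tail and drop the final point (where CCDF = 0).
mask = (samples > np.median(samples)) & (ccdf > 0)
log_x, log_p = np.log(samples[mask]), np.log(ccdf[mask])

# Correlation of the log-log points measures how "straight" the plot looks.
r = np.corrcoef(log_x, log_p)[0, 1]
print(f"r^2 on log-log plot: {r**2:.3f}")  # high, despite there being no power-law tail
```

The point is only qualitative: as Clauset et al. argue, distinguishing a power law from a log-normal requires formal model comparison, not eyeballing a log-log plot.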

Impact in general / cause-agnostic

EA community building

Global health


  • Will MacAskill, in an interview by Lynette Bye [EA · GW], on the distribution of impact across work time for a fixed individual: "Maybe, let's say the first three hours are like two thirds the value of the whole eight-hour day. And then, especially if I'm working six days a week, I'm not convinced the difference between eight and ten hours is actually adding anything in the long term."
comment by Linch · 2020-06-26T23:37:38.959Z · EA(p) · GW(p)

I did some initial research/thinking on this before the pandemic came and distracted me completely. Here's a very broad outline that might be helpful.

comment by Max_Daniel · 2020-06-27T08:07:29.089Z · EA(p) · GW(p)

Great, thank you!

I saw that you asked Howie for input - are there other people you think it would be good to talk to on this topic?

comment by Linch · 2020-06-29T08:23:54.890Z · EA(p) · GW(p)

You're probably aware of this, but Anders Sandberg has done some thinking about this. Also presumably David Roodman based on his public writings (though I have not contacted him myself).

More broadly, I'm guessing that anybody who either you've referenced above, or who I've linked in my doc, would be helpful, though of course many of them are very busy.

comment by Max_Daniel · 2020-07-07T17:28:55.317Z · EA(p) · GW(p)

[Mathematical definitions of heavy-tailedness. Currently mostly notes to myself - I might turn these into a more accessible post in the future. None of this is original, and might indeed be routine for a maths undergraduate specializing in statistics.]

There are different definitions of when a probability distribution is said to have a heavy tail, and several closely related terms. They are not extensionally equivalent. I.e. there are distributions that are heavy-tailed according to some, but not all common definitions; this is for example true for the log-normal distribution.

Here I'll collect all definitions I encounter, and what I know about how they relate to each other.

I don't think the differences matter for most EA purposes, where the weakest definition that includes e.g. log-normals seems safe to use (except maybe #0 below, which might be too weak). I'm mainly collecting the definitions because I'm curious and because I think they can be an avoidable source of confusion for someone trying to understand discussions involving heavy-tailedness. (The differences might matter for more technical purposes, e.g. when deciding which statistical method to use to analyze certain data.)

There is also a less interesting way in which definitions can differ: a distribution can have a heavy right tail, a heavy left tail, or both. Some definitions thus come in three variants. I'm for now going to ignore this, stating only one variant per definition.

List of definitions

X will always denote a random variable.

0. X is leptokurtic (or super-Gaussian) iff its kurtosis is strictly larger than 3 (which is the kurtosis of e.g. all normal distributions), i.e. µ_4/σ^4 > 3, where µ_4 = E[(X - E[X])^4] is the fourth central moment and σ is the standard deviation.

1. X has a heavy right tail iff the moment-generating function of X is infinite at all t > 0.

2. X is heavy-tailed iff it has an infinite nth moment for some n.

3. X is heavy-tailed iff it has infinite variance (i.e. infinite 2nd central moment).

4. X has a long right tail iff for all real numbers t the conditional probability P[X > x + t | X > x] converges to 1 as x goes to infinity.

4b. X has a heavy right tail iff there is a real number x_0 such that the conditional mean exceedance (CME) E[X - x | X > x] is a strictly increasing function of x for x > x_0. (This is a definition by Bryson, 1974, who may have coined the term 'heavy-tailed' and who shows that distributions with constant CME are precisely the exponential distributions.)

5. X is subexponential (or fulfills the catastrophe principle) iff for all n ≥ 1 and i.i.d. random variables X_1, ..., X_n with the same distribution as X, the quotient of probabilities P[X_1 + ... + X_n > x] / P[max(X_1, ..., X_n) > x] converges to 1 as x goes to infinity.

6. X has a regularly varying right tail with tail index 0 < α ≤ 2 iff there is a slowly varying function L: (0,+∞) → (0,+∞) such that for all x > 0 we have P[X > x] = x^(-α) * L(x). (L is slowly varying iff, for all a > 0, the quotient L(ax)/L(x) converges to 1 as x goes to infinity.)
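Definition 5's 'catastrophe principle' says that, for a subexponential distribution, a large sum is overwhelmingly likely to be driven by a single huge summand. A quick Monte Carlo sketch of this contrast (my own illustration; the α = 1.5 tail index, the 0.99-quantile threshold, and the sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)
n_trials, k = 200_000, 2  # sums of k i.i.d. draws

def tail_ratio(draws: np.ndarray) -> float:
    """Estimate P[max > x] / P[sum > x] at a high threshold x."""
    sums, maxes = draws.sum(axis=1), draws.max(axis=1)
    x = np.quantile(sums, 0.99)  # a far-out threshold for the sum
    return (maxes > x).mean() / (sums > x).mean()

# Classical Pareto with tail index alpha = 1.5 (heavy-tailed, subexponential);
# numpy's pareto() is the Lomax variant, so we shift by +1.
pareto = rng.pareto(1.5, size=(n_trials, k)) + 1.0
# Exponential with rate 1 (light-tailed).
expo = rng.exponential(1.0, size=(n_trials, k))

r_pareto, r_expo = tail_ratio(pareto), tail_ratio(expo)
print(f"Pareto:      {r_pareto:.2f}")  # close to 1: one huge draw dominates the sum
print(f"Exponential: {r_expo:.2f}")    # well below 1
```

In the exponential case a large sum typically requires both draws to be moderately large, so the ratio stays well below 1; in the Pareto case a single extreme draw does nearly all the work.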

Relationships between definitions

(Note that even for those I state without caveats I haven't convinced myself of a proof in detail.)

I'll use #0 to refer to the clause on the right hand side of the "iff" statement in definition 0, and so on.

(For some of these one might have to use the suitable versions of heavy right tail / left tail etc. - e.g. perhaps #1 needs to be replaced with "heavy right and left tail" or "heavy right or left tail" etc.)

  • I suspect that #0 is the weakest condition, i.e. that all other definitions imply that X is super-Gaussian.
  • I suspect that #6 is the strongest condition, i.e. implies all others.
  • I think that: #3 => #2 => #1 and #5 => #4 => #1 (where '=>' denotes implication).

Why I think that:

  • #0 weakest: Heuristically, many other definitions state or imply that some higher moments don't exist, or are at least "close" to such a condition (e.g. #1). By contrast, #0 merely requires that a certain moment is larger than for the normal distribution. Also, the exponential distribution is super-Gaussian but not usually considered to be heavy-tailed - in fact, "heavy-tailed" is sometimes loosely explained to mean "having heavier tails than an exponential distribution".
  • #6 strongest: The condition basically says that the distribution behaves like a Pareto distribution (or "power law") as we look further down the tail. And for Pareto distributions with α ≤ 2 it's well known and easy to see that the variance doesn't exist, i.e. #3 holds. Similarly, I've seen power laws being cited as examples of distributions fulfilling the catastrophe principle, i.e. #5.
  • #3 => #2 is obvious.
  • #2 => #1: A statement very close to the contrapositive is well known: if the moment-generating function exists in an open neighborhood around some value, then the nth moments about that value are given by the nth derivative of the moment-generating function at that value. (I'm not sure whether there are weird cases where the moment-generating function exists at some points but on no open interval.)
  • #5 => #4 and #4 => #1 are stated on Wikipedia.
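As a numerical sanity check on the claim that #0 is the weakest condition (my own illustration): the exponential distribution is leptokurtic per #0 – its kurtosis is 9 – even though all its moments are finite and its moment-generating function is finite for t < 1, so it fails the stronger definitions.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(1.0, size=1_000_000)

# Sample kurtosis: fourth central moment over squared variance.
mu4 = np.mean((x - x.mean()) ** 4)
kurtosis = mu4 / x.var() ** 2
print(f"kurtosis ≈ {kurtosis:.1f}")  # population value is 9, well above the Gaussian 3
```

(That E[e^(tX)] = 1/(1 − t) for t < 1 is a standard fact about the exponential distribution, and its nth moment is n!, so it is not heavy-tailed under definitions #1–#3.)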
comment by Inda · 2020-06-26T22:47:28.561Z · EA(p) · GW(p)

This is a good link-list. It seems undiscoverable here, though. I think it's worth thinking about how to make such lists discoverable. Making it a top-level post seems an obvious improvement.

comment by Max_Daniel · 2020-06-27T08:05:36.443Z · EA(p) · GW(p)

Thanks for the suggestion. I plan to make this list more discoverable once I feel like it's reasonably complete, e.g. by turning it into its own top-level post or appending it to a top-level post writeup of my research on this topic.

comment by Max_Daniel · 2020-02-19T10:31:10.026Z · EA(p) · GW(p)

[On ]

  • [ETA: After having talked to more people, it now seems to me that disagreeing on this point more often explains different reactions than I thought it would. I'm also now less confident that my impression that there wasn't bad faith from the start is correct, though I think I still somewhat disagree with many EAs on this. In particular, I've also seen plenty of non-EA people who don't plausibly have a "protect my family" reaction say the piece felt like a failed attempt to justify a negative bottom line that was determined in advance.] (Most of the following doesn't apply in cases where someone is acting in bad faith and is determined to screw you over. And in fact I've seen the opposing failure mode of people assuming good faith for too long. But I don't think this is a case of bad faith.)
  • I've seen some EAs react pretty negatively or angrily to that piece. (Tbc, I've also seen different reactions.) Some have described the article as a "hit piece".
  • I don't think it qualifies as a hit piece. More like a piece that's independent/pseudo-neutral/ambiguous and tries to stick to dry facts/observations, but in some places provides a distorted picture by failing to be charitable, arguably missing the point, or being one-sided and selective in the observations it reports.
  • I still think that reporting like this is net good, and that the world would be better if there was more of it at the margin, even if it had flaws as severe as this piece's. (Tbc, I think there would have been a plausibly realistic/achievable version of that article that would have been better, and that there is fair criticism one can direct at it.)
  • To put it bluntly, I don't believe that having even maximally well-intentioned and intelligent people at key institutions is sufficient for achieving a good outcome for the world. I find it extremely hard to have faith in a setup that doesn't involve a legible system/structure with things like division of labor, checks and balances, procedural guarantees, healthy competition, and independent scrutiny of key actors. I don't know if the ideal system for providing such outside scrutiny will look even remotely like today's press, but currently it's one of the few things in this vein that we have for nonprofits, and Karen Hao's article is an (albeit flawed) example of it.
  • Whether this specific article was net good or not seems pretty debatable. I definitely see reasons to think it'll have bad consequences, e.g. it might crowd out better reporting, might provide bad incentives by punishing orgs for trying to do good things, ... I'm less wedded to a prediction of this specific article's impact than to the broader frame for interpreting and reacting to it.
  • I find something about the very negative reactions I've seen worrying. I of course cannot know what motivated them, but some read like the way I'd expect someone to react who's personally hurt because they judge a situation as being misunderstood, feels like they need to defend themselves, or feels like they need to rally to protect their family. I can relate to misunderstandings being a painful experience, and have sympathy for it. But I also think that if you're OpenAI, or "the EA community", or anyone aiming to change the world, then misunderstandings are part of the game, and any misunderstanding involves at least two sides. The reactions I'd like to see would try to understand what has happened and engage constructively with how to productively manage the many communication and other challenges involved in trying to do something that's good for everyone without being able to fully explain your plans to most people. (An operationalization: If you think this article was bad, then ideally the hypothesis "it would be good if we had better reporting" would enter your mind as readily as the hypothesis "it would be good if OpenAI's comms team and leadership had done a better job".)
comment by Max_Daniel · 2020-02-14T18:50:21.982Z · EA(p) · GW(p)

[Is longtermism bottlenecked by "great people"?]

Someone very influential in EA recently claimed in conversation with me that there are many tasks X such that (i) we currently don't have anyone in the EA community who can do X, (ii) the bottleneck for this isn't credentials or experience or knowledge but person-internal talent, and (iii) it would be very valuable (specifically from a longtermist point of view) if we could do X. And that therefore what we most need in EA are more "great people".

I find this extremely dubious. (In fact, it seems so crazy to me that it seems more likely than not that I significantly misunderstood the person who I think made these claims.) The first claim is of course vacuously true if, for X, we choose some ~impossible task such as "experience a utility-monster amount of pleasure" or "come up with a blueprint for how to build safe AGI that is convincing to benign actors able to execute it". But of course more great people don't help with solving impossible tasks.

Given the size and talent distribution of the EA community my guess is that for most apparent X, the issue either is that (a) X is ~impossible, or (b) there are people in EA who could do X, but the relevant actors cannot identify them, or (c) acquiring the ability to do X is costly (e.g. perhaps you need time to acquire domain-specific expertise), even for maximally talented "great people", and the relevant actors either are unable to help pay that cost (e.g. by training people themselves, or giving them the resources to allow them to get training elsewhere) or make a mistake by not doing so.

My best guess for the genesis of the "we need more great people" perspective: Suppose I talk a lot to people at an organization that thinks there's a decent chance we'll develop transformative AI soon but that it will go badly, and that as a consequence tries to grow as fast as possible to pursue various ambitious activities which they think reduce that risk. If these activities are scalable projects with short feedback loops on some intermediate metrics (e.g. running some super-large-scale machine learning experiments), then I expect I would hear a lot of claims like "we really need someone who can do X". I think it's just a general property of a certain kind of fast-growing organization that's doing practical things in the world that everything constantly seems like it's on fire. But I would also expect that, if I poked a bit at these claims, it would usually turn out that X is something like "contribute to this software project at the pace and quality level of our best engineers, w/o requiring any management time" or "convince some investors to give us much more money, but w/o anyone spending any time transferring relevant knowledge". If you see that things break because X isn't done, even though something like X seems doable in principle (perhaps you see others do it), it's tempting to think that what you need is more "great people" who can do X. After all, people generally are the sort of stuff that does things, and maybe you've actually seen some people do X. But it still doesn't follow that in your situation "great people" are the bottleneck ...

Curious if anyone has examples of tasks X for which the original claims seem in fact true. That's probably the easiest way to convince me that I'm wrong.

comment by Buck · 2020-02-21T05:36:33.990Z · EA(p) · GW(p)

I'm not quite sure how high your bar is for "experience", but many of the tasks that I'm most enthusiastic about in EA are ones which could plausibly be done by someone in their early 20s who eg just graduated university. Various tasks of this type:

  • Work at MIRI on various programming tasks which require being really smart and good at math and programming and able to work with type theory and Haskell. Eg we recently hired Seraphina Nix to do this right out of college. There are other people who are recent college graduates who we offered this job to who didn't accept. These people are unusually good programmers for their age, but they're not unique. I'm more enthusiastic about hiring older and more experienced people, but that's not a hard requirement. We could probably hire several more of these people before we became bottlenecked on management capacity.
  • Generalist AI safety research that Evan Hubinger does--he led the writing of "Risks from Learned Optimization" during a summer internship at MIRI; before that internship he hadn't had much contact with the AI safety community in person (though he'd read stuff online).
    • Richard Ngo is another young AI safety researcher doing lots of great self-directed stuff; I don't think he consumed an enormous amount of outside resources while becoming good at thinking about this stuff.
  • I think that there are inexperienced people who could do really helpful work with me on EA movement building; to be good at this you need to have read a lot about EA and be friendly and know how to talk to lots of people.

My guess is that EA does not have a lot of unidentified people who are as good at these things as the people I've identified.

I think that the "EA doesn't have enough great people" problem feels more important to me than the "EA has trouble using the people we have" problem.

comment by Max_Daniel · 2020-02-21T12:35:19.747Z · EA(p) · GW(p)

Thanks, very interesting!

I agree the examples you gave could be done by a recent graduate. (Though my guess is the community building stuff would benefit from some kinds of additional experience that train relevant project management and people skills.)

I suspect our impressions differ in two ways:

1. My guess is I consider the activities you mentioned less valuable than you do. Probably the difference is largest for programming at MIRI and smallest for Hubinger-style AI safety research. (This would probably be a bigger discussion.)

2. Independent of this, my guess would be that EA does have a decent number of unidentified people who would be about as good as the people you've identified. E.g., I can think of ~5 people off the top of my head who I think might be great at one of the things you listed, and if I had your view on their value I'd probably think they should stop doing what they're doing now and switch to trying one of these things. And I suspect if I thought hard about it, I could come up with 5-10 more people - and then there is the large number of people neither of us has any information about.

Two other thoughts I had in response:

  • It might be quite relevant if "great people" refers only to talent or also to beliefs and values/preferences. E.g. my guess is that there are several people who could be great at functional programming who either don't want to work for MIRI, or don't believe that this would be valuable. (This includes e.g. myself.) If to count as "great person" you need to have the right beliefs and preferences, I think your claim that "EA needs more great people" becomes stronger. But I think the practical implications would differ from the "greatness is only about talent" version, which is the one I had in mind in the OP.
  • One way to make the question more precise: At the margin, is it more valuable (a) to try to add high-potential people to the pool of EAs or (b) change the environment (e.g. coordination, incentives, ...) to increase the expected value of activities by people in the current pool. With this operationalization, I might actually agree that the highest-value activities of type (a) are better than the ones of type (b), at least if the goal is finding programmers for MIRI and maybe for community building. (I'd still think that this would be because, while there are sufficiently talented people in EA, they don't want to do this, and it's hard to change beliefs/preferences and easier to get new smart people excited about EA. - Not because the community literally doesn't have anyone with a sufficient level of innate talent. Of course, this probably wasn't the claim the person I originally talked to was making.)
comment by Buck · 2020-02-23T01:00:51.794Z · EA(p) · GW(p)
My guess is I consider the activities you mentioned less valuable than you do. Probably the difference is largest for programming at MIRI and smallest for Hubinger-style AI safety research. (This would probably be a bigger discussion.)

I don't think that peculiarities of what kinds of EA work we're most enthusiastic about lead to much of the disagreement. When I imagine myself taking on various different people's views about what work would be most helpful, most of the time I end up thinking that valuable contributions could be made to that work by sufficiently talented undergrads.

Independent of this, my guess would be that EA does have a decent number of unidentified people who would be about as good as people you've identified. E.g., I can think of ~5 people off the top of my head of whom I think they might be great at one of the things you listed, and if I had your view on their value I'd probably think they should stop doing what they're doing now and switch to trying one of these things. And I suspect if I thought hard about it, I could come up with 5-10 more people - and then there is the large number of people neither of us has any information about.

I am pretty skeptical of this. Eg I suspect that people like Evan (sorry Evan if you're reading this for using you as a running example) are extremely unlikely to remain unidentified, because one of the things that they do is think about things in their own time and put the results online. Could you name a profile of such a person, and which of the types of work I named you think they'd maybe be as good at as the people I named?

It might be quite relevant if "great people" refers only to talent or also to beliefs and values/preferences

I am not intending to include beliefs and preferences in my definition of "great person", except for preferences/beliefs like being not very altruistic, which I do count.

E.g. my guess is that there are several people who could be great at functional programming who either don't want to work for MIRI, or don't believe that this would be valuable. (This includes e.g. myself.)

I think my definition of great might be a higher bar than yours, based on the proportion of people who I think meet it? (To be clear I have no idea how good you'd be at programming for MIRI because I barely know you, and so I'm just talking about priors rather than specific guesses about you.)


For what it's worth, I think that you're not credulous enough of the possibility that the person you talked to actually disagreed with you--I think you might be doing that thing whose name I forget, where you steelman someone into saying the thing you think instead of the thing they think.

comment by Max_Daniel · 2020-02-26T18:00:06.552Z · EA(p) · GW(p)
I don't think that peculiarities of what kinds of EA work we're most enthusiastic about lead to much of the disagreement. When I imagine myself taking on various different people's views about what work would be most helpful, most of the time I end up thinking that valuable contributions could be made to that work by sufficiently talented undergrads.

I agree we have important disagreements other than what kinds of EA work we're most enthusiastic about. While not of major relevance for the original issue, I'd still note that I'm surprised by what you say about various other people's view on EA, and I suspect it might not be true for me: while I agree there are some highly-valuable tasks that could be done by recent undergrads, I'd guess that if I made a list of the most valuable possible contributions then a majority of the entries would require someone to have a lot of AI-weighted generic influence/power (e.g. the kind of influence over AI a senior government member responsible for tech policy has, or a senior manager in a lab that could plausibly develop AGI), and that because of the way relevant existing institutions are structured this would usually require a significant amount of seniority. (It's possible for some smart undergrads to embark on a path culminating in such a position, but my guess is this is not the kind of thing you had in mind.)

I am pretty skeptical of this. Eg I suspect that people like Evan (sorry Evan if you're reading this for using you as a running example) are extremely unlikely to remain unidentified, because one of the things that they do is think about things in their own time and put the results online. [...]
I am not intending to include beliefs and preferences in my definition of "great person", except for preferences/beliefs like being not very altruistic, which I do count.

I don't think these two claims are plausibly consistent, at least if "people like Evan" is also meant to exclude beliefs and preferences: For instance, if someone with Evan-level abilities doesn't believe that thinking in their own time and putting results online is a worthwhile thing to do, then the identification mechanism you appeal to will fail. More broadly, someone's actions will generally depend on all kinds of beliefs and preferences (e.g. on what they are able to do, on what people around them expect, on other incentives, ...) that are much more dependent on the environment than relatively "innate" traits like fluid intelligence. The boundary between beliefs/preferences and abilities is fuzzy, but as I suggested at the end of my previous comment, I think for the purpose of this discussion it's most useful to distinguish changes in value we can achieve (a) by changing the "environment" of existing people vs. (b) by adding more people to the pool.

Could you name a profile of such a person, and which of the types of work I named you think they'd maybe be as good at as the people I named?

What do you mean by "profile"? Saying what properties they have, but without identifying them? Or naming names or at least usernames? If the latter, I'd want to ask the people if they're OK with me naming them publicly. But in principle happy to do either of these things, as I agree it's a good way to check if my claim is plausible.

I think my definition of great might be a higher bar than yours, based on the proportion of people who I think meet it?

Maybe. When I said "they might be great", I meant something roughly like: if it was my main goal to find people great at task X, I'd want to invest at least 1-10 hours per person finding out more about how good they'd be at X (this might mean talking to them, giving them some sort of trial tasks etc.) I'd guess that for between 5 and 50% of these people I'd eventually end up concluding they should work full-time doing X or similar.

Also note that originally I meant to exclude practice/experience from the relevant notion of "greatness" (i.e. it just includes talent/potential). So for some of these people my view might be something like "if they did 2 years of deliberate practice, they would then have a 5% to 50% chance of meeting the bar for X". But I now think that probably the "marginal value from changing the environment vs. marginal value from adding more people" operationalization is more useful, which would require "greatness" to include practice/experience to be consistent with it.

If we disagree about the bar, I suspect that me having bad models about some of the examples you gave explains more of the disagreement than me generally dismissing high bars. "Functional programming" just doesn't sound to me like the kind of task with high returns to super-high ability levels, and similarly for community building; but it's plausible that there are bundles of tasks involving these things where it matters a lot if you have someone whose ability is 6 instead of 5 standard deviations above the mean (not always well-defined, but you get the idea). E.g. if your "task" is "make a painting that will be held in similar regard to the Mona Lisa" or "prove P != NP" or "be as prolific as Ramanujan at finding weird infinite series for pi", then, sure, I agree we need an extremely high bar.

For what it's worth, I think that you're not credulous enough of the possibility that the person you talked to actually disagreed with you--I think you might doing that thing whose name I forget where you steelman someone into saying the thing you think instead of the thing they think.

Thanks for pointing this out. FWIW, I think there likely is both substantial disagreement between me and that person and that I misunderstood their view in some ways.

comment by richard_ngo · 2020-06-15T16:55:48.401Z · EA(p) · GW(p)

Task X for which the claim seems most true for me is "coming up with novel and important ideas". This seems to be very heavy-tailed, and not very teachable.

I would also expect that, if I poked a bit at these claims, it would usually turn out that X is something like "contribute to this software project at the pace and quality level of our best engineers, w/o requiring any management time" or "convince some investors to give us much more money, but w/o anyone spending any time transferring relevant knowledge".

Neither of these feel like central examples of the type of thing EA needs most. Most of the variance of the impact of the software project will be in how good the idea is; same for most of the variance of the impact of getting funding.

Robin Hanson is someone who's good at generating novel and important ideas. Idk how he got that way, but I suspect it'd be very hard to design a curriculum to recreate that. Do you disagree?

comment by Max_Daniel · 2020-06-16T09:35:40.564Z · EA(p) · GW(p)
Task X for which the claim seems most true for me is "coming up with novel and important ideas". This seems to be very heavy-tailed, and not very teachable.

I agree that the impact from new ideas will be heavy tailed - i.e. a large share of the total value from new ideas will be from the few best ideas, and few people. I'd also guess that this kind of creativity is not that teachable. (Though not super certain about both.)

I feel less sure that 'new ideas' is among the things most needed in EA, when discounted by the difficulty of generating them. (I do think there probably are a number of undiscovered and highly important ideas out there, partly based on EA's track record and partly based on a sense that there are a lot of things we don't know or understand about how to make the long-term future go well.) If I had to guess where to optimally invest flexible resources at the margin, I feel highly uncertain whether it would be in "find people who're good at generating new ideas" versus things like "advance known research directions" or "accumulate AI-weighted influence/power".

comment by richard_ngo · 2020-06-16T17:11:26.470Z · EA(p) · GW(p)

People tend to underestimate the importance of ideas, because it's hard to imagine what impact they will have without doing the work of coming up with them.

I'm also uncertain how impactful it is to find people who're good at generating ideas, because the best ones will probably become prominent regardless. But regardless of that, it seems to me like you've now agreed with the three points that the influential EA made. Those weren't comparative claims about where to invest marginal resources, but rather the absolute claim that it'd be very beneficial to have more talented people.

Then the additional claim I'd make is: some types of influence are very valuable and can only be gained by people who are sufficiently good at generating ideas. It'd be amazing to have another Stuart Russell, or someone in Steven Pinker's position but more on board with EA. But they both got there by making pioneering contributions in their respective fields. So when you talk about "accumulating AI-weighted influence", e.g. by persuading leading AI researchers to be EAs, that therefore involves gaining more talented members of EA.

comment by Jonas Vollmer · 2020-06-14T15:51:48.020Z · EA(p) · GW(p)

I stumbled a bit with the framing here: I think it's often the case that you need a lot of person-internal talent (including a good attitude, altruistic commitment, etc.) to learn X.

I'd personally be excited to spend more time on mentorship of EA community members but it feels kind of hard to find potential mentees who aren't already in touch with many other mentors (either because I'm bad at finding them or because we need more "great people" or because I'm not great at mentoring people to learn X).

comment by Max_Daniel · 2020-06-15T12:09:35.812Z · EA(p) · GW(p)

I agree that, basically by definition, higher talent means higher returns on learning. My claim was not that talent is unimportant, but roughly that the answer to "Why don't we have anyone in the community who can do X?" more often is "Because no-one has spent enough effort practicing X." than it is "Because there is no EA who is sufficiently talented that they could do X well given an optimal environment, training etc.".

(More generally, I agree that the OP could do a better job at framing the debate, setting out the key considerations and alternative views etc. I hope to write an improved version in the next few months.)

comment by Max_Daniel · 2020-08-13T09:54:27.201Z · EA(p) · GW(p)

[EA's focus on marginal individual action over structure is a poor fit for dealing with info hazards.]

I tend to think that EAs sometimes are too focused on optimizing the marginal utility of individual actions as opposed to improving larger-scale structures. For example, I think it'd be good if there was as much content and cultural awareness on how to build good organizations as there is on how to improve individual cognition. (Think about how often you've heard of "self-improvement" or "rationality" as opposed to things like "organizational development".)

(Yes, this is similar to the good old 'systemic change' objection aimed at "what EAs tend to do in practice" rather than "what is implied by EAs' normative views".)

It occurred to me that one instance where this might bite in particular are info hazards.

I often see individual researchers agonizing about whether they can publish something they have written, which of several framings to use, and even which ideas are safe to mention in public. I do think that this can sometimes be really important, and that there are areas with a predictably high concentration of such cases, e.g. bio.

However, in many cases I feel like these concerns are far-fetched and poorly targeted.

  • They are far-fetched when they overestimate the effects a marginal publication by a non-prominent person can have on the world. E.g. the US government isn't going to start an AGI project because you posted a thought on AI timelines on LessWrong.
  • They are poorly targeted when they focus on the immediate effects of marginal individual action. E.g., how much does my paper contribute to 'AI capabilities'? What connotations will readers read into different terms I could use for the same concept?

On the other hand, in such cases there often are important info hazards in the areas these researchers work in. For example, I think it's at least plausible that there is true information on, say, the prospects and paths to transformative AI, that would be bad to bring to the attention of, say, senior US or Chinese government officials.

It's not the presence of these hazards but the connection with typical individual researcher actions that I find dubious. To address these concerns, rather than forward chaining from individual action one considers to take for other reasons, I suspect it'd be more fruitful to backward-chain from the location of large adverse effects (e.g. the US government starting an AGI project, if you think that's bad). I suspect this would lead to a focus on structure for the analysis, and a focus on policy for solutions. Concretely, questions like:

  • What are the structural mechanisms for how information gets escalated to higher levels of seniority within, e.g., the US government or Alphabet?
  • Given current incentives, how many publications of potentially hazardous information do we expect, and through which channels?
  • What are mechanisms that can massively amplify the visibility of information? E.g. when will media consider something newsworthy, when and how do new academic subfields form?
comment by Max_Daniel · 2020-01-08T14:26:22.019Z · EA(p) · GW(p)

[Some of my tentative and uncertain views on AI governance, and different ways of having impact in that area. Excerpts, not in order, from things I wrote in a recent email discussion, so not a coherent text.]

1. In scenarios where OpenAI, DeepMind etc. become key actors because they develop TAI capabilities, our theory of impact will rely on a combination of affecting (a) 'structure' and (b) 'content'. By (a) I roughly mean what the relevant decision-making mechanisms look like irrespective of the specific goals and resources of the actors the mechanism consists of; e.g., whether some key AI lab is a nonprofit or a publicly traded company; who would decide by what rules/voting scheme how windfall profits would be redistributed; etc. By (b) I mean something like how much the CEO of a key firm, or their advisors, care about the long-term future. -- I can see why relying mostly on (b) is attractive, e.g. it's arguably more tractable; however, some EA thinking (mostly from the Bay Area / the rationalist community to be honest) strikes me as focusing on (b) for reasons that seem ahistoric or otherwise dubious to me. So I don't feel convinced that what I perceive to be a very stark focus on (b) is warranted. I think that figuring out if there are viable strategies that rely more on (a) is better done from within institutions that have no ties with key TAI actors, and also might be best done by people that don't quite match the profile of the typical new EA that got excited about Superintelligence or HPMOR. Overall, I think that making more academic research in broadly "policy relevant" fields happen would be a decent strategy if one ultimately wanted to increase the amount of thinking on type-(a) theories of impact.

2. What's the theory of impact if TAI happens in more than 20 years? More than 50 years? I think it's not obvious whether it's worth spending any current resources on influencing such scenarios (I think they are more likely but we have much less leverage). However, if we wanted to do this, then I think it's worth bearing in mind that academia is one of few institutions (in a broad sense) that has a strong track record of enabling cumulative intellectual progress over long time scales. I roughly think that, in a modal scenario, no-one in 50 years is going to remember anything that was discussed on the EA Forum or LessWrong, or within the OpenAI policy team, today (except people currently involved); but if AI/TAI was still (or again) a hot topic then, I think it's likely that academic scholars will read academic papers by Dafoe, his students, the students of his students etc. Similarly, based on track records I think that the norms and structure of academia are much better equipped than EA to enable intellectual progress that is more incremental and distributed (as opposed to progress that happens by way of 'at least one crisp insight per step'; e.g. the Astronomical Waste argument would count as one crisp insight); so if we needed such progress, it might make sense to seed broadly useful academic research now. 


My view is closer to "~all that matters will be in the specifics, and most of the intuitions and methods for dealing with the specifics are either sort of hard-wired or more generic/have different origins than having thought about race models specifically". A crux here might be that I expect most of the tasks involved in dealing with the policy issues that would come up if we got TAI within the next 10-20 years to be sufficiently similar to garden-variety tasks involved in familiar policy areas that as a first pass: (i) if theoretical academic research was useful, we'd see more stories of the kind "CEO X / politician Y's success was due to idea Z developed through theoretical academic research", and (ii) prior policy/applied strategy experience is the background most useful for TAI policy, with usefulness increasing with the overlap in content and relevant actors; e.g.: working with the OpenAI policy team on pre-TAI issues > working within Facebook on a strategy for how to prevent the government from splitting up the firm in case a left-wing Democrat wins > business strategy for a tobacco company in the US > business strategy for a company outside of the US that faces little government regulation > academic game theory modeling. That's probably too pessimistic about the academic path, and of course it'll depend a lot on the specifics (you could start in academia to then get into Facebook etc.), but you get the idea.


Overall, the only somewhat open question for me is whether ideally we'd have (A) ~only people working quite directly with key actors or (B) a mix of people working with key actors and more independent ones e.g. in academia. It seems quite clear to me that the optimal allocation will contain a significant share of people working with key actors [...]

If there is a disagreement, I'd guess it's located in the following two points: 

(1a) How big are countervailing downsides from working directly with, or at institutions having close ties with, key actors? Here I'm mostly concerned about incentives distorting the content of research and strategic advice. I think the question is broadly similar to: If you're concerned about the impacts of British rule on India in the 1800s, is it best to work within the colonial administration? If you want to figure out how to govern externalities from burning fossil fuels, is it best to work in the fossil fuel industry? I think the cliche left-wing answer to these questions is too confident in "no" and is overlooking important upsides, but I'm concerned that some standard EA answers in the AI case are too confident in "yes" and are overlooking risks. Note that I'm most concerned about a kind of "benign" or "epistemic" failure mode: I think it's reasonably easy to tell people with broadly good intentions apart from sadists or even personal-wealth maximizers (at least in principle -- whether this will get implemented is another question); I think it's much harder to spot cases like key people incorrectly believing that it's best if they keep as much control for themselves/their company as possible because after all they are the ones with both good intentions and an epistemic advantage (note that all of this really applies to a colonial administration with little modification, though here in cases such as the "Congo Free State" even the track record of "telling personal-wealth maximizers apart from people with humanitarian intentions" maybe isn't great -- also NB I'm not saying that this argument would necessarily be unsound; i.e. I think that in some situations these people would be correct).

(1b) To what extent do we need (a) novel insights as opposed to (b) an application of known insights or common-sense principles? E.g., I've heard claims that the sale of telecommunication licenses by governments is an example where post-1950 research-level economics work in auction theory has had considerable real-world impact, and this kind of auction theory strikes me as reasonably abstract and in little need of having worked with either governments or telecommunication firms. Supposing this is true (I haven't really looked into this), how many opportunities of this kind are there in AI governance? I think the case for (A) is much stronger if we need little to no (a), as I think the upsides from trust networks etc. are mostly (though not exclusively) useful for (b). FWIW, my private view actually is that we probably need very little of (a), but I also feel like I have a poor grasp of this, and I think it will ultimately come down to what high-level heuristics to use in such a situation.

comment by Aaron Gertler (aarongertler) · 2020-01-16T23:27:04.356Z · EA(p) · GW(p)

I found this really fascinating to read. Is there any chance that you might turn it into a "coherent text" at some point?

I especially liked the question on possible downsides of working with key actors; orgs in a position to do this are often accused of collaborating in the perpetuation of bad systems (or something like that), but rarely with much evidence to back up those claims. I think your take on the issue would be enlightening.

comment by Max_Daniel · 2020-01-17T12:00:49.073Z · EA(p) · GW(p)

Thanks for sharing your reaction! There is some chance that I'll write up these and maybe other thoughts on AI strategy/governance over the coming months, but it depends a lot on my other commitments. My current guess is that it's maybe only 15% likely that I'll think this is the best use of my time within the next 6 months.

comment by Max_Daniel · 2020-03-23T09:29:58.406Z · EA(p) · GW(p)

[Epistemic status: speculation based on priors about international organizations. I know next to nothing about the WHO specifically.]

[On the WHO declaring COVID-19 a pandemic only (?) on March 12th. Prompted by this Facebook discussion on epistemic modesty [EA · GW] on COVID-19.]

- [ETA: this point is likely wrong, cf. Khorton's comment [EA(p) · GW(p)] below. However, I believe the conclusion that the timing of WHO declarations by itself doesn't provide a significant argument against epistemic modesty still stands, as I explain in a follow-up comment below [EA(p) · GW(p)].] The WHO declaring a pandemic has a bunch of major legal and institutional consequences. E.g. my guess is that among other things it affects the amounts of resources the WHO and other actors can utilize, the kind of work the WHO and others are allowed to do, and the kind of recommendations the WHO can make.

- The optimal time for the WHO to declare a pandemic is primarily determined by these legal and institutional consequences. Whether COVID-19 is or will in fact be a pandemic in the everyday or epidemiological sense is an important input into the decision, but not a decisive one.

- Without familiarity with the WHO and the legal and institutional system it is a part of, it is very difficult to accurately assess the consequences of the WHO declaring a pandemic. Therefore, it is very hard to evaluate the timing of the WHO's declaration without such familiarity. And being even maximally well-informed about COVID-19 itself isn't even remotely sufficient for an accurate evaluation.

- The bottom line is that the WHO officially declaring that COVID-19 is a pandemic is a totally different thing from any individual persuasively arguing that COVID-19 is or will be a pandemic. In a language that would accurately reflect differences in meaning, me saying that COVID-19 is a pandemic and the WHO declaring COVID-19 is a pandemic would be done using different words. It is simply not the primary purpose of this WHO speech act to be an early, accurate, reliable, or whatever indicator of whether "COVID-19 is a pandemic", to predict its impact, or any other similar thing. It isn't primarily epistemic in any sense.

- If just based on information about COVID-19 itself someone confidently thinks that the WHO ought to have declared a pandemic earlier, they are making a mistake akin to the mistake reflected by answering "yes" to the question "could you pass me the salt?" without doing anything.

So did the WHO make a mistake by not declaring COVID-19 to be a pandemic earlier, and if so how consequential was it? Well, I think the timing was probably suboptimal just because my prior is that most complex institutions aren't optimized for getting the timing of such things exactly right. But I have no idea how consequential a potential mistake was. In fact, I'm about 50-50 on whether the optimal time would have been slightly earlier or slightly later. (Though substantially earlier seems significantly more likely optimal than substantially later.)

comment by Khorton · 2020-03-23T14:02:25.399Z · EA(p) · GW(p)

"The WHO declaring a pandemic has a bunch of major legal and institutional consequences. E.g. my guess is that among other things it affects the amounts of resources the WHO and other actors can utilize, the kind of work the WHO and others are allowed to do, and the kind of recommendations the WHO can make."

Are you sure about this? I've read that there aren't major implications to it being officially declared a pandemic.

This article suggests there aren't major changes based on 'pandemic' status

comment by Max_Daniel · 2020-03-25T16:05:59.869Z · EA(p) · GW(p)

[Epistemic status: info from the WHO website and Wikipedia, but I overall invested only ~10 min, so might be missing something.]

Under the 2005 International Health Regulations (IHR), states have a legal duty to respond promptly to a PHEIC.
[Note by me: The International Health Regulations include multiple instances of "public health emergency of international concern". By contrast, they include only one instance of "pandemic", and this is in the term "pandemic influenza" in a formal statement by China rather than the main text of the regulation.]
  • The WHO declared a PHEIC due to COVID-19 on January 30th.
  • The OP was prompted by a claim that the timing of the WHO using the term "pandemic" provides an argument against epistemic modesty. (Though I appreciate this was less clear in the OP than it could have been, and maybe it was a bad idea to copy my Facebook comment here anyway.) From the Facebook comment I was responding to:
For example, to me, the WHO taking until ~March 12 to call this a pandemic*, when the informed amateurs I listen to were all pretty convinced that this will be pretty bad since at least early March, is at least some evidence that trusting informed amateurs has some value over entirely trusting people usually perceived as experts.
  • Since the WHO declaring a PHEIC seems much more consequential than them using the term "pandemic", the timing of the PHEIC declaration seems more relevant for assessing the merits of the WHO response, and thus for any argument regarding epistemic modesty.
  • Since the PHEIC declaration happened significantly earlier, any argument based on the premise that it happened too late is significantly weaker. And whatever the apparent initial force of this weaker argument, my undermining response from the OP still applies.
  • So overall, while the OP's premise appealing to major legal/institutional consequences of the WHO using the term "pandemic" seems false, I'm now even more convinced of the key claim I wanted to argue for: that the WHO response does not provide an argument against epistemic modesty in general, nor for the epistemic superiority of "informed amateurs" over experts on COVID-19.
comment by Lukas_Gloor · 2020-03-25T21:30:30.044Z · EA(p) · GW(p)

About declaring it a "pandemic," I've seen the WHO reason as follows (me paraphrasing):

«Once we call it a pandemic, some countries might throw up their hands and say "we're screwed," so we should better wait before calling it that, and instead emphasize that countries need to try harder at containment for as long as there's still a small chance that it might work.»

So overall, while the OP's premise appealing to major legal/institutional consequences of the WHO using the term "pandemic" seems false, I'm now even more convinced of the key claim I wanted to argue for: that the WHO response does not provide an argument against epistemic modesty in general, nor for the epistemic superiority of "informed amateurs" over experts on COVID-19.

Yeah, I think that's a good point.

I'm not sure I can have updates in favor or against modest epistemology because it seems to me that my true rejection is mostly "my brain can't do that." But if I could have further updates against modest epistemology, the main Covid-19-related example for me would be how long it took some countries to realize that flattening the curve instead of squishing it is going to lead to a lot more deaths and tragedy than people seem to have initially thought. I realize that it's hard to distinguish between what's actual government opinion versus what's bad journalism, but I'm pretty confident there was a time when informed amateurs could see that experts were operating under some probably false or at least dubious assumptions. (I'm happy to elaborate if anyone's interested.)

comment by MichaelStJules · 2020-03-25T23:31:36.510Z · EA(p) · GW(p)
For example, to me, the WHO taking until ~March 12 to call this a pandemic*, when the informed amateurs I listen to were all pretty convinced that this will be pretty bad since at least early March, is at least some evidence that trusting informed amateurs has some value over entirely trusting people usually perceived as experts.

Also, predicting that something will be pretty bad or will be a pandemic is not the same as saying it is now a pandemic. When did it become a pandemic according to the WHO's definition?

Expanding a quote I found on the wiki page in the transcript here from 2009:

Dr Fukuda: An easy way to think about pandemic – and actually a way I have some times described in the past – is to say: a pandemic is a global outbreak. Then you might ask yourself: “What is a global outbreak”? Global outbreak means that we see both spread of the agent – and in this case we see this new A(H1N1) virus to most parts of the world – and then we see disease activities in addition to the spread of the virus. Right now, it would be fair to say that we have an evolving situation in which a new influenza virus is clearly spreading, but it has not reached all parts of the world and it has not established community activity in all parts of the world. It is quite possible that it will continue to spread and it will establish itself in many other countries and multiple regions, at which time it will be fair to call it a pandemic at that point. But right now, we are really in the early part of the evolution of the spread of this virus and we will see where it goes.

But see also WHO says it no longer uses 'pandemic' category, but virus still emergency from February 24, 2020.

comment by Max_Daniel · 2020-03-23T16:31:48.605Z · EA(p) · GW(p)

Thank you for pointing this out! It sounds like my guess was probably just wrong.

My guess was based on a crude prior on international organizations, not anything I know about the WHO specifically. I clarified the epistemic status in the OP.

comment by Max_Daniel · 2020-06-26T12:46:17.072Z · EA(p) · GW(p)

[Context: This is a research proposal I wrote two years ago for an application. I'm posting it here because I might want to link to it. I plan to spend a few weeks looking into a subquestion: how heavy-tailed is EA talent, and what does this imply for EA community building?]

Research proposal: Assess claims that "impact is heavy-tailed"

Why is this valuable?

EAs frequently have to decide how many resources to invest in estimating the utility of their available options; e.g.:

  • How much time to invest to identify the best giving opportunity?
  • How much research to do before committing to a small set of cause areas?
  • When deciding whether to hire someone now or a more talented person in the future, when is it worth waiting?

One major input to such questions is how heavy-tailed the distribution of altruistic impact is: The better the best options are relative to a random option, the more valuable it is to identify the best options.

Claims like “impact is heavy-tailed” are widely accepted in the EA community—with major strategic consequences (e.g. [1], “Talent is high variance”)—but have sometimes been questioned [2 [LW · GW], 3, 4, 5].

These claims are often made in an imprecise way, which makes it hard to estimate the extent of their practical implications (should you spend a month or a year doing research before deciding?), and hard to check if one actually disagrees about them. E.g., is the claim that we can now see that Einstein did much more for progress in physics than 90% of the world population at his time, or that in 1900 our subjective expected value for the progress Einstein would make would have been much higher than the value for a random physics graduate student, or something in between?

Suggested approach

1. Collect several claims of this type that have been made.
2. Review statistical measures of heavy-tailedness.
3. Limit the project’s scope appropriately. E.g., focus just on the claim that “talent is heavy-tailed” and its implications for community building.
4. Refine claims into precise candidate versions, i.e. something like “looking backwards, the empirical distribution of the number of published papers by researcher looks like it was sampled from a distribution that doesn’t have finite variance” rather than “researcher talent is heavy-tailed”.
5. Assess the veracity of those claims, based on published arguments about them and general properties of heavy-tailed distributions (e.g. [6]). Perhaps gather additional data.
6. Write up the results in an accessible way that highlights the true, precise claims and their practical implications.
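Step 4's distinction can be illustrated with a small simulation (a sketch only; the distributions and parameters below are hypothetical choices for illustration, not estimates of any real talent or impact distribution). Under a Pareto distribution with infinite variance, the top 1% of samples account for a large share of the total; under a thin-tailed distribution, their share is close to 1%.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Samples of 'impact per option' under two hypothetical assumptions:
# a classical Pareto distribution with alpha = 1.5 (infinite variance,
# heavy-tailed in the strong sense of step 4) vs. a thin-tailed baseline.
pareto = rng.pareto(1.5, n) + 1          # shift Lomax samples to Pareto(x_m=1)
normal = np.abs(rng.normal(10, 2, n))    # thin-tailed comparison

def top_share(x, frac=0.01):
    """Fraction of total impact contributed by the top `frac` of samples."""
    k = max(1, int(len(x) * frac))
    x = np.sort(x)[::-1]
    return x[:k].sum() / x.sum()

print(f"top 1% share, Pareto(1.5): {top_share(pareto):.0%}")
print(f"top 1% share, normal:      {top_share(normal):.0%}")
```

Under the Pareto assumption, identifying the best options is worth far more research effort than under the normal assumption, which is the practical upshot the proposal is after.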


Risks

  • There probably are good reasons why "impact is heavy-tailed" is widely accepted. I'm therefore unlikely to produce actionable results.
  • The proposed level of analysis may be too general.
comment by Tobias_Baumann · 2020-06-30T16:31:20.129Z · EA(p) · GW(p)

This could be relevant. It's not about the exact same question (it looks at the distribution of future suffering, not of impact) but some parts might be transferable.

comment by Max_Daniel · 2020-07-20T14:22:02.633Z · EA(p) · GW(p)

[A failure mode of culturally high mental health awareness.]

In my experience, there is a high level of mental health awareness in the EA community. That is, people openly talk about mental health challenges such as depression, and many will know about how to help people facing such challenges (e.g. by helping them to get professional treatment). At least more so than in other communities I've known.

I think this is mostly great, and on net much preferable over low mental health awareness.

However, I recently realized one potential failure mode: There is a risk of falsely overestimating another individual's mental health awareness. For example, suppose I talk to an EA who appears to struggle with depression; I might then think "surely they know that depression is treatable, and most likely they're already doing CBT", concluding there isn't much I can do to help. I might even think "it would be silly for me to mention CBT because it's common knowledge that depression can often be treated that way, and stating facts that are common knowledge is at best superfluous and at worst insulting (because I'd imply the other person might lack some kind of basic knowledge)".

Crucially, this would be a mistake even if I was correct that the person I'm talking to was, by virtue of exposure to the EA community, more likely than usual to have heard of CBT. This is because of a large asymmetry in value: It can be extremely valuable for both one's personal well-being and one's expected impact on the world to e.g. start treatment for depression; the cost of saying something obvious or even slightly annoying pales by comparison.

This suggests a few lessons:

  • If you may be able to help someone cope with mental health challenges, don't forfeit that opportunity just because you assume they've already got all the help they could get. (Of course, there may be other valid reasons: e.g., it could be too costly, or inappropriate in a particular context.) This holds even if you've recently talked to many similar people of whom it was true.
  • If you're still worried about stating the obvious, it may help to be explicit about why. E.g., "I know you may be aware of this, but I'd like to mention something because I think it could have outsized importance if not".
comment by Max_Daniel · 2020-07-20T13:47:03.905Z · EA(p) · GW(p)

[A rebuttal of one argument for hard AI takeoff due to recursive self-improvement which I'm not sure anyone was ever making.

Wrote this as a comment in a Google doc, but thought I might want to link to it sometimes.]

I'm worried that naive appeals to self-improvement are sometimes due to a fallacious inference from current AIs / computer programs to advanced AIs. I.e. the implicit argument is something like:

1. It's much easier to modify code than to do brain surgery.
2. Therefore, advanced AI (self-)improvement will be much easier than human (self-)improvement.

But my worry is that the actual reason why "modifying code" seems more feasible to us is the large complexity difference between current computer programs and human cognition. Indeed, at a physicalist level, it's not at all clear that moving around the required matter on a hard disk or SSD or in RAM or whatever is easier than moving around the required matter in a brain. The difference instead is that we have developed intermediate levels of abstraction (from assembly to high-level programming languages) that massively facilitate the editing process -- they bridge the gap between the hardware level and our native cognition in just the right way. But especially to someone with functionalist or otherwise substrate-neutral inclinations it may seem likely that the key feature that enabled us to construct the intermediate-level abstractions was precisely the low complexity of the "target cognition" compared to our native cognition.

comment by RyanCarey · 2020-07-20T14:08:30.281Z · EA(p) · GW(p)

To evaluate its editability, we can compare AI code to code, and to the human brain, along various dimensions: storage size, understandability, copyability, etc. (i.e. let's decompose "complexity" into "storage size" and "understandability" to ensure conceptual clarity)

For size, AI code seems more similar to humans. AI models are already pretty big, so may be around human-sized by the time a hypothetical AI is created.

For understandability, I would expect it to be more like code than like a human brain. After all, it's created with a known design and objective that was built intentionally. Even if the learned model has a complex architecture, we should be able to understand its relatively simpler training procedure and incentives.

And then, AI code will, like ordinary code - and unlike the human brain - be copyable and have a digital storage medium, which are both potentially critical factors for editing.

Size (i.e. storage complexity) doesn't seem like a very significant factor here.

I'd guess the editability of AI code would resemble the editability of code moreso than that of a human brain. But even if you don't agree, I think this points at a better way to analyse the question.

comment by Max_Daniel · 2020-07-20T14:34:04.474Z · EA(p) · GW(p)

Agree that looking at different dimensions is more fruitful.

I also agree that size isn't important in itself, but it might correlate with understandability.

I may overall agree with AI code understandability being closer to code than the human brain. But I think you're maybe a bit quick here: yes, we'll have a known design and intentional objective on some level. But this level may be quite far removed from "live" cognition. E.g. we may know a lot about developmental psychology or the effects of genes and education, but not a lot about how to modify an adult human brain in order to make specific changes. The situation could be similar from an AI system's perspective when trying to improve itself.

Copyability does seem like a key difference that's unlikely to change as AI systems become more advanced. However, I'm not sure if it points to rapid takeoffs as opposed to orthogonal properties. (Though it does if we're interested in how quickly the total capacity of all AI systems grows, and assume hardware overhang plus sufficiently additive capabilities between systems.) To the extent it does, the mechanism seems to be relevantly different from recursive self-improvement - more like "sudden population explosion".

comment by Max_Daniel · 2020-07-20T14:36:38.582Z · EA(p) · GW(p)

Well, I guess copyability would help with recursive self-improvement as follows: it allows running many experiments in parallel to test the effects of marginal changes.

comment by JP Addison (jpaddison) · 2020-07-20T14:32:46.622Z · EA(p) · GW(p)

I would expect advanced AI systems to still be improvable in a way that humans are not. You might lose all ability to see inside the AI's thinking process, but you could still make hyperparameter tweaks. You can also make hyperparameter tweaks to humans, but unless you think AIs will take 20 years to train, it still seems easier than comparable human improvement.

comment by Max_Daniel · 2020-07-20T16:48:52.271Z · EA(p) · GW(p)

Fair point. It seems that the central property of AI systems this arguments rests on is their speed, or the time until you get feedback. I agree it seems likely that AI training time (and then ability to evaluate performance on withheld test data or similar) in wall-clock speed will be shorter than feedback loops for humans (e.g. education reforms, genetic engineering, ...).

However, some ways in which this could fail to enable rapid self-improvement:

  • The speed advantage could be offset by other differences, e.g. even less interpretable "thinking processes".
  • Performance at certain tasks may be bottlenecked by feedback from slow real-world interactions. (If sim2real transfer doesn't work well for some tasks.)
comment by Max_Daniel · 2020-07-20T13:21:24.075Z · EA(p) · GW(p)

**Would some moral/virtue intuitions prefer small but non-zero x-risk?**

[Me trying to lean into a kind of philosophical reasoning I don't find plausible. Not important, except perhaps as cautionary tale for what kind of things could happen if you wanted to base the case for reducing x-risk on purely non-consequentialist reasons.]

(Inspired by a conversation with Toby Newberry about something else.)

The basic observation: we sometimes think that a person achieving some success is particularly praiseworthy, remarkable, virtuous, or similar if they could have failed. (Or if they needed to expend a lot of effort, suffered through hardship, etc.)

Could this mean that we removed one source of value if we reduced x-risk to zero? Achieving our full potential would then no longer constitute a contingent achievement - it would be predetermined, with failure no longer on the table.

We can make the thought more precise in a toy model: Suppose that at some time t_0 x-risk is permanently reduced to zero. The worry is that acts happening after t_0 (or perhaps acts of agents born after t_0), even if they produce value, are less valuable in one respect: In their role as part of us eventually achieving our full potential, they can no longer fail. More broadly, humanity's great generation-spanning project (whatever that is) can no longer fail. Those humans living after t_0 therefore have a less valuable part in that project. They merely help push along a wagon that was set firmly in its tracks by their ancestors. Their actions may have various valuable consequences, but they no longer increase the probability of humanity's grand project succeeding.

(Similarly, if we're inclined to regard humanity as a whole, or generations born after t_0, as moral agents in their own right we might worry that zero x-risk detracts from the value of their actions.)

Some intuition pumps:

  • Suppose that benevolent parents arrange things (perhaps with the help of fantastic future technology) in such a way that their child will with certainty have a highly successful career. (They need not remove that child's agency or autonomy: perhaps the child can choose if she'll become a world-class composer, scientist, political leader, etc. - but no matter what she does, she'll excel and will be remembered for her great achievements.) We can even suppose that it will seem to the child as if she could have failed, and that she did experience small-scale setbacks: perhaps she went rock climbing, slipped, and broke her leg; but unbeknownst to her she was watched by invisible drones that would have caught her if she would otherwise have died. And so on for every single moment of her life.
  • Suppose that the matriarch of a family dynasty arranges things in such a way that the family business's success will henceforth be guaranteed. Stock price, cash flow, and so on may still go up and down, but the firm can no longer go bankrupt and will broadly be the most profitable company in its sector, forever. Her heirs might on some occasions think that their management has turned things around, and prevented the company's downfall. But really the shots were called by a great grandmother generations ago.

We may think that there is something objectionable, even dystopian about these situations. At the very least, we may think that the apparent successes of the child prodigy or the family business leaders count for less because, in one respect, they could not have failed.

If we give a lot of weight to such worries we may not want to eliminate x-risk. Instead, perhaps, we'd conclude that it's best to carefully manage x-risk: at any point, it should not be so high that we run an irresponsible risk of squandering all future potential - but it also should not be so low that our children are robbed of more value than we protect.


Some reasons why I think this is either implausible or irrelevant:

  • I think the intuitions underlying the basic observation aren't about axiology, probably not even about normative ethics in any sense. I think they are best explained away by evolutionary debunking arguments, or taken to be heuristics about instrumental value, or (if one wants to take them as seriously as plausibly defensible) about some primitive notion of 'virtue' or 'praiseworthiness' that's different from axiology.
  • This at least partly relies on intuitions about the import of "could have done otherwise" claims, which may be confused or deceptive anyway. For example, we might think that Frankfurt-style examples show that such claims aren't actually relevant for moral responsibility, despite appearing to be at first glance. And we might think this is a reason to also not take them seriously in this context.
  • Even if there's something to this, on plausible views the loss of value from removing x-risk would be outweighed by the gains.
  • Reducing x-risk to exactly zero, as opposed to something tiny, is implausible anyway. (Though we might worry that all of this applies if x-risk falls below some positive threshold, or that there is some gradual loss of value as x-risk falls.)
  • The worry straightforwardly only applies in scenarios where we can permanently reduce x-risk to zero, no matter what future agents do. This seems extremely implausible. Even if we could reduce x-risk to zero, most likely it would require the continuing effort of future agents to keep it at zero.
comment by Larks · 2020-08-06T03:48:57.691Z · EA(p) · GW(p)

Interesting post. I think I have a couple of thoughts, please forgive the unedited nature.

One issue is whether more than one person can get credit for the same event. If this is the case, then both the climber girl and the parents can get credit for her surviving the climb (after all, both their actions were sufficient). Similarly, both we and the future people can get credit for saving the world.

If not, then only one person can get the credit for every instance of world saving. Either we can harvest them now, or we can leave them for other people to get. But the latter strategy involves the risk that they will remain unharvested, leading to a reduction in the total quantity of creditworthiness mankind accrues. So from the point of view of an impartial maximiser of humanity's creditworthiness, we should seize as many as we can, leaving as little as possible for the future.

Secondly, as a new parent I see the appeal of the invisible robots of deliverance! I am keen to let the sproglet explore and stake out her own achievements, but I don't think she loses much when I keep her from dying. She can get plenty of moral achievement from ascending to new heights, even if I have sealed off the depths.

Finally, there is of course the numerical consideration that even if facing a 1% risk of extinction carried some inherent moral glory, it would also reduce the value of all subsequent things by 1% (in expectation). Unless you think the benefit from our children, rather than us, overcoming that risk is large compared to the total value of the future of humanity, it seems like we should probably deny them it.
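Larks's closing expected-value comparison can be written out as a toy calculation (all values hypothetical; `G` stands in for whatever inherent "moral glory" overcoming the risk would carry):

```python
# Toy numbers: V is the value of humanity's future, normalized to 1;
# p is the extinction risk our children would face and could overcome.
V = 1.0
p = 0.01

# Leaving the risk in place: with probability (1 - p) humanity survives
# and gains both the future's value and the glory G of having overcome it.
def ev_with_risk(G):
    return (1 - p) * (V + G)

# Eliminating the risk now secures V but forgoes G. Leaving the risk is
# better only if (1 - p) * (V + G) > V, i.e. G > p * V / (1 - p).
G_break_even = p * V / (1 - p)
print(f"glory must exceed {G_break_even:.4f} of the future's total value")
```

On these numbers the glory would need to be worth about 1% of the entire future, which is the sense in which "we should probably deny them it" unless one values the children's overcoming of the risk very highly.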

comment by Max_Daniel · 2020-08-06T07:38:07.927Z · EA(p) · GW(p)

Thanks, this all makes sense to me. Just one quick comment:

So from the point of view of an impartial maximiser of humanity's creditworthiness, we should seize as many as we can, leaving as little as possible for the future.

If I understand you correctly, your argument for this conclusion assumed that the total number of world-saving instances is fixed independently of anyone's actions. But I think in practice this is wrong, i.e. the number of world-saving opportunities is endogenous to people's actions including in particular whether they reap current world-saving opportunities.

Oversimplified example: perhaps currently there is one world-saving instance per year from Petrov-style incidents, i.e. countries not launching a nuclear strike in response to a false alarm of a nuclear attack. But if there was a breakthrough in nuclear disarmament that reduced nuclear stockpiles to zero this would also eliminate these future world-saving opportunities.

[Oversimplified b/c in fact a nuclear exchange isn't clearly an x-risk.]

comment by Larks · 2020-08-06T18:38:12.132Z · EA(p) · GW(p)

Hey, yes - I would count that nuclear disarmament breakthrough as being equal to the sum of those annual world-saving instances. So you're right that the number of events isn't fixed, but their measure (as in the % of the future of humanity saved) is bounded.

comment by matthew.vandermerwe · 2020-08-07T08:48:56.550Z · EA(p) · GW(p)

Nice post. I’m reminded of this Bertrand Russell passage:

“all the labours of the ages, all the devotion, all the inspiration, all the noonday brightness of human genius, are destined to extinction in the vast death of the solar system, and that the whole temple of Man's achievement must inevitably be buried beneath the debris of a universe in ruins ... Only within the scaffolding of these truths, only on the firm foundation of unyielding despair, can the soul's habitation henceforth be safely built.” —A Free Man’s Worship, 1903

I take Russell as arguing that the inevitability (as he saw it) of extinction undermines the possibility of enduring achievement, and that we must therefore either ground life’s meaning in something else, or accept nihilism.

At a stretch, maybe you could run your argument together with Russell's — if we ground life’s meaning in achievement, then avoiding nihilism requires that humanity neither go extinct nor achieve total existential security.

comment by Lukas_Gloor · 2020-08-06T13:36:37.584Z · EA(p) · GW(p)

Related: Relationships in a post-singularity future can also be set up to work well, so that the setup overdetermines any efforts by the individuals in them.

To me, that takes away the whole point. I don't think this would feel less problematic if somehow future people decided to add some noise to the setup, such that relationships occasionally fail.

The reason I find any degree of "setup" problematic is that it seems to emphasize the self-oriented benefits one gets out of relationships, and to de-emphasize the other person's identity as something independent of you. It's romantic to think that there's a soulmate out there who would be just as happy to find you as you are about finding them. It's not that romantic to think about creating your soulmate with the power of future technology (or society doing this for you).

This is the "person-affecting intuition for thinking about soulmates." If the other person exists already, I'd be excited to meet them, and would be motivated to put in a lot of effort to make things work, as opposed to just giving up on myself in the face of difficulties. By contrast, if the person doesn't exist yet or won't exist in a way independent of my actions, I feel like there's less of a point/appeal to it.