Hmm, I think this is fair, rereading that comment.
I feel a bit confused here, since at the scale that Robin is talking about, timelines and takeoff speeds seem very inherently intertwined (like, if Robin predicts really long timelines, this clearly implies a much slower takeoff speed, especially when combined with gradual continuous increases). I agree there is a separate competitiveness dimension that you and Robin are closer on, which is important for some of the takeoff dynamics, but on overall takeoff speed, I feel like you are closer to Eliezer than Robin (Eliezer predicting weeks to months to cross the general intelligence human->superhuman gap, you predicting single-digit years to cross that gap, and Hanson predicting decades to cross that gap). Though it's plausible that I am missing something here.
In any case, I agree that my summary of your position here is misleading, and will edit accordingly.
In expectation, I would definitely expect the week doing ELK to have had pretty good effects on your community-building, though I don't think the payoff is particularly guaranteed, so my guess would be "Yes".
Things like engaging with ELK, thinking through Eliezer's List O' Doom, and thinking through some of the basics of biorisk all seem quite valuable to me, and my takes on those issues are very deeply entangled with a lot of the community-building decisions I make, so I expect similar effects for you.
My hot-take for the EA Forum team (and for most of CEA in-general) is that it would probably increase its impact on the world a bunch if people on the team participated more in object-level discussions and tried to combine their models of community-building more with their models of direct-work.
I've tried pretty hard to stay engaged with the AI Alignment literature and the broader strategic landscape during my work on LessWrong, and I think that turned out to be really important for how I thought about LW strategy.
I indeed think it isn't really possible for the EA Forum team to not be making calls about what kind of community-building work needs to happen. I don't think anyone else at CEA really has the context to think about the impact of various features on the EA Forum, and the team is inevitably going to have to make a lot of decisions that will have a big influence on the community, in a way that makes it hard to defer.
Then the post gives some evidence that, at each stage of his career, Yudkowsky has made a dramatic, seemingly overconfident prediction about technological timelines and risks - and at least hasn’t obviously internalised lessons from these apparent mistakes.
I am confused about you bringing in the claim of "at each stage of his career", given that the only two examples you cited that seemed to provide much evidence here were from the same (and very early) stage of his career. Of course, you might have other points of evidence that point in this direction, but I did want to provide some additional pushback on the "at each stage of his career" point, which I think you didn't really provide evidence for.
I do think finding evidence for each stage of his career would of course be time-consuming, and I understand that you didn't really want to go through all of that, but it seemed good to point out explicitly.
Ultimately, I don’t buy the comparison. I think it’s really out-of-distribution for someone in their late teens and early twenties to pro-actively form the view that an emerging technology is likely to kill everyone within a decade, found an organization and devote years of their professional life to address the risk, and talk about how they’re the only person alive who can stop it.
FWIW, in my teens I did indeed dedicate a good chunk of my time and effort to privacy efforts, out of concern about US- and UK-based surveillance-state developments. I was in high-school, so making it my full-time effort was a bit hard, though I did help found a hackerspace in my hometown that had a lot of privacy concerns baked into its culture, and I wrote a good number of essays on the topic. I think the key difference between me and Eliezer here is that Eliezer was home-schooled and had experience doing things on his own, not some other fact about his relationship to the ideas being very different.
It's plausible you should update similarly on me, which I think isn't totally insane (I do think I might have, as Luke put it, the "taking ideas seriously gene", which I would also associate with taking other ideas to their extremes, like religious beliefs).
Hmm, I think these are good points. My best guess is that I don't think we would have a strong connection to Hanson without Eliezer, though I agree that that kind of credit is harder to allocate (and it gets fuzzy what we even mean by "this community" as we extend into counterfactuals like this).
I do think the timeline here provides decent evidence in favor of less credit allocation (and I think against the stronger claim "we wouldn't have a culture of [forecasting and predictions] without Eliezer"). My guess is in terms of causing that culture to take hold, Eliezer is probably still the single most-responsible individual, though I do now expect (after having looked into a bunch of comment threads from 1996 to 1999 and seeing many familiar faces show up) that a lot of the culture would show up without Eliezer.
Yes, definitely much more than Philip Tetlock, given that our community had strong norms of forecasting and making bets before Tetlock had done most of his work on the topic (Expert Political Judgment was out, but as far as I can tell was not a major influence on people in the community, though I am not totally confident of that).
Does that particular quote from Yudkowsky not strike you as slightly arrogant?
I am generally strongly against a culture of fake modesty. If I want people to make good decisions, they need to be able to believe things about themselves that might sound arrogant to others. Yes, it sounds arrogant to an external audience, but it also seems true, and whether it is true should be the dominant consideration in whether it is good to say.
I mean... it is true that Eliezer really did shape the culture in the direction of forecasting and predictions and that kind of stuff. My best guess is that without Eliezer, we wouldn't have a culture of doing those things (and like, the AI Alignment community as is probably wouldn't exist). You might disagree with me and him on this, in which case sure, update in that direction, but I don't think it's a crazy opinion to hold.
I am not sure about the question. Yeah, this is a quote from the linked post, so he wrote those sections.
Also, yeah, seems like Eliezer has had a very large effect on whether this community uses things like probability distributions, models things in a bayesian way, makes lots of bets, and pays attention to things like forecasting track records. I don't think he gets to take full credit for those norms, but my guess is he is the single individual who most gets to take credit for those norms.
Eliezer writes a bit about his early AI timeline and nanotechnology opinions here, though it sure is a somewhat obscure reference that takes a bunch of context to parse:
Luke Muehlhauser reading a previous draft of this (only sounding much more serious than this, because Luke Muehlhauser): You know, there was this certain teenaged futurist who made some of his own predictions about AI timelines -
Eliezer: I'd really rather not argue from that as a case in point. I dislike people who screw up something themselves, and then argue like nobody else could possibly be more competent than they were. I dislike even more people who change their mind about something when they turn 22, and then, for the rest of their lives, go around acting like they are now Very Mature Serious Adults who believe the thing that a Very Mature Serious Adult believes, so if you disagree with them about that thing they started believing at age 22, you must just need to wait to grow out of your extended childhood.
Luke Muehlhauser (still being paraphrased): It seems like it ought to be acknowledged somehow.
Eliezer: That's fair, yeah, I can see how someone might think it was relevant. I just dislike how it potentially creates the appearance of trying to slyly sneak in an Argument From Reckless Youth that I regard as not only invalid but also incredibly distasteful. You don't get to screw up yourself and then use that as an argument about how nobody else can do better.
Humbali: Uh, what's the actual drama being subtweeted here?
Eliezer: A certain teenaged futurist, who, for example, said in 1999, "The most realistic estimate for a seed AI transcendence is 2020; nanowar, before 2015."
Humbali: This young man must surely be possessed of some very deep character defect, which I worry will prove to be of the sort that people almost never truly outgrow except in the rarest cases. Why, he's not even putting a probability distribution over his mad soothsaying - how blatantly absurd can a person get?
Eliezer: Dear child ignorant of history, your complaint is far too anachronistic. This is 1999 we're talking about here; almost nobody is putting probability distributions on things, that element of your later subculture has not yet been introduced. Eliezer-2002 hasn't been sent a copy of "Judgment Under Uncertainty" by Emil Gilliam. Eliezer-2006 hasn't put his draft online for "Cognitive biases potentially affecting judgment of global risks". The Sequences won't start until another year after that. How would the forerunners of effective altruism in 1999 know about putting probability distributions on forecasts? I haven't told them to do that yet! We can give historical personages credit when they seem to somehow end up doing better than their surroundings would suggest; it is unreasonable to hold them to modern standards, or expect them to have finished refining those modern standards by the age of nineteen.
Though there's also a more subtle lesson you could learn, about how this young man turned out to still have a promising future ahead of him; which he retained at least in part by having a deliberate contempt for pretended dignity, allowing him to be plainly and simply wrong in a way that he noticed, without his having twisted himself up to avoid a prospect of embarrassment. Instead of, for example, his evading such plain falsification by having dignifiedly wide Very Serious probability distributions centered on the same medians produced by the same basically bad thought processes.
But that was too much of a digression, when I tried to write it up; maybe later I'll post something separately.
While it also includes some other points, I read it as a pretty straightforward "Yes, I was really wrong. I didn't know about cognitive biases, I did not know about the virtue of putting probability distributions on things, and I had not thought enough about the art of thinking well. I would not make the same mistakes today."
One quick response, since it was easy (might respond more later):
Overall, then, I do think it's fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn't incidental or a secondary consideration.
I do think takeoff speeds between 1 week and 10 years are a core premise of the classic arguments. I do think the situation looks very different if we spend 5+ years in the human domain, but I don't think there are many who believe that that is going to happen.
I don't think the distinction between 1 week and 1 year is that relevant to the core argument for AI Risk, since it seems in either case more than enough cause for likely doom, and that premise seems very likely to be true to me. I do think Eliezer believes things more on the order of 1 week than 1 year, but I don't think the basic argument structure is that different in either case (though I do agree that the 1 year opens us up to some more potential mitigating strategies).
Hmm, I think that part definitely has relevance. Clearly we would trust Eliezer less if his response to that past writing was "I just got unlucky in my prediction, I still endorse the epistemological principles that gave rise to this prediction, and would make the same prediction, given the same evidence, today".
If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.
It seems that half of these examples are from 15+ years ago, from a period whose opinions Eliezer has explicitly disavowed (and the ones that are not strike me as most likely correct, like treating coherence arguments as forceful, or expecting AI progress to be discontinuous and localized and to require relatively little compute).
Let's go example-by-example:
1. Predicting near-term extinction from nanotech
This critique strikes me as about as sensible as digging up someone's old high-school essays and critiquing their stance on communism or the criminal justice system. I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old. I am confident I can find crazier and worse opinions for every single leadership figure in Effective Altruism, if I am willing to go back to what they thought while they were in high-school. To give some character, here are some things I believed in my early high-school years:
The economy was going to collapse because the U.S. was establishing a global surveillance state
Nuclear power plants are extremely dangerous and any one of them is quite likely to explode in a given year
We could have easily automated the creation of all art, except for the existence of a vaguely defined social movement that tries to preserve the humanity of art-creation
These are dumb opinions. I am not ashamed of having had them. I was young and trying to orient in the world. I am confident other commenters can add their own opinions they had when they were in high-school. The only thing that makes it possible for someone to critique Eliezer on these opinions is that he was virtuous and wrote them down, sometimes in surprisingly well-argued ways.
If someone were to dig up an old high-school essay of mine, in-particular one that has at the top written "THIS IS NOT ENDORSED BY ME, THIS IS A DUMB OPINION", and used it to argue that I am wrong about important cause prioritization questions, I would feel deeply frustrated and confused.
For context, on Eliezer's personal website it says:
My parents were early adopters, and I’ve been online since a rather young age. You should regard anything from 2001 or earlier as having been written by a different person who also happens to be named “Eliezer Yudkowsky”. I do not share his opinions.
2. Predicting that his team had a substantial chance of building AGI before 2010
Given that this is only 2 years later, all my same comments apply. But let's also talk a bit about the object-level here.
This is the quote on which this critique is based:
Our best guess for the timescale is that our final-stage AI will reach transhumanity sometime between 2005 and 2020, probably around 2008 or 2010. As always with basic research, this is only a guess, and heavily contingent on funding levels.
This... is not a very confident prediction. This paragraph literally says "only a guess". I agree, if Eliezer said this today, I would definitely dock him some points, but this is again a freshman-aged Eliezer, and it was more than 20 years ago.
But also, I don't know, predicting AGI by 2020 from the year 2000 doesn't sound that crazy. If we didn't have a whole AI winter, if Moore's law had accelerated a bit instead of slowed down, if more talent had flowed into AI and chip-development, 2020 doesn't seem implausible to me. I think it's still on the aggressive side, given what we know now, but technological forecasting is hard, and the above sounds more like a 70% confidence interval instead of a 90% confidence interval.
3. Having high confidence that AI progress would be extremely discontinuous and localized and not require much compute
This opinion strikes me as approximately correct. I still expect highly discontinuous progress, and many other people have argued for this as well. Your analysis that the world looks more like Hanson's world described in the AI foom debate also strikes me as wrong (and e.g. Paul Christiano has also said that Hanson's predictions looked particularly bad in the FOOM debate. EDIT: I think this was worded too strong, and while Paul had some disagreements with Robin, on the particular dimension of discontinuity and competitiveness, Paul thinks Robin came away looking better than Eliezer). Indeed, I would dock Hanson many more points in that discussion (though, overall, I give both of them a ton of points, since they both recognized the importance of AI-like technologies early, and performed vastly above baseline for technological forecasting, which again, is extremely hard).
This seems unlikely to be the right place for a full argument on discontinuous progress. However, continuous takeoff is very far from consensus in the AI Alignment field, and this post seems to try to paint it as such, which seems pretty bad to me (especially if it's used in a list with two clearly wrong things, without disclaiming it as such).
4. Treating early AI risk arguments as close to decisive
My point, here, is not necessarily that Yudkowsky was wrong, but rather that he held a much higher credence in existential risk from AI than his arguments justified at the time. The arguments had pretty crucial gaps that still needed to be resolved, but, I believe, his public writing tended to suggest that these arguments were tight and sufficient to justify very high credences in doom.
I think the arguments are pretty tight and sufficient to establish the basic risk argument. I found your critique relatively uncompelling. In particular, I think you are misrepresenting fast takeoff as a premise of the original arguments. I can't currently remember any writing that said a fast takeoff was a necessary component of the AI risk arguments, or that the distinction between "AI vastly exceeds human intelligence in 1 week vs. 4 years" was crucial to the overall argument; as far as I can tell, that is the range into which most current opinions in the AI Alignment field fall (and importantly, I know of almost no one who believes that it could take 20+ years for AI to go from mildly subhuman to vastly superhuman, which does feel like it could maybe change the playing field, but also seems to be a very rarely held opinion).
Indeed, I think Eliezer was probably underconfident in doom from AI, since I currently assign >50% probability to AI Doom, as do many other people in the AI Alignment field.
Coherence arguments do indeed strike me as one of the central valid arguments in favor of AI Risk. I think there was a common misunderstanding that did confuse some people, but that misunderstanding was not argued for by Eliezer or other people at MIRI, as far as I can tell (and I've looked into this for 5+ hours as part of discussions with Rohin and Richard).
The central core of coherence arguments, which is based in arguments about competitiveness and economic efficiency, strikes me as very strong, robustly argued for, and one of the main reasons why AI will be dangerous. The von Neumann–Morgenstern theorem does play a role here, though it's definitely not sufficient to establish a strong case, and Rohin and Richard have successfully argued against that; but I don't think Eliezer has historically argued that the von Neumann–Morgenstern theorem is sufficient to establish an AI-alignment-relevant argument on its own (though Dutch-book-style arguments are very suggestive of the real structure of the argument).
Given my disagreements with the above, I think doing so would be a mistake. But even without that, let's look at the merits of this critique.
For the two "clear cut" examples, Eliezer has posted dozens of times on the internet that he has disendorsed his views from before 2002. This is present on his personal website, the relevant articles are no longer prominently linked anywhere, and Eliezer has openly and straightforwardly acknowledged that his predictions and beliefs from the relevant period were wrong.
For the disputed examples, Eliezer still believes all of these arguments (as do I), so it would be disingenuous for Eliezer to "acknowledge his mixed track record" in this domain. You can either argue that he is wrong, or you can argue that he hasn't acknowledged that he has changed his mind and was previously wrong, but you can't both argue that Eliezer is currently wrong in his beliefs, and accuse him of not telling others that he is wrong. I want people to say things they believe. And for the only two cases where you have established that Eliezer has changed his mind, he has extensively acknowledged his track record.
Some comments on the overall post:
I really dislike this post. I think it provides very little argument, and engages in extremely extensive cherry-picking in a way that does not produce a symmetric credit-allocation (i.e. most people who are likely to update downwards on Yudkowsky on the basis of this post, seem to me to be generically too trusting, and I am confident I can write a more compelling post about any other central figure in Effective Altruism that would likely cause you to update downwards even more).
I think a good and useful framing on this post could have been "here are 3 points where I disagree with Eliezer on AI Risk" (I don't think it would have been useful under almost any circumstance to bring up the arguments from the year 2000). And then to primarily spend your time arguing about the concrete object-level. Not to start a post that is trying to say that Eliezer is "overconfident in his beliefs about AI" and "miscalibrated", and then to justify that by cherry-picking two examples from when Eliezer was barely no longer a teenager, and three arguments on which there is broad disagreement within the AI Alignment field.
I also dislike calling this post "On Deference and Yudkowsky's AI Risk Estimates", as if this post was trying to be an unbiased analysis of how much to defer to Eliezer, while you just list negative examples. I think this post is better named "against Yudkowsky on AI Risk estimates". Or "against Yudkowsky's track record in AI Risk Estimates". Which would have made it clear that you are selectively giving evidence for one side, and more clearly signposted that if someone was trying to evaluate Eliezer's track record, this post will only be a highly incomplete starting point.
I have many more thoughts, but I think I've written enough for now. I think I am somewhat unlikely to engage with replies in much depth, because writing this comment has already taken up a lot of my time, and I expect given the framing of the post, discussion on the post to be unnecessarily conflicty and hard to navigate.
I have some hope that splitting out votes into two dimensions (approval and agreement) might help with situations like this. At least it seems to have helped with some recent AI-adjacent threads on LW that were also pretty divisive.
I like the content changes. I think the design seems worse than the previous one. In-particular the information density has gone down a lot, which feels in-tension with OpenPhil's target audience. See e.g. this screenshot:
This screenshot feels to me almost like a screenshot from a tablet, but it's taken from my pretty zoomed out and large laptop screen. The site starts feeling a lot less cluttered and navigable at 75% zoom to me:
Some other comments:
I don't think I like the new logo. I am not exactly sure why, but my first reaction to seeing it was something like "this is a logo for a Christian church". I think the cross-shape in the middle, and just the general association of "radiating light" with "christianity" seems like a candidate. I preferred the previous logo here. The logo is also just generically non-distinct (i.e. if you gave me just its outline on a piece of clothing or paper, there is little chance I would be able to recognize it as the OpenPhil logo). It's also almost completely invisible and indistinguishable in my tab-bar, which is another litmus test I use for logos.
Overall the site feels like it's gotten slightly harder to navigate. There is lots of padding, whitespace doesn't seem to be used super effectively, and text-contrast issues are more present than in the old site (for example, on the frontpage, the navigation at the top has pretty bad contrast against the first image that's displayed in the carousel).
A lot of pages feel kind of unpolished. The team page is just a giant list of names in a layout that feels very overwhelming and not well-optimized for this number of items in the list.
The site is extremely slow, given that it's just a static site that presumably doesn't have a lot of reason for dynamic content. When I click on a menu item, I have to wait (on a fast internet connection, on an M1 laptop) almost two full seconds before I see a page transition, and then another 500ms for the transition to complete. Generally, clicking around on the website feels very sluggish.
I feel like a lot of content hierarchy isn't very clearly established. The grants database, which is the page I use most frequently, now has a white header with lots of white UI elements that blend into each other as you scroll, without any drop-shadow, that make me very confused where different containers are supposed to be located (I like playing with whitespace in order to establish hierarchy, but I think the current design fails at that)
Bullet lists in-particular seem to have kind of bad formatting:
The link styling is (IMO) both overwhelming and not super intuitive, with the primary way links are made distinct being a black underline. I think it's pretty standard across the internet for links to be colored, and one should stick to that style unless there is a good reason to deviate. The links could potentially look better if the underline were a bit further below the text, and dashed or something, but probably the links should just be colored (I've tried making underline-only links work on LW, and gave up after a while)
The tag section has kind of bad text contrast, and feels a bit hard to read to me (grey text on grey background)
Why do all three of these items have like 200px of whitespace at the bottom? I think it looks bad, and contributes to the overall low information-density on the site:
It's worse here:
These menus have an inconsistent click area. E.g. the whitespace to the right of the text "TEAM" should be clickable, but it's not, and this has caused me to click into nothingness many times while playing around with the site:
I like the site more, and it seems to more accurately convey what OpenPhil is about. I could probably leave more comments on content, but I figured I have a particular comparative advantage about commenting on webdesign.
Enough people look at the All-Posts page that this is rarely an issue, at least on LessWrong where I've looked at the analytics for this. Indeed, many of the most active voters prefer to use the all-posts page, and a post having negative karma tends to actually attract a bit more attention than a post having low karma.
My current model is that seeing predictable updating is still Bayesian evidence of people following a subpar algorithm, though it's definitely not conclusive evidence.
To formalize this, assume you have two hypotheses about how Metaculus users operate:
H1: They perform correct bayesian updates
H2: They update sluggishly to new evidence
First, let's discuss priors between these two hypotheses. IIRC we have a decent amount of evidence that sluggish updating is a pretty common occurrence in forecasting contexts, so raising sluggish updating in this context doesn't seem unreasonable to me. Also, anecdotally, I find it hard to avoid sluggish updating, even if I try to pay attention to it, and would expect that it's a common epistemic mistake.
Now, predictable updating is clearly more likely under H2 than under H1. The exact odds ratio of course depends on the questions being asked, but my guess is that in the context of Metaculus, something in the range of 2:1 for the data observed seems kind of reasonable to me (though this is really fully made up). This means the claim that, if people do have a problem of sluggish updating, you will frequently see them predictably update in your direction, seems correct and accurate.
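To make the arithmetic concrete, here is a minimal sketch of the odds-ratio update (the 2:1 likelihood ratio is the made-up figure from above, not measured data, and the function name is mine):

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float, n_observations: int) -> float:
    """Odds of H2 (sluggish updating) over H1 (correct Bayesian updating)
    after n independent observations of predictable updating, each carrying
    the given likelihood ratio in favor of H2."""
    return prior_odds * likelihood_ratio ** n_observations

# Starting from even prior odds, three observations of predictable
# updating at an assumed 2:1 likelihood ratio each:
odds = posterior_odds(1.0, 2.0, 3)    # 1 * 2^3 = 8.0
probability_h2 = odds / (1.0 + odds)  # 8/9, roughly 0.89
```

Under these fully made-up numbers, even a modest per-observation likelihood ratio compounds quickly, which is why repeated predictable updating can be meaningful evidence without any single instance being damning.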
I do think Eliezer is just wrong when he says that a proper Bayesian would have beliefs that look like a "random epistemic walk", unless that "epistemic" modifier is doing a lot of unintuitive work. If I am not super confused, the only property that sequences of beliefs need to satisfy, based on conservation of expected evidence, is the martingale property, which describes a much broader class of processes than random walks.
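To spell out that distinction (my formalization, not a quote from anyone): conservation of expected evidence only requires that the sequence of credences $p_t$ be a martingale,

```latex
\mathbb{E}\!\left[\, p_{t+1} \mid p_1, \dots, p_t \,\right] = p_t
```

A symmetric random walk satisfies this, but so do many other processes, e.g. ones whose step sizes shrink as evidence accumulates, or ones that rarely move but occasionally jump by large amounts.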
I am confused why the title of this post is: "The biggest risk of free-spending EA is not optics or epistemics, but grift" (emphasis added). As Zvi talks about extensively in his moral mazes sequence, the biggest problems with moral mazes and grifters is that many of their incentives actively point away from truth-seeking behavior and towards trying to create confusing environments in which it is hard to tell who is doing real work and who is not. If it was just the case that a population of 50% grifters and 50% non-grifters would be half as efficient as a population of 0% grifters and 100% non-grifters, that wouldn't be that much of an issue. The problem is that a population of 50% grifters and 50% non-grifters probably has approximately zero ability to get anything done, or react to crises, and practically everyone within that group (including the non-grifters) will have terrible models of the world.
I don't think it's that bad if we end up wasting a lot of resources, compared to what I think is the more likely outcome, which is that the presence of grifters will deteriorate our ability to get accurate information about the world, and build accurate shared models of the world. The key problem is epistemics, and I feel like your post makes that point pretty well, but then it has a title that actively contradicts that point, which feels confusing to me.
(1) Calling yourself “longtermist” bakes empirical or refutable claims into an identity, making it harder to course-correct if you later find out you’re wrong.
Isn't this also true of "Effective Altruist"? And I feel like from my epistemic vantage point, "longtermist" bakes in many fewer assumptions than "Effective Altruist". I feel like there are just a lot of convergent reasons to care about the future, and the case for it seems more robust to me than the case for "you have to try to do the most good", and a lot of the hidden assumptions in EA.
I think a position of "yeah, agree, I also think people shouldn't call themselves EAs or rationalists or etc." is pretty reasonable and I think quite defensible, but I feel a bit confused what your actual stance here is given the things you write in this post.
Almost all nonprofit grants require everyone to take very low salaries. There are very few well-paying nonprofit projects. My guess is EA is the most widely-known community that might pay high salaries for relatively illegible nonprofit projects (and maybe the only widely-known funder/community that pays high salaries for nonprofit projects in-general).
Reading this, I guess I'll just post the second half of this memo that I wrote here as well, since it has some additional points that seem valuable to the discussion:
When I play forward the future, I can imagine a few different outcomes, assuming that my basic hunches about the dynamics here are correct at all:
I think it would not surprise me that much if many of us do fall prey to the temptation to use the wealth and resources around us for personal gain, or as a tool towards building our own empire, or come to equate "big" with "good". I think the world's smartest people will generally pick up on us not really aiming for the common good, but I do think we have a lot of trust to spend down, and could potentially keep this up for a few years. I expect eventually this will cause the decline of our reputation and ability to really attract resources and talent, and hopefully something new and good will form in our ashes before the story of humanity ends.
But I think in many, possibly most, of the worlds where we start spending resources aggressively, whether for personal gain, or because we do really have a bold vision for how to change the future, the relationships of the central benefactors to the community will change. I think it's easy to forget that for most of us, the reputation and wealth of the community is ultimately borrowed, and when Dustin, Cari, Sam, Jaan, Eliezer, or Nick Bostrom see how their reputation or resources get used, they will already be on high alert for people trying to take their name and their resources, and be ready to take them away when it seems like they are no longer obviously used for public benefit. I think in many of those worlds we will be forced to run projects in a legible way; or we will choose to run them illegibly, and be surprised by how few of the "pledged" resources were ultimately available for them.
And of course in many other worlds, we learn to handle the pressures of an ecosystem where trust is harder to come by, and we scale, and find new ways of building trust, and take advantage of the resources at our fingertips.
Or maybe we split up into different factions and groups, and let many of the resources that we could reach go to waste, as they ultimately get used by people who don't seem very aligned to us, but some of us think this loss is worth it to maintain an environment where we can think more freely and with less pressure.
Of course, all of this is likely to be far too detailed to be an accurate prediction of what will happen. I expect reality will successfully surprise me, and I am not at all confident I am reading the dynamics of the situation correctly. But the above is where my current thinking is at, and is the closest to a single expectation I can form, at least when trying to forecast what will happen to people currently in EA leadership.
To also take a bit more of an object-level stance: I currently, very tentatively, don't think this shift is worth it. I don't actually have any plans that seem hopeful or exciting to me that really scale with a lot more money or a lot more resources, and I would really prefer to spend more time without needing to worry about full-time people trying to scheme how to get specifically me to like them.
However, I do see the hope and potential in actually going out and spending the money and reputation we have to maybe get much larger fractions of the world's talent to dedicate themselves to ensuring a flourishing future and preventing humanity's extinction. I have inklings and plans that could maybe scale. But I am worried that I've already started trying to primarily answer the question "but what plans can meaningfully absorb all this money?" instead of the question of "but what plans actually have the highest chance of success?", and that this substitution has made me worse, not better, at actually solving the problem.
I think historically we've lacked important forms of ambition. And I am excited about us actually thinking big. But I currently don't know how to do it well. Hopefully this memo will make the conversations about this better, and maybe will help us orient towards this situation more healthily.
I feel like this post mostly doesn't talk about what feels to me like the most substantial downside of trying to scale up spending in EA, and increased availability of funding.
I think the biggest risk of the increased availability of funding, and general increase in scale, is that it will create a culture where people will be incentivized to act more deceptively towards others and that it will attract many people who will be much more open to deceptive action in order to take resources we currently have.
Here are some paragraphs from an internal memo I wrote a while ago that tried to capture this:
I think it was Marc Andreessen who first hypothesized that startups usually go through two very different phases:
Pre Product-market fit: At this stage, you have some inkling of an idea, or some broad domain that seems promising, but you don't yet really have anything that solves a really crucial problem. This period is characterized by small teams working on their inside-view, and a shared, tentative, malleable vision that is often hard to explain to outsiders.
Post Product-market fit: At some point you find a product that works for people. The transition here can take a while, but by the end of it, you have customers and users banging on your door relentlessly to get more of what you have. This is the time of scaling. You don't need to hold a tentative vision anymore, and your value proposition is clear to both you and your customers. Now is the time to hire people and scale up and make sure that you don't let the product-market fit you've discovered go to waste.
I think it was Paul Graham or someone else close to YC (or maybe Ray Dalio) who said something like the following (NOT A QUOTE, since I currently can't find the direct source):
> The early stages of an organization are characterized by building trust. If your company is successful, and reaches product-market fit, these early founders and employees usually go on to lead whole departments. Use these early years to build trust and stay in sync, because when you are a thousand-person company, you won't have the time for long 10-hour conversations when you hang out in the evening.
> As you scale, you spend down that trust that you built in the early days. As you succeed, it's hard to know who is here because they really believe in your vision, and who just wants to make sure they get a big enough cut of the pie. That early trust is what keeps you agile and capable, and frequently as we see founders leave an organization, and with that those crucial trust relationships, we see the organization ossify, internal tensions increase, and the ability to effectively respond to crises and changing environments get worse.
It's hard to say how well this model actually applies to startups or young organizations (it matches some of my observations, though definitely far from perfectly), and even more dubious how well it applies to systems like our community, but my current model is that it captures something pretty important.
I think whether we want it or not, we are now likely in the post-product-market-fit part of the lifecycle of our community, at least when it comes to building trust relationships and onboarding new people. I think we have become high-profile enough, and have enough visible resources (especially with FTX's latest funding announcements), and have gotten involved in enough high-stakes politics, that if someone shows up next year at EA Global, you can no longer confidently know whether they are there because they have a deeply shared vision of the future with you, or because they want to get a big share of the pie that seems to be up for the taking around here.
I think in some sense that is good. When I see all the talk about megaprojects and increasing people's salaries and government interventions, I feel excited and hopeful that maybe if we play our cards right, we could actually bring any measurable fraction of humanity's ingenuity and energy to bear on preventing humanity's extinction and steering us towards a flourishing future, and most of those people of course will be more motivated by their own self-interest than their altruistic motivation.
But I am also afraid that with all of these resources around, we are transforming our ecosystem into a market for lemons. That we will see a rush of ever greater numbers of people into our community, far beyond our ability to culturally onboard them, and that nuance and complexity will have to get left at the wayside in order to successfully maintain any sense of order and coherence.
I think it is not implausible that for a substantial fraction of the leadership of EA, within 5 years, there will be someone in the world whose full-time job and top-priority it is to figure out how to write a proposal, or give you a pitch at a party, or write a blogpost, or strike up a conversation, that will cause you to give them money, or power, or status. For many months, they will sit down many days a week and ask themselves the question "how can I write this grant proposal in a way that person X will approve of" or "how can I impress these people at organization Y so that I can get a job there?", and they will write long Google Docs to their colleagues about their models and theories of you, and spend dozens of hours thinking specifically about how to get you to do what they want, while drawing up flowcharts that will include your name, your preferences, and your interests.
I think almost every publicly visible billionaire has whole ecosystems spring up around them that try to do this. I know some of the details here for Peter Thiel, and the "Thielosphere", which seems to have a lot of these dynamics. Almost any academic at a big lab will openly tell you that among the most crucial pieces of knowledge that any new student learns when they join, is how to write grant proposals that actually get accepted. When I ask academics in competitive fields about the content of their lunch conversations in their labs, the fraction of their cognition and conversations that goes specifically to "how do I impress tenure review committees and grant committees" and "how do I network myself into an academic position that allows me to do what I want" ranges from 25% to 75% (with the median around 50%).
I think there will still be real opportunities to build new and flourishing trust relationships, and I don't think that it will be impossible for us to really come to trust someone who joins our efforts after we have become 'cool,' but I do think it will be harder. I also think we should cherish and value the trust relationships we do have between the people who got involved with things earlier, because I do think that lack of doubt about why someone is here is a really valuable resource, and one that I expect is more and more likely to be a bottleneck in the coming years.
Yeah, the Charity Entrepreneurship grant is what I was talking about. But yeah, classifying that one as meta isn't crazy to me, though I think I would classify it more as Global Poverty (since I don't think it involved any general EA community infrastructure).
Oh, I get it now. That seems like a misleading summary, given that that program was primarily aimed at EA community infrastructure (which received 66% of the funding), the statistic cited here is only for a single grants round, and one of the five concrete examples listed seems to be a relatively big global poverty grant.
I still expect there to be some skew here, but I would take bets that the actual numbers for EA Grants look substantially less skewed than 1:16.
The EA Grants program granted ~16x more money to longtermist projects as global poverty and animal welfare projects combined
This seems wrong to me. The LTFF and the EAIF don't get 16x the money that the Animal Welfare and Global Health and Development funds get. Maybe you meant to say that the EAIF has granted 16x more money to longtermist projects?
I don't think this is true for the safety teams at Deepmind, but think it was true for some of the safety team at OpenAI, though I don't think all of it (I don't know what the current safety team at OpenAI is like, since most of it left to Anthropic).
I agree with this. Please don't work in AI capabilities research, and in particular don't work in labs directly trying to build AGI (e.g. OpenAI or Deepmind). There are few jobs that cause as much harm, and historically the EA community has already caused great harm here. (There are some arguments that people can make the processes at those organizations safer, but I've only heard negative things about people working in jobs that are non-safety related who tried to do this, and I don't currently think you will have much success changing organizations like that from a ground-level engineering role)
I think I disagree with this. School is a very specific, highly-structured environment. Few people actually have the choice between "staying in school or working at Org X". I think the usual choice is between "staying in school" and "figure out what to do with your life in a self-directed way", which I think is a really meaningful choice. It probably involves trialing at some organizations. It probably also involves spending a bunch of time reading.
It indeed is kind of unclear what it means, because the person asking the question probably doesn't have much experience being self-directed. If people's diagram looks like the last one you draw, I expect them to make worse decisions than if it looks like the second-to-last one you draw, because the most likely outcome is that they don't work at either ORG1, ORG2, or ORG3, but instead do something quite different.
I think it's important for people to consider plans that look like "change the basic circumstances of my life, then reorient". This post feels like it pushes people to only consider options they already have planned out, which (in my opinion) gets rid of most of the benefit of dropping out of school (which is usually the first time in someone's life that they are actually capable of fully directing their attention to what they want to do with their life).
Just click the "Request Feedback" button when you are writing a new post, and a chat window will pop up asking you what kind of feedback you want, and within 24 hours someone will have left comments and suggestions on your post.
Some abstractions that feel like they do real work on AI Alignment (compared to CIRL stuff):
Intent alignment vs. impact alignment
Natural abstraction hypothesis
Coherent Extrapolated Volition
None of these are paradigms, but all of them feel like they substantially reduce the problem, in a way that doesn't feel true for CIRL. Though based on your last paragraph, it is possible I have a skewed perception of actual CIRL stuff, so it's plausible we are just talking about different things.
FWIW, I don't think the problem with assistance games is that it assumes that ML is not going to get to AGI. The issues seem much deeper than that (mostly of the "grain of truth" sort, and from the fact that in CIRL-like formulations, the actual update-rule for how to update your beliefs about the correct value function is where 99% of the problem lies, and the rest of the decomposition doesn't really seem to me to reduce the problem very much, but instead just shunts it into a tiny box that then seems to get ignored, as far as I can tell).
I've considered this as a feature for a while. I haven't really made up my mind on it, some considerations:
I am worried about a Streisand effect, where saying something mildly controversial in public is actually less bad than trying at all to keep it private, since the second makes it sound a lot more juicy (compare "I found the following by infiltrating the Effective Altruists' private forum and here is a screenshot" vs. "I found the following on the Effective Altruism subreddit")
I do think it could be quite valuable for stuff that isn't really controversial, but that in some sense... is an advanced topic? But I am not sure whether a logged-in status is actually the right barrier here. Like, many topics I would like to be able to discuss, but I would just kind of prefer that it's a bit inconvenient to discuss, and that it wouldn't cloud the impression that newcomers have when they first show up to the forum.
I am kind of worried that lots of people would then make lots of posts for logged-in users, while forgetting a large group of readers that is actively following a lot of content on the EA Forum and is pretty plugged-into stuff, but isn't usually logged-in, and I haven't found a good way to make that tradeoff salient to authors, and currently think giving people the option would cause a lot of people to make a reflectively non-endorsed mistake.
This was not intentional on the part of GW or saturn2, it's simply that GW has always cached the user ID & name (because why wouldn't it) and whoever implemented the 'anonymous' feature apparently didn't think through the user ID part of it.
I did think of it! But having documents without ownership sure requires a substantial rewrite of a lot of LW code in a way that didn't seem worth the effort. And any hope for real anonymity for historical comments was already lost with lots of people scraping the site. If we ever had any official "post anonymously" features, I would definitely care to fix these issues, but this is a deleted account, and posting from a deleted account is itself more like a bug and not an officially supported feature (we allow deleted accounts to still login so they can recover any content from things like PMs, and I guess we left open the ability to leave comments).
I mean, it seems like given the potential upside of the project, the downside from animal testing would have to be quite large to be worth avoiding (or the cost of avoiding it very low). The comment also implies a consensus about EA that seems straightforwardly wrong, i.e. that we have strong rules to avoid harm for other beings. Indeed, I feel like a very substantial part of the EA mindset is to be capable of considering tradeoffs that involve hurting some beings and causing some harm, if the benefits outweigh the costs.
eating veg sits somewhere between "avoid intercontinental flights" and "donate to effective charities" in terms of expected impact, and I'm not sure where to draw the line between "altruistic actions that seem way too costly and should be discouraged" and "altruistic actions that seem a reasonable early step in one's EA journey and should be encouraged"
I am very confused by this statement. I feel like we've generally universally agreed that we don't really encourage people as a community to take altruistic actions if we don't really think it competes with the best alternatives that person has. Almost all altruistic interventions lie between "avoid intercontinental flights" and "donate to effective charities", and indeed, we encourage ~0% of that range for participants of the EA community. So just based on that observation, our prior should clearly also tend towards being unopinionated on this topic.
On this principle, why would the answer here be different than our answer on whether you should locate your company in a place with higher-tax burden because it sets a good example of cooperating on global governance? Or whether to buy products that are the result of exploitative working practices? Or buy from companies with bad security practices? Or volunteer at your local homeless shelter? All of these are in effectiveness between "avoid intercontinental flights" and "donate to effective charities", as far as I can tell (unless you think that "avoiding intercontinental flights" is somehow a much better value proposition, when it seems like one of the least cost-effective interventions I can think of, given the extremely high cost of avoiding intercontinental flights).
It seems kinda wild if this is the dominating factor. The number of meals provided through the year to a veg*n by other people is so, so much smaller than the number of meals that the veg*n provides for themselves. Both the veg*n and the large event organizers get "economies of scale".
Huh, I am surprised you say this. I have long provided daily lunches for my employees, and also provide lunches and dinners for anyone in the Lightcone Offices, so my current guess is that for a good chunk of the professional Berkeley community, ~half of meals are provided by other people. I do think most of those meals are easier and less one-off, which does decrease the cost, and enables more economies of scale, but they are still quite difficult to get right.
I do also think personal search costs are quite high for being vegan, and overall agree that they should be higher than the cost for event organizers. However, given that we are already reasoning in a more deontological framework, I do think it makes a big difference whether you are imposing a cost on other people, especially if we are implicitly enforcing a norm that you should eat vegan in EA contexts (which we are currently doing at EA events).
Indeed, many people have told me that they don't like attending EA events because they expect the food there not to meet their needs, since they aren't vegan, and this seems pretty bad to me from a community-building perspective. (I myself have started bringing snacks and backup Soylent bottles to EA events, because EA events have consistently enough failed to adequately provide food for me, and left me hungry, that I gave up.) So there is also an additional cost that I wouldn't quite describe as a search cost, which comes from the frequent request to only serve vegan or vegetarian food, which makes finding food for people who aren't used to eating vegan/vegetarian much harder.
I do overall want to say that the rise of impossible-burgers has made this problem a bunch easier, and is one of the reasons why I am pretty excited about meat-alternatives. It does indeed feel that I can now kind of get an OK vegetarian meal that is acceptable to almost everyone by just replacing 30% of the food with impossible-meat patties, which tend to have enough protein and fit the expected flavor profile of non-vegetarians much better.
Yeah, I think this is mostly true, though my experiences have been relatively different in organizing things for the rationality community, which is much closer to 30% vegan/vegetarian, instead of the 90% vegan/vegetarian I am used to in EA spaces, which is at least some evidence that this is still a cost on the margin.
My guess is de-facto that the vegetarians and vegans are still often disappointed with the food options when I organize things for the rationality community, even when I do my best, but 30% having that experience is still very different than 90% having that experience. I do have people reliably complain to me that the food is bad and doesn't have enough protein (and the caterers seem to never add enough protein, no matter how many times I ask). I think there is also some difference in expectations, where vegetarians and vegans expect an OK to mediocre option at an event that's not mostly vegetarians and vegans, but do seem to expect something really great if it's all vegetarian and vegan, though de-facto almost no caterer I found is actually much better when they just provide vegetarian/vegan food.
An important consideration that seems missing here is the increased search cost for other people trying to provide food for you, and the increased complexity of trying to coordinate on food for a large group of people, given a much more limited dietary range.
In my experience, organizing events for EAs is easily 25% more difficult than organizing them for any other group I've organized events for, just due to the wide range of dietary preferences represented, and the problems that reliably show up if you try to somehow make vegan catering work. These problems have probably accounted for around 30% of the stress I've experienced in organizing events over the past 7 years, and I've consistently seen other organizers spend a substantial chunk of their organizing time trying to solve catering problems that I don't think are really a problem for most other communities.
In my personal life, I've also found it quite difficult to spontaneously decide to eat out while still reliably satisfying dietary preferences, especially if the diet is further restricted by some food allergies, so the amount of eating out I've done with EAs is also a good bit lower because of the high prevalence of various dietary preferences.
These tradeoffs might be worth it (though I am not currently particularly sold), but it's a tradeoff that I've rarely seen acknowledged, and that has made me sad (as someone who has spent a substantial chunk of the last 8 years of their life organizing EA events).
Yeah, my model is that this ontology already seems very well within the type of consideration that I expect to be covered by existing funders, and that the current frontier of undiscovered considerations looks substantially more messier or counterintuitive than this.
I think this doesn't really work super well, primarily because knowledge about funding gaps is pretty anti-inductive. While there is some institutional momentum, my current read is that if one of the large funders notices a funding gap in an ontology as simple as the one proposed here, they start moving into the space and will try to close it. E.g. while OpenPhil used to primarily only fund larger organizations, this has now changed, and OpenPhil is making many more grants to individuals and small institutions.
I think there are many funding gaps, but I expect the ontology outlined in this post to not hold up very well at actually describing them. If you can describe funding gaps as simply as this, then I expect this to change quite quickly after someone notices.
As a random anecdote, I followed something like the reasoning in the post for 3 years between 2013 and 2016, and slept 7.5 hours a day, followed by a half-hour nap in the midday. Without any alarms and with blackout curtains I sleep around 9 hours a day.
I mostly didn't feel very sleep deprived from sleeping the 7.5 + 0.5 hours, but after I switched towards a 9-hour sleep schedule, I noticed a very large change in my emotional variability, and in-particular noticed that I was feeling substantially less anxious, and substantially less depressed, and also substantially less hypomanic at different times. I still notice that I feel much higher emotional variability if I don't sleep full 9 hours.
I do think the change to my sleep schedule also came with a lot of other changes in my life, so this is far from even a single clear anecdote, but it did update me that at least for myself, I am quite hesitant to sleep less than 9 hours.