Personal feelings: I thought Karnofsky was one of the good ones! He has opinions on AI safety, and I agree with most of them! Nooooooooooo!
Object-level: My mental model of the rationality community (and, thus, some of EA) is "lots of us are mentally weird people, which helps us do unusually good things like increasing our rationality, comprehending big problems, etc., but which also has predictable downsides."
Given this, I'm pessimistic that, in our current setup, we're able to attract the absolute "best and brightest and also most ethical and also most epistemically rigorous people" that exist on Earth.
Ignoring for a moment that it's just hard to find people with all of those qualities combined... what about finding people with actual-top-percentile any of those things?
The most "ethical" (like professional-ethics, personal integrity, not "actually creates the most good consequences) people are probably doing some cached thing like "non-corrupt official" or "religious leader" or "activist".
The most "bright" (like raw intelligence/cleverness/working-memory) people are probably doing some typical thing like "quantum physicist" or "galaxy-brained mathematician".
The most "epistemically rigorous" people are writing blog posts, which may or may not even make enough money for them to do that full-time. If they're not already part of the broader "community" (including forecasters and I guess some real-money traders), they might be an analyst tucked away in government or academia.
A broader problem might be something like: promote EA --> some people join it --> the other competent people think "ah, EA has all those weird problems handled, so I can keep doing my normal job" --> EA doesn't get the best and brightest.
I think I agree with both of these, actually: EA needs unusually good leaders, possibly better than we can even expect to attract.
(Compare EA with, say, being an elite businessperson or politician or something.)
Ah, thank you!
paraphrased: "morality is about the interactions that we have with each other, not about our effects on future people, because future people don't even exist!"
If that's really the core of what she said about that... yeah maybe I won't watch this video. (She does good subtitles for her videos, though, so I am more likely to download and read those!)
Agree, I don't see many "top-ranking" or "core" EAs writing exhaustive critiques (posts, not just comments!) of these critiques. (OK, they would likely complain that they have better things to do with their time, and they often do, but I have trouble recalling any aside from (debatably) some of the responses to AGI Ruins / Death With Dignity.)
Agreed. When people require literally everything to be written in the same place by the same author/small-group, it disincentivizes writing potentially important posts.
Strong agree with most of these points; the OP seems to not... engage on the object-level of some of its changes. Like, not proportionally to how big the change is or how good the authors think it is or anything?
Reminder for many people in this thread:
"Having a small clique of young white STEM grads creates tons of obvious blindspots and groupthink in EA, which is bad."
is not the same belief as
"The STEM/techie/quantitative/utilitarian/Pareto's-rule/Bayesian/"cold" cluster-of-approaches to EA, is bad."
You can believe both. You can believe neither. You can believe just the first one. You can believe just the second one. They're not the same belief.
I think the first one is probably true, but the second one is probably false.
Thinking the first belief is true is nowhere near strong enough evidence to think the second one is also true.
Who should do the audit? Here are some criteria I think could help:
- Orgs that don't get a high/any % of their funding from the individuals/groups under scrutiny.
- People who've been longtime community members with some level of good reputation in it.
- Orgs that do kinda "meta" things about the EA movement, like CEA or Nonlinear (disclosure: I used to volunteer for Nonlinear).
In society, a fundamental problem is the tradeoff between effort spent and information gained.
I could imagine a cursory "audit" that would catch blatant badness, while anything more subtle could take (for instance) experienced lawyers, forensic accountants, and other experts.
Not to mention, the access they'd need to these figures' businesses, organizations, relationships, communications... potentially anything and everything.
Most people wouldn't give such access, but I think you're right that with the unusual situation (1-2 people being a key nexus for funding/influence, inside a movement that tries to be more self-correcting than most), it makes more sense here.
Good point, I think I've heard this perspective before but forgot.
My thoughts on "both": in that case, I wonder if it's more like a merge, or more like a Jekyll/Hyde thing.
Agree but leaning more towards Option B. Wish this was discussed more explicitly, since it's the question that determines whether this was "naive utilitarian went too far" (bad) or "sociopath using EA to reputation-launder" (bad). Since EA as a movement is soul-searching right now, it's pretty important to figure out exactly which thing happened, as that informs us what to change about EA to prevent this from happening again.
A comment I initially posted elsewhere in private: I have to wonder how much was reputation-laundering from the beginning... maybe it was just reputational-laundering among his friend group?
Like, if I were a competitive sociopath who landed in an EA social group for auxiliary reasons, but wanted to launder my reputation with them, that wouldn't be as easy as laundering my reputation with the general public.
Think: putting your name on a university building VS pretending to be a semi-competent longtermist.
For anyone wondering about Sam's mental state: don't forget the somewhat high chance that he was intoxicated in some way during the interview.
This comment, and the associated discussion in the linked post, have inspired me to write a post on this specific subtopic.
Sad there isn't much engagement here, I'll try my best.
This seems... not very backed-up with evidence. As in, I haven't seen William MacAskill, Nick Bostrom, Eliezer Yudkowsky, Nick Beckstead, Toby Ord, or... well, any longtermist "thought leaders" advocate this.
I think the paper in the linked tweet is making the point that trade-offs matter, not that we need to go out and make the largest and most uncomfortable ones (which would be abhorrent and wildly unnecessary).
Consider: Through historical privilege, atrocities, etc., people in richer countries tend to be more powerful. Therefore, in the long term, we'd generally expect them to have more influence on the future. Therefore, it makes sense to get them to use their power to make the future good'n'equitable, which may involve saving their lives.
"Endangers" here suggests "callously disregard". "Willfully endangers" suggests "kill or otherwise throw under the bus". Neither is actually being suggested as far as I can tell.
To put it another way: Abraham Lincoln was more powerful than any single person held in slavery in the Confederate South. If an abolitionist Secret Service officer could only save Lincoln or a slave, it makes more sense to save Lincoln. BUT that does not mean they want to kill the slave.
The closest analogy in longtermism would be something like "they're intentionally ignoring the poor brown people who will be killed by runaway climate change, in favor of sci-fi nerdy doom scenarios". But this ignores a few key factors:
If the sci-fi doom scenario is a real threat, the trade-off makes sense. That does not make it easy or comfortable, and it really shouldn't.
If a longtermist focuses their career on a "sci-fi" cause, that does not mean they therefore think all of society's resources/focus should go into that cause. Society, as a whole, has lots of resources.
Similarly, real existing longtermist billionaire donors don't even focus all their (or "their") wealth on one cause area.
Longtermists are still concerned about climate change, [even in the scenarios where it doesn't cause human extinction by itself](https://80000hours.org/problem-profiles/climate-change/#indirect-risks)! And even if there was no danger of climate change causing or aiding extinction... it's still bad! Longtermists still agree that it's bad! Longtermists care about the future, even if that future is (relatively) "near-term".
Without using... quantitative reasoning.
Up to now I've used terms like "more" and "trade-off", which I hope have been qualitative enough. But also... quantitative reasoning is good, actually. As in, if you don't use it, you are liable to make decisions that save fewer lives and allow more suffering to happen.
Again, just because a trade-off exists (or might exist), does not mean it's the "best deal" to take. Like, sure, somebody could construct a thought experiment where they genocide poor brown people, and then ??? happens, and then the long-term future is saved. But real longtermist organizations don't appear to be doing that. They mostly do (theoretical! cheap!) research, publishing ideas, and trying to persuade people of the ideas (the horror!). This is a really good deal! Again, society has enough resources for this to happen without just neglecting poor brown people.
This final part is more of a nitpick. From the linked Twitter thread:
Longtermism is closely linked to "total utilitarianism," which scholars have noted seems to imply that wealth should actively be transferred from the poor to the rich.
Maybe Cowen (noted in the linked thread) is saying this. If so, I agree with Torres that that seems pretty dumb, so I won't defend it. The problem is the phrase "seems to imply". Total utilitarianism counts all the pleasure/pain experienced by everyone, which includes taking into account the diminishing happiness-returns of giving one person more and more resources.
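To make the diminishing-returns point concrete, here's a minimal sketch (the log-utility function is my own illustrative assumption, not anything from Cowen or Torres) showing that a total-utility sum favors spreading resources out rather than transferring them toward one person:

```python
import math

def total_utility(allocations):
    """Total-utilitarian sum, with log utility capturing diminishing returns."""
    return sum(math.log(a) for a in allocations)

# 100 units of resources split between two people:
equal = total_utility([50, 50])    # even split
unequal = total_utility([90, 10])  # resources concentrated on one person

# The even split yields strictly more total utility:
# equal ≈ 7.82, unequal ≈ 6.80
```

So with any concave (diminishing-returns) utility function, the "transfer wealth from the poor to the rich" conclusion runs the wrong way under total utilitarianism.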
Sure, there's the "utility monster" objection, where you have some being who keeps getting linearly (not diminishingly) happier as we shovel resources into it. But that being seems absent from real life. And it's still better to have the utility monster and everyone else happy, unless you think the utility monster also derives more happiness from additional resources than everyone else would, which is even less realistic than the above idea. (Also, you can make a "utility monster" for average utilitarianism too, since they push up the average. You can do it with lots of theories, potentially...) Out of the classic "bullets to bite" for utilitarianism, this is one of the easier ones to swallow.
Cowen might've meant something like "billionaires are occasionally longtermist, therefore the government should tax them less / give them more money, so they can donate more". This is dumb and I won't defend it. Note, however, that if billionaires did not exist, longtermists would likely be trying to persuade somebody to donate resources to longtermist causes, even if "somebody" means "a majority of people" or "a central government" or something. As long as any resources exist, longtermism would recommend they be used for long-term good.
This discussion has also spurred me to write a longer post about common criticisms of longtermism, which I may or may not finish and release. (Disclosure: I am a fairly-hard longtermist, complete with AI safety).
EDIT 24Nov2022: I've been reading this article about Torres. While I think my above points stand even under the least-charitable readings of longtermist arguments, I've shifted my credence more towards "the people quoted don't even share those least-charitably-described ideas". Also, I'd be interested to read your reply to my writing above.
Point 1: I said "different from MIRI, but correlated with each other". You're right that I should've done a better job of explaining that. Basically, "Yudkowsky approaches" (MIRI) vs "Christiano approaches" (my incomplete read of most of the non-MIRI orgs). I concede 60% of this point.
Point 2: !!! Big if true, thank you! I read most of johnswentworth's guide to being an independent researcher, and the discussion of grants was promising. I'm getting a visceral sense of this from seeing (and entering) more contests, bounties, prizes, etc. for alignment work. I'm working towards the day when I can 100% concede this point. (And, based on other feedback and encouragement I've gotten, that day is coming soon.)
Good point about the secrecy, I hadn't heard of the ABC thing. The secrecy is "understandable" to the extent that AI safety is analogous to the Manhattan Project, but less useful to the extent that AIS is analogous to... well, the development of theoretical physics.
Not sure how relevant, but this reminds me of stories from inside Valve, the famously semi-anarchically organized game developer. People can move to any project they want, and there are few or no formal position titles. However, some employees have basically said that, because decision-making is sorta consensus-based, some people have seniority, and people can organize informally anyway, the result is a "shadow clique/cabal" that has disproportionate power. Which, come to think of it, would probably happen in the average anarchist commune of sufficient size.
TLDR just because the cliques don't exist formally, doesn't mean they don't exist.
Oh yeah, there's clustering networks showing mutual followers of e.g. Twitch streamers, it shouldn't be too hard to make this for the EA sphere on twitter.
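As a minimal sketch of what such an analysis involves (all account names and follower sets here are invented; a real version would pull follower lists from the Twitter/X API), pairwise Jaccard overlap between follower sets is the usual similarity measure fed into a clustering or graph-layout step:

```python
# Hypothetical follower sets for a few accounts (all names invented for illustration)
followers = {
    "ea_account_a": {"u1", "u2", "u3", "u4"},
    "ea_account_b": {"u2", "u3", "u4", "u5"},
    "other_acct": {"u9", "u10"},
}

def jaccard(a, b):
    """Similarity of two follower sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b)

# Pairwise overlap; high-overlap pairs would be drawn close together in a cluster graph
pairs = {
    (x, y): jaccard(followers[x], followers[y])
    for x in followers for y in followers if x < y
}
```

The two "ea" accounts above share 3 of 5 distinct followers (overlap 0.6), while the unrelated account shares none, which is exactly the structure those Twitch-streamer cluster maps visualize.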
Somebody ought to start an independent organization specifically dedicated to red-teaming other people and groups' ideas.
I could start this after I graduate in the Fall, or potentially during the summer.
DM me if you want to discuss organization / funding.
The Gates documentary was part of what pushed me towards "okay, earning-to-give is unlikely to be my best path, because there seems to be a shortage in people smart enough to run massive (or even midsized) projects well." I guess the lack of red-teaming is a subset of constrainedness (although is it more cognitive bias on the funders, vs lack of "people / orgs who can independently red-team ideas"? Prolly both).
FWIW, Elon Musk famously kiiiiiiinda had a theory-of-change/impact before starting SpaceX. In the biography (and the WaitButWhy posts about him), it notes how he thought about funding a smaller mission of sending mice to Mars, and used a material cost spreadsheet to estimate the adequacy of existing space travel technology. He also aggressively reached out to experts in the field to look for the "catch", or whether he was missing something.
This is still nowhere near good red-teaming/proving-his-hunch-wrong, though. He also didn't seem to do nearly as much talking-to-experts knowledge-base-building for his other projects (e.g. Neuralink).
And most groups don't even do that.
Find your old student's house, catch them escaping out a window during a drug bust, recruit them into your RV.
*quadratic voting mechanism
this is pretty friggin epic
Related thought: people having different definitions of "justice", where that word points to overlapping-but-not-identical clusters of moral intuitions.
Animal welfare maps best onto a cluster like "concern for the least-well-off" or "power for the powerless" or "the Rawls thing where if you imagined it happening to you, you'd hate it and want to escape it" or "ending suffering caused by the whims of other agents." That last one is particularly noticeable, since we usually have a moral intuition that suffering caused by other agents is preventable and thus more tragic.
Yoooo, senpai noticed!
We'll also mirror this on our collaborative blog TMB soon.
Agreed (I shoulda done that when editing it :P)
Thank you for putting this (and solutions) in clear words
Imho some kind of /r/EffectiveMemes would be the best bet
"I am naturally an angsty person, and I don't carry much reputational risk": Relate! Although you're anonymous, I'm just ADD.
Point 1 is interesting to me:
- longtermist/AI safety orgs could require a diverse ecosystem of groups working based on different approaches. This would mean the "current state of under-funded-ness" is in flux, uncertain, and leaning towards "some lesser-known group(s) need money".
- lots of smaller donations could indicate/signal interest from lots of people, which could help evaluators or larger donors with something.
Another point: since I think funding won't be the bottleneck in the near future, I've refocused my career somewhat to balance more towards direct research.
(Also, partly inspired by your "Irony of Longtermism" post, I'm interested in intelligence enhancement for existing human adults, since the shorter timelines don't leave room for embryo whatevers, and intelligence would help in any timeline.)
I post one article by a friend about memes, look away for 5 seconds, and now this!
BOUNTY IDEA (also sent in the form): Exploring Human Value Codification.
Offered for a paper or study that demonstrates a mathematical (or otherwise engineering-ready) framework to measure humans' real preference-ordering directly. Basically a neuroscience experiment, or a proposal thereof.
End goal: Using this framework / results from experiment(s) done based on it, you can generate novel stimuli that seem similar to each other, and reliably predict which ones human subjects will prefer more. (Gradients of pleasure, of course, no harm being done). And, of course, the neuroscientific understanding of how this preference ordering came about.
Prize amount: $5-10k for the proposal; more to fund a real experiment (that order of magnitude is probably in the right ballpark).
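For what it's worth, one standard off-the-shelf baseline for the "reliably predict which stimuli subjects prefer" part is a Bradley-Terry model fit to pairwise choices. Here's a minimal sketch with invented data (the real bounty would of course want neural measurements, not just behavioral fits):

```python
import math

# Invented pairwise-preference data: (winner, loser) from hypothetical trials
comparisons = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
items = {"A", "B", "C"}

# Bradley-Terry model: P(i preferred over j) = exp(s_i) / (exp(s_i) + exp(s_j))
scores = {i: 0.0 for i in items}
lr = 0.1
for _ in range(500):
    for winner, loser in comparisons:
        p_win = math.exp(scores[winner]) / (
            math.exp(scores[winner]) + math.exp(scores[loser])
        )
        # Gradient ascent on the log-likelihood of the observed choice
        scores[winner] += lr * (1 - p_win)
        scores[loser] -= lr * (1 - p_win)

ranking = sorted(items, key=scores.get, reverse=True)
# Recovers the ordering A > B > C implicit in the comparison data
```

The bounty's harder (and more interesting) demand is the mechanistic part: explaining *why* the recovered scores come out the way they do, which a behavioral model like this can't touch.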
“Thanks for the response, reading your posts was one of the biggest inspirations for me writing this, its overall demeanor reminded me of what I see as this older strain of EA public interface in a way I hadn’t thought of in a while. On the point of MacAskill responding, I think the information you’ve given is helpful, but I do think there would have been some value in public commentary even if Torres personally wasn’t going to change his mind because of it, for instance it would have addressed concerns the piece gave outsiders who read it, and it would have both legitimized and responded to the concerns of insiders who might have resonated with some of what Torres said. As it happens, I think the community did respond to it somewhat significantly, but in a pretty partial, snubbish way. Robert Wiblin for instance appeared to subtweet the piece like twice:
Culminating in his recent 80k interview which he strongly advertised as a response to these concerns (again, without naming the article):
A similar story can be said of MacAskill himself, shortly after the piece came out he made some comments on EA Forum apparently correcting misconceptions about longtermism the piece brought up without engaging with the piece directly:
Maybe Torres doesn’t deserve direct engagement even if some of his concerns do (or maybe he does), but it seems hard to deny that its publication had some non-trivial impact on the internal conversations of the movement, including in some ways there was already an appetite for. Though again I can’t expect more direct engagement (especially from those personally attacked), it does seem to me more thorough, direct engagement from prominent figures would have been better in many ways than most of the actual reaction.”
“Yeah, I was wondering when that might come up. I have a general resistance to making extraneous accounts, especially if they are anything like social media accounts. I find it stressful and think I would over-obsessively check/use them in a way that would wind up being harmful. Even just having this post up and the ability to respond through Nick has occupied my attention and anxiety a good deal the last few days, or I might do more cross-posts/enable comments on our blog. That said, I did consider it. EA forum seems like it would not be so bad if I was going to have an account somewhere, and there’s still a decent chance that I will make one at some point. When I asked Nick about the issue, he said he already had an account and was very willing to post it for me (by the way, thanks again Nick!). I still considered making one because I thought it might seem weird if it was posted by him instead, but for better or worse I wound up taking him up on it.”
I mostly agree with the AI risk worldview described in footnote 5, but this is certainly an interesting analysis! (Although not super-useful for someone in a non-MIT/non-Jane-Street/not-elite-skilled reference class, but I still wonder about the flexibility of that...)
“The white supremacy part doesn’t have this effect for me. Yes there is a use of this word to refer to overt, horrible bigotry, but there is also a use of this word meaning something closer to ‘structures that empower, or maintain the power, of white people disproportionately in prominent decision-making positions’. It is reasonable to say that this latter definition may be a bad way of wording things, you could even argue a terrible way, but since this use has both academic, and more recently some mainstream, usage, it hardly seems fair to assume bad faith because of it. Some of the other stuff in this thread is more troubling, it seems there is a deep rabbit hole here, and it’s possible that Torres is generally a bad actor. Again, I don’t want to be too confident in this particular case. Although it seems we have very different ways of viewing these criticisms even when we are looking at the same thing, I will allow that you seem to have more familiarity with them.”
Devin's response: “I would be careful about calling this a bad faith attack. It may seem low quality or biased, but low quality is very different from bad faith and bias is probably something most of our defenders are guilty of to a decent degree as well. I’m not an expert on this case, but my own understanding is basically that Torres wrote a more academic, EA-targeted version of this before, got no responses or engagement he found adequate, despite reaching out to try to get it, and decided to take his case to a broader audience. I think there’s a ton wrong with his analysis including stuff a more balanced view of his subjects should have easily caught, but I see every indication he was trying to criticize in good faith. Then again, I am not super familiar with this case, and maybe I’m totally wrong. But one of the broader points of my piece is something like this: we can’t engage with all critics without being overwhelmed, indeed we can’t even engage with all the critics who really deserve some engagement without being overwhelmed. It is much much better to just admit this than to act like we are engaging with everyone who deserves it by getting trigger happy with accusations of bad faith and unreasonableness. Even when each of these is true, they are far too tempting an excuse once they enter your arsenal.”
Devin's response (also to DavidNash): “Sorry, there might be a misunderstanding here. The William MacAskill example is supposed to be more a framing device and specific case I’ve been thinking about, not any sort of proof that there’s a problem. As I mention in my epistemic status section, the overall claims I make about EA aren’t defended here, I rely on readers to just share this same impression of current fatigue with critics relative to early EA on reflection. If you don’t, that’s fine, but this piece isn’t going to try to convince you otherwise. On MacAskill more specifically, I agree that he isn’t at all obligated to respond, but my point in bringing him up is that, given his earlier behavior, if there hadn’t been a change in him between then and now, I would have expected he would respond. There are plenty of explanations other than a simple fatigue story, I’m intrigued by barkbellowroar’s comment below for instance, but my theory here is that it may be in part related to this broader trend in the movement.”
The LOTR analogy was intriguing to me, thank you!
People vastly overestimate the stability of their motivation and mental life. "...even when you take into account Hofstadter's Law." Seems very likely in my case.
The rest was helpfully calibrating, thank you
“Thanks for the comments. Sorry, I wrote a good deal of this stream of conscious so it isn’t really structured as an argument. More a way of me to connect some personal thoughts/experiences together in a hopefully productive way. I can see how that wouldn’t be super accessible. The basic argument embedded in it though is:
Effective Altruism, like many idealistic movements, started out taking critics very very seriously and trying to reach out to/be charitable to them as much as possible, which is a good thing
Effective Altruism, like most movements that grow older, is not quite like that anymore, it seems to respond with less frequency and generosity to critics than it used to, which is unfortunate but understandable
Understandable as it is, we should at least take a bit more notice of it if that’s the path we are going down because…
Many movements move on from here to ridiculing criticisms by treating common criticisms as though they were obviously, memeably false, and that everyone in the know gets that (I didn’t use examples, but the one most on my mind was the midwit meme format, which only requires the argument being ridiculed, your stated, undefended position, and some cartoons, to make it look like you’ve made a point). This is bad and we should be careful not to start doing it.”
I feel both held back and out of my depth in this, so this and the comments have helped my perspective. Thank you for writing this!
I feel like I'm on both sides of this, so I'll take the fast.ai course and then immediately jump into whatever seems interesting in PyTorch