There's no comparison to prior full-press Diplomacy agents, but if I'm reading the prior-work cites right, this is because basically none of them work - not only do they not beat humans, they apparently don't even always improve over themselves playing the game as if it was no-press Diplomacy (ie not using dialogue at all). That gives an idea how big a jump this is for full-press Diplomacy.
There's a commentated video by someone who plays as the only human in an otherwise all-Cicero game, which at least makes it seem like the dialogue is doing a lot.
Worth noting that this was "Blitz" Diplomacy with only five-minute negotiation rounds. Still very impressive though.
Some behavioral traits such as general intelligence show very high heritability – over 0.70 – in adults, which is about as heritable as human height.
I'm very confused about what numbers such as this mean in practice, since the most natural interpretation ("70% of the trait is genetically determined") is wrong, but there aren't very many clear explanations of what the correct interpretation is. When I tried asking this on LW, the top-voted answer was that it's a number that's mostly useful if you're doing animal breeding, but probably not useful for much else.
You mention a lot of heritability numbers, could you clarify what it is that we're intended to infer from them? (It seems to me that the main thing we can infer from heritability numbers is that if a trait has heritability above zero, then there's some genetic influence on it, but since you mention some traits having "very high" heritability, I presume that you find there to be some other information too.)
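(For concreteness, the definition behind these numbers — this is just standard quantitative genetics, not anything specific to the post — is that broad-sense heritability is the share of a trait's phenotypic variance, within a given population and environment, that is attributable to genetic variance:

H^2 = \frac{\operatorname{Var}(G)}{\operatorname{Var}(P)} = \frac{\operatorname{Var}(G)}{\operatorname{Var}(G) + \operatorname{Var}(E)}

and a common twin-study estimate of narrow-sense heritability is Falconer's formula,

h^2 \approx 2\,(r_{MZ} - r_{DZ}),

i.e. twice the difference between the trait correlations of identical and fraternal twins. Being a ratio of population variances, it says nothing directly about how "genetically determined" the trait is in any individual, which is part of why the natural interpretation mentioned above goes wrong.)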
I'm not sure if there is any reason that should be strongly persuasive to a disinterested third party, at the moment. I think the current evidence is more on the level of "anecdotally, it seems like a lot of rationalists and EAs get something out of things like IFS".
But given that you can try out a few sessions of a therapy and see whether you seem to be getting anything out of it, that seems okay? Anecdotal evidence isn't enough to strongly show that one should definitely do a particular kind of therapy. But it can be enough to elevate a therapy to the level of things that might be worth giving a shot, to see if anything comes out of it.
If they are better, why haven't they been more widely adopted by mainstream medicine?
Part of it is that the effectiveness of therapy is often hard and slow to study, so it's hard to get unambiguous evidence of one being better than another. E.g. many therapists, even if working within a particular school of therapy, will apply it in an idiosyncratic style that draws upon their entire life/job experience and also knowledge of any other therapy styles they might have. That makes it hard to verify the extent to which all the therapists in the study really are doing the same thing.
My impression is that CBT got popular in part because it can be administered in a relatively standardized form, making it easy to study. But that means that it's not necessarily any better than the other therapies, it's just easier to get legible-seeming evidence on.
Another issue is that therapy can often take a long time, and lots of the quantitative measures (e.g. depression questionnaires, measures of well-being) used for measuring its effects are relatively crude and imprecise and it can be hard to know exactly what one should even be measuring.
So overall it can just be relatively hard to compare the effectiveness of therapies, and particular ones get more popular more slowly, e.g. by therapist A hearing that therapist B has been doing a new thing that seems to get better results than A does and then getting curious about it. Or by word of mouth through therapy clients when a new approach is found that seems to work better than conventional therapies, as has been happening with IFS.
Oh wow, that is a really great paper! Thank you very much for linking it.
why anyone would choose one big ritual like 'Doom circles' instead of just purposefully inculcating a culture of openness to giving / receiving critique that is supportive and can help others?
These don't sound mutually exclusive to me; you can have a formal ritual about something and also practice doing some of the related skills on a more regular basis.
That said, for many people, it would be emotionally challenging if they needed to be ready to receive criticism all the time. A doom circle is something where you can receive feedback at such a time when you have emotionally prepared for it, and then go back to receiving less criticism normally.
It might be better if everyone was capable of always receiving it, but it's also good to have options for people who aren't quite there yet. A doom circle is a thing that can be done in less than an hour, whereas skill-building is much more of a long-term project.
low-context feedback is often not helpful;
That's true, but also: since people generally know that low-context feedback can be unhelpful, they might hold back offering any such feedback, even when it would be useful! Having an explicit context for offering the kind of critical feedback that you know might be incorrect gives people the chance to receive even the kinds of impressions that they otherwise wouldn't have the chance to hear.
feedback is ultimately just an opinion; you should be able to take and also discard it.
Yes, in any well-run doom circle, this exact thing would be emphasized in the beginning (levin mentioned this in their comment and it was probably done in the beginning of the circles I've participated in, as well; I don't remember what the exact preliminaries were, but it certainly matches my general sense of the spirit of the circle).
That's correct, most of the people in the circle (including the person with the wizard line, I think) I'd only met a couple of days before.
That's great, thank you :D
It's my impression that in writing workshops where people bring their writing to be criticized, it's also a common rule that the writers are not allowed to respond to the feedback. I believe the rule exists exactly because of what you say: because another person's feedback may be off or biased for a variety of reasons. If there was a discussion about it, the recipient of the feedback might get defensive and want to explain why the feedback was flawed. That would risk the conversation taking an unpleasant tone and also any correct feedback not being properly heard.
When the rule is instead "you are required to listen to the feedback, but it's then totally up to you what you do with it and can choose to ignore it as totally deluded if you wish", that gives people the full license to do so. Unfair-feeling criticism won't put them in a position where they have to choose between defending themselves and losing face - not defending yourself won't make anyone lose face since nobody is allowed to defend themselves. That helps make it easier to consider whether some of the unfair-seeming criticism might actually have a point. And on the other hand, if the criticism really was unfair, then someone defending themselves might give the other people the temptation to argue against the defense in order for them to save face and justify their criticism as correct... and someone trying to argue to you that no really, you really are doomed (or "your writing really is bad"), would totally break any atmosphere of mutual kindness and support.
I would like to encourage environments in EA which are mutually supportive and kind.
For what it's worth, my experience of Doom Circles is that they felt explicitly supportive and kind. It felt like other people were willing to pay a social cost to give me honest feedback in a way that they would otherwise feel hesitant to do, and I appreciated them doing that for me.
The incentives are to provide harsh critique to appear novel and insightful of the other person.
I wouldn't say that there's an incentive to be harsh, since while you are providing feedback, others will also be providing feedback to you; so if you're overly harsh to others, you might get the same in return. In my experience at least, the shared vulnerability created an incentive to be kind but honest.
I feel like this post is lacking an explanation of what's good about this practice, so I'll share my experiences.
I think I've attended a couple of Doom Circles (weirdly, my memory claims that there would have been one at my 2014 CFAR workshop but the post says they were first done in 2015, so maybe I'm mixing that up with some later event). I've usually thought of myself as pretty thin-skinned (much less these days but much more back then), but my recollection is that the experience was clearly positive.
It's very rare to actually get critical feedback that you can trust to be both honest and well-intentioned. The fact that it was purely opt-in and with an intimate atmosphere of people who would also hear their doom in turn made it feel safe and like there was a definite feeling of bonding with the others, in a "we're all doing a thing that might be slightly unpleasant but we also trust each other to be able to hear it and are making ourselves vulnerable both in hearing and speaking the doom and at the end we'll have gone through a slightly challenging experience together" kind of vibe. It definitely felt to me like a privilege to be able to participate.
I've forgotten most of the dooms I personally got, but I recall one I got from a doom circle in my 2018 CFAR mentorship training. I'd had a bunch of social anxiety when I'd initially gotten to the training and then it had gradually subsided over time. I think someone then referenced that and said something along the lines of "you're being too tame when you could be more powerful, you should lean into weirdness more and be like a cackling mad wizard". I can't say I'd be totally sure about what they meant, but there was something about that which stayed with me and which I've occasionally thought about. Maybe something like, even though being a cackling mad wizard isn't quite the type of identity that'd be my thing, it felt significant to hear that someone thinks that I do have more power than I'm letting myself believe, and that I could play the role of a cackling mad wizard if I chose to.
That's another thing that can also make Doom Circles positive. Because the doom you can see facing someone else can also be a thing like their own potential that they're not letting themselves see and are thus letting go to waste. Normally, it would still be weird to state that kind of a thing aloud. But if you take the opportunity to say it in a Doom Circle, it can actually become something that ends up landing as a compliment that they'll think about for years afterwards - "huh, would I really have it in me to be a cackling mad wizard?"
Huh! That's surprising.
Fair. In that case this seems like a necessary prerequisite result for doing that deeper investigation, though, so valuable in that respect.
At least for myself, it wouldn't have been obvious in advance that there would be exactly two factors, as opposed to (say) one, three or four.
Perhaps more educated people are more happy with their career and thus more reluctant to change it?
Or just more invested in it - if you've spent several years acquiring a degree in a topic, you may be quite reluctant to go do something completely different.
For future studies, might be worth rephrasing this item in a way where this doesn't act as a confounder for the results? I'd expect people in their early twenties to answer it quite differently than people in their early forties.
I was thinking that if they insist on requiring it (and I get around actually participating), I'll just iterate on some prompts on wombo.art or similar until I get something decent.
Because it also mentions woo, so I think it’s talking about a broader class of unjustified beliefs than you think.
My earlier comment mentioned that "there are also lots of different claims that seem (or even are) irrational but are pointing to true facts about the world." That was intended to touch upon "woo"; e.g. meditation used to be, and to some extent still is, considered "woo", but there nonetheless seem to be reasonable grounds to think that there's something of value to be found in meditation (despite there also being various crazy claims around it).
My above link mentions a few other examples (out-of-body experiences, folk traditions, "Ki" in martial arts) that have claims around them that are false if taken as the literal truth, but are still pointing to some true aspect of the world. Notably, a policy of "reject all woo things" could easily be taken to imply rejecting all such things as superstition that's not worth looking at, thus missing out on the parts of the woo that were actually valuable.
IME, the more I look into them, the more I come to find that "woo" things that I'd previously rejected as not worth looking at because of them being obviously woo and false, are actually pointing to significantly valuable things. (Even if there is also quite a lot of nonsense floating around those same topics.)
I agree, but in that case you should make it clear how your interpretation differs from the author’s.
That's fair.
What makes you think it isn't? To me it seems both like a reasonable interpretation of the quote (private guts are precisely the kinds of positions you can't necessarily justify, and it's talking about having beliefs you can't justify) as well as a dynamic that I recognize as one that has occasionally been present in the community. Fortunately posts like the one about private guts have helped push back against it.
Even if this interpretation wasn't actually the author's intent, choosing to steelman the claim in that way turns the essay into a pretty solid one, so we might as well engage with the strongest interpretation of it.
There are a few different ways of interpreting the quote, but there's a concept of public positions and private guts. Public positions are ones that you can justify in public if pressed on, while private guts are illegible intuitions you hold which may nonetheless be correct - e.g. an expert mathematician may have a strong intuition that a particular proof or claim is correct, which they will then eventually translate to a publicly-verifiable proof.
As far as I can tell, lizards probably don’t have public positions, but they probably do have private guts. That suggests those guts are good for predicting things about the world and achieving desirable world states, as well as being one of the channels by which the desirability of world states is communicated inside a mind. It seems related to many sorts of ‘embodied knowledge’, like how to walk, which is not understood from first principles or in an abstract way, or habits, like adjective order in English. A neural network that ‘knows’ how to classify images of cats, but doesn’t know how it knows (or is ‘uninterpretable’), seems like an example of this. “Why is this image a cat?” -> “Well, because when you do lots of multiplication and addition and nonlinear transforms on pixel intensities, it ends up having a higher cat-number than dog-number.” This seems similar to gut senses that are difficult to articulate; “why do you think the election will go this way instead of that way?” -> “Well, because when you do lots of multiplication and addition and nonlinear transforms on environmental facts, it ends up having a higher A-number than B-number.” Private guts also seem to capture a category of amorphous visions; a startup can rarely write a formal proof that their project will succeed (generally, if they could, the company would already exist). The postrigorous mathematician’s hunch falls into this category, which I’ll elaborate on later.
As an another example, in the recent dialog on AGI alignment, Yudkowsky frequently referenced having strong intuitions about how minds work that come from studying specific things in detail (and from having "done the homework"), but which he does not know how to straightforwardly translate into a publicly justifiable argument.
Private guts are very important and arguably the thing that mostly guides people's behavior, but they are often also ones that the person can't justify. If a person felt like they should reject any beliefs they couldn't justify, they would quickly become incapable of doing anything at all.
Separately, there are also lots of different claims that seem (or even are) irrational but are pointing to true facts about the world.
This is indeed a wonderful story!
This version has nicer line breaks, in my opinion.
Here's an audio version read by Leonard Nimoy.
Draft and re-draft (and re-draft). The writing should go through many iterations. You make drafts, you share them with a few people, you do something else for a week. Maybe nobody has read the draft, but you come back and you’ve rejuvenated your wonderful capacity to look at the work and know why it’s terrible.
Kind of related to this: giving a presentation about the ideas in your article is something that you can use as a form of a draft. If you can't get anyone to listen to a presentation, or don't want to give one quite yet, you can pick some people whose opinion you value and just make a presentation where you imagine that they're in the audience.
I find that if I'm thinking of how to present the ideas in a paper to an in-person audience, it makes me think about questions like "what would be a concrete example of this idea that I could start the presentation with, that would grab the audience's attention right away". And then if I come up with a good way of presenting the ideas in my article, I can rewrite the article to use that same presentation.
(Unfortunately, I myself have mostly applied this advice in reverse: I've first written a paper and then given a presentation of it afterwards, at which point I've realized that the presentation is actually what I should have said in the paper itself.)
Depends on exactly which definition of s-risks you're using; one of the milder definitions is just "a future in which a lot of suffering exists", such as humanity settling most of the galaxy but each of those worlds having about as much suffering as the Earth has today. Which is arguably not a dystopian outcome or necessarily terrible in terms of how much suffering there is relative to happiness, but still an outcome in which there is an astronomically large absolute amount of suffering.
Fair point. Though apparently measures of 'life satisfaction' and 'meaning' produce different outcomes:
So, how did the World Happiness Report measure happiness? The study asked people in 156 countries to “value their lives today on a 0 to 10 scale, with the worst possible life as a 0 and the best possible life as a 10.” This is a widely used measure of general life satisfaction. And we know that societal factors such as gross domestic product per capita, extensiveness of social services, freedom from oppression, and trust in government and fellow citizens can explain a significant proportion of people’s average life satisfaction in a country.
In these measures the Nordic countries—Finland, Sweden, Norway, Denmark, Iceland—tend to score highest in the world. Accordingly, it is no surprise that every time we measure life satisfaction, these countries are consistently in the top 10. [...]
... some people might argue that neither life satisfaction, positive emotions nor absence of depression are enough for happiness. Instead, something more is required: One has to experience one’s life as meaningful. But when Shigehiro Oishi, of the University of Virginia, and Ed Diener, of the University of Illinois at Urbana-Champaign, compared 132 different countries based on whether people felt that their life has an important purpose or meaning, African countries including Togo and Senegal were at the top of the ranking, while the U.S. and Finland were far behind. Here, religiosity might play a role: The wealthier countries tend to be less religious on average, and this might be the reason why people in these countries report less meaningfulness.
It has been suggested that people are succumbing to a focusing illusion when they think that having children will make them happy, in that they focus on the good things without giving much thought to the bad.
Worth noting that you might get increased meaningfulness in exchange for the lost happiness, which isn't necessarily an irrational trade to make. E.g. Robin Hanson:
Stats suggest that while parenting doesn’t make people happier, it does give them more meaning. And most thoughtful traditions say to focus more on meaning than happiness. Meaning is how you evaluate your whole life, while happiness is how you feel about now. And I agree: happiness is overrated.
Parenting does take time. (Though, as Bryan Caplan emphasized in a book, less than most think.) And many people I know plan to have an enormous positive influence on the universe, far more than plausible via a few children. But I think they are mostly kidding themselves. They fear their future selves being less ambitious and altruistic, but it's just as plausible that they will instead become more realistic.
Also, many people with grand plans struggle to motivate themselves to follow their plans. They neglect the motivational power of meaning. Dads are paid more, other things equal, and I doubt that’s a bias; dads are better motivated, and that matters. Your life is long, most big world problems will still be there in a decade or two, and following the usual human trajectory you should expect to have the most wisdom and influence around age 40 or 50. Having kids helps you gain both.
Thanks. It looks to me that much of what's being described at these links is about the atmosphere among the students at American universities, which then also starts affecting the professors there. That would explain my confusion, since a large fraction of my academic friends are European, so largely unaffected by these developments.
there could be a number of explanations aside from cancel culture not being that bad in academia.
I do hear them complain about various other things though, and I also have friends privately complaining about cancel culture in non-academic contexts, so I'd generally expect this to come up if it were an issue. But I could still ask, of course.
We also discussed some possible reasons for why there might be a disappointing future in the sense of having a lot of suffering, in sections 4-5 of Superintelligence as a Cause or Cure for Risks of Astronomical Suffering. A few excerpts:
4.1 Are suffering outcomes likely?
Bostrom (2003a) argues that given a technologically mature civilization capable of space colonization on a massive scale, this civilization "would likely also have the ability to establish at least the minimally favorable conditions required for future lives to be worth living", and that it could thus be assumed that all of these lives would be worth living. Moreover, we can reasonably assume that outcomes which are optimized for everything that is valuable are more likely than outcomes optimized for things that are disvaluable. While people want the future to be valuable both for altruistic and self-oriented reasons, no one intrinsically wants things to go badly.
However, Bostrom has himself later argued that technological advancement combined with evolutionary forces could "lead to the gradual elimination of all forms of being worth caring about" (Bostrom 2004), admitting the possibility that there could be technologically advanced civilizations with very little of anything that we would consider valuable. The technological potential to create a civilization that had positive value does not automatically translate to that potential being used, so a very advanced civilization could still be one of no value or even negative value.
Examples of technology’s potential being unevenly applied can be found throughout history. Wealth remains unevenly distributed today, with an estimated 795 million people suffering from hunger even as one third of all produced food goes to waste (World Food Programme, 2017). Technological advancement has helped prevent many sources of suffering, but it has also created new ones, such as factory-farming practices under which large numbers of animals are maltreated in ways which maximize their production: in 2012, the amount of animals slaughtered for food was estimated at 68 billion worldwide (Food and Agriculture Organization of the United Nations 2012). Industrialization has also contributed to anthropogenic climate change, which may lead to considerable global destruction. Earlier in history, advances in seafaring enabled the transatlantic slave trade, with close to 12 million Africans being sent in ships to live in slavery (Manning 1992).
Technological advancement does not automatically lead to positive results (Häggström 2016). Persson & Savulescu (2012) argue that human tendencies such as “the bias towards the near future, our numbness to the suffering of great numbers, and our weak sense of responsibility for our omissions and collective contributions”, which are a result of the environment humanity evolved in, are no longer sufficient for dealing with novel technological problems such as climate change and it becoming easier for small groups to cause widespread destruction. Supporting this case, Greene (2013) draws on research from moral psychology to argue that morality has evolved to enable mutual cooperation and collaboration within a select group (“us”), and to enable groups to fight off everyone else (“them”). Such an evolved morality is badly equipped to deal with collective action problems requiring global compromises, and also increases the risk of conflict and generally negative-sum dynamics as more different groups get in contact with each other.
As an opposing perspective, West (2017) argues that while people are often willing to engage in cruelty if this is the easiest way of achieving their desires, they are generally “not evil, just lazy”. Practices such as factory farming are widespread not because of some deep-seated desire to cause suffering, but rather because they are the most efficient way of producing meat and other animal source foods. If technologies such as growing meat from cell cultures became more efficient than factory farming, then the desire for efficiency could lead to the elimination of suffering. Similarly, industrialization has reduced the demand for slaves and forced labor as machine labor has become more effective. At the same time, West acknowledges that this is not a knockdown argument against the possibility of massive future suffering, and that the desire for efficiency could still lead to suffering outcomes such as simulated game worlds filled with sentient non-player characters (see section on cruelty-enabling technologies below). [...]
4.2 Suffering outcome: dystopian scenarios created by non-value-aligned incentives.
Bostrom (2004, 2014) discusses the possibility of technological development and evolutionary and competitive pressures leading to various scenarios where everything of value has been lost, and where the overall value of the world may even be negative. Considering the possibility of a world where most minds are brain uploads doing constant work, Bostrom (2014) points out that we cannot know for sure that happy minds are the most productive under all conditions: it could turn out that anxious or unhappy minds would be more productive. [...]
More generally, Alexander (2014) discusses examples such as tragedies of the commons, Malthusian traps, arms races, and races to the bottom as cases where people are forced to choose between sacrificing some of their values and getting outcompeted. Alexander also notes the existence of changes to the world that nearly everyone would agree to be net improvements - such as every country reducing its military by 50%, with the savings going to infrastructure - which nonetheless do not happen because nobody has the incentive to carry them out. As such, even if the prevention of various kinds of suffering outcomes would be in everyone’s interest, the world might nonetheless end up in them if the incentives are sufficiently badly aligned and new technologies enable their creation.
An additional reason for why such dynamics might lead to various suffering outcomes is the so-called Anna Karenina principle (Diamond 1997, Zaneveld et al. 2017), named after the opening line of Tolstoy’s novel Anna Karenina: "all happy families are alike; each unhappy family is unhappy in its own way". The general form of the principle is that for a range of endeavors or processes, from animal domestication (Diamond 1997) to the stability of animal microbiomes (Zaneveld et al. 2017), there are many different factors that all need to go right, with even a single mismatch being liable to cause failure.
Within the domain of psychology, Baumeister et al. (2001) review a range of research areas to argue that “bad is stronger than good”: while sufficiently many good events can overcome the effects of bad experiences, bad experiences have a bigger effect on the mind than good ones do. The effect of positive changes to well-being also tends to decline faster than the impact of negative changes: on average, people’s well-being suffers and never fully recovers from events such as disability, widowhood, and divorce, whereas the improved well-being that results from events such as marriage or a job change dissipates almost completely given enough time (Lyubomirsky 2010).
To recap, various evolutionary and game-theoretical forces may push civilization in directions that are effectively random, random changes are likely to be bad for the things that humans value, and the effects of bad events are likely to linger disproportionately on the human psyche. Putting these considerations together suggests (though does not guarantee) that freewheeling development could eventually come to produce massive amounts of suffering.
yet academia is now the top example of cancel culture
I'm a little surprised by this wording? Certainly cancel culture is starting to affect academia as well, but I don't think that e.g. most researchers think about the risk of getting cancelled when figuring out the wording for their papers, unless they are working on some exceptionally controversial topic?
I have lots of friends in academia and follow academic blogs etc., and basically don't hear any of them talking about cancel culture within that context. I did recently see a philosopher post a controversial paper and get backlash for it on Twitter, but then he seemed to basically shrug it off since people complaining on Twitter didn't really affect him. This fits my general model that most of the cancel culture influence on academia comes from people outside academia trying to affect it, with varying success.
I don't doubt that there are individual pockets within academia that are more cancely, but the rest of academia seems to me mostly unaffected by them.
On the positive side, a recent attempt to bring cancel culture to EA was very resoundingly rejected, with 111 downvotes and strongly upvoted rebuttals.
I don't know, but I get the impression that SWB questions are susceptible to framing effects in general: for example, Biswas-Diener & Diener (2001) found that when people in Calcutta were asked for their life satisfaction in general, and also for their satisfaction in 12 subdomains (material resources, friendship, morality, intelligence, food, romantic relationship, family, physical appearance, self, income, housing, and social life), they gave on average a slightly negative rating for the global satisfaction, while also giving positive ratings for all the subdomains. (This result was replicated at least by Cox 2011 in Nicaragua.)
Biswas-Diener & Diener 2001 (scale of 1-3):
The mean score for the three groups on global life satisfaction was 1.93 (on the negative side just under the neutral point of 2). [...] The mean ratings for all twelve ratings of domain satisfaction fell on the positive (satisfied) side, with morality being the highest (2.58) and the lowest being satisfaction with income (2.12).
Cox 2011 (scale of 1-7):
The sample level mean on global life satisfaction was 3.8 (SD = 1.7). Four is the mid-point of the scale and has been interpreted as a neutral score. Thus this sample had an overall mean just below neutral. [...] The specific domain satisfactions (housing, family, income, physical appearance, intelligence, friends, romantic relationships, morality, and food) have means ranging from 3.9 to 5.8, and a total mean of 4.9. Thus all nine specific domains are higher than global life satisfaction. For satisfaction with the broader domains (self, possessions, and social life) the means ranged from 4.4 to 5.2, with a mean of 4.8. Again, all broader domain satisfactions are higher than global life satisfaction. It is thought that global judgments of life satisfaction are more susceptible to positivity bias and that domain satisfaction might be more constrained by the concrete realities of an individual’s life (Diener et al. 2000)
In particular, Elon Musk claims that BCIs may allow us to integrate with AI such that AI will not need to outcompete us (Young, 2019). It is unclear at present by what exact mechanism a BCI would assist here, how it would help, whether it would actually decrease risk from AI, or if it is a valid claim at all. Such a ‘solution’ to AGI may also be entirely compatible with global totalitarianism, and may not be desirable. The mechanism by which integrating with AI would lessen AI risk is currently undiscussed; and at present, no serious academic work has been done on the topic.
We have a bit of discussion about this (predating Musk's proposal) in section 3.4 of Responses to Catastrophic AGI Risk; we're also skeptical, e.g. this excerpt from our discussion:
De Garis [82] argues that a computer could have far more processing power than a human brain, making it pointless to merge computers and humans. The biological component of the resulting hybrid would be insignificant compared to the electronic component, creating a mind that was negligibly different from a 'pure' AGI. Kurzweil [168] makes the same argument, saying that although he supports intelligence enhancement by directly connecting brains and computers, this would only keep pace with AGIs for a couple of additional decades.
The truth of this claim seems to depend on exactly how human brains are augmented. In principle, it seems possible to create a prosthetic extension of a human brain that uses the same basic architecture as the original brain and gradually integrates with it [254]. A human extending their intelligence using such a method might remain roughly human-like and maintain their original values. However, it could also be possible to connect brains with computer programs that are very unlike human brains and which would substantially change the way the original brain worked. Even smaller differences could conceivably lead to the adoption of 'cyborg values' distinct from ordinary human values [290].
Bostrom [49] speculates that humans might outsource many of their skills to non-conscious external modules and would cease to experience anything as a result. The value-altering modules would provide substantial advantages to their users, to the point that they could outcompete uploaded minds who did not adopt the modules. [...]
Moravec [194] notes that the human mind has evolved to function in an environment which is drastically different from a purely digital environment and that the only way to remain competitive with AGIs would be to transform into something that was very different from a human.
Let's look at some of your references. You say that Scott has endorsed eugenics; let's look up the exact phrasing (emphasis mine):
Even though I like both basic income guarantees and eugenics, I don’t think these are two things that go well together – making the income conditional upon sterilization is a little too close to coercion for my purposes. Still, probably better than what we have right now.
"I don't like this, though it would probably be better than the even worse situation that we have today" isn't exactly a strong endorsement. Note the bit about disliking coercion which should already suggest that Scott doesn't like "eugenics" in the traditional sense of involuntary sterilization, but rather non-coercive eugenics that emphasize genetic engineering and parental choice.
Simply calling this "eugenics" with no caveats is misleading; admittedly Scott himself sometimes forgets to make this clarification, so one would be excused for not knowing what he means... but not when linking to a comment where he explicitly notes that he doesn't want to have coercive forms of eugenics.
Next, you say that he has endorsed "Charles Murray, a prominent proponent of racial IQ differences". Looking up the exact phrasing again, Scott says:
The only public figure I can think of in the southeast quadrant with me is Charles Murray. Neither he nor I would dare reduce all class differences to heredity, and he in particular has some very sophisticated theories about class and culture. But he shares my skepticism that the 55 year old Kentucky trucker can be taught to code, and I don’t think he’s too sanguine about the trucker’s kids either. His solution is a basic income guarantee, and I guess that’s mine too. Not because I have great answers to all of the QZ article’s problems. But just because I don’t have any better ideas.[1][2]
What is "the southeast quadrant"? Looking at earlier in the post, it reads:
The cooperatives argue that everyone is working together to create a nice economy that enriches everybody who participates in it, but some people haven’t figured out exactly how to plug into the magic wealth-generating machine, and we should give them a helping hand (“here’s government-subsidized tuition to a school where you can learn to code!”) [...] The southeast corner is people who think that we’re all in this together, but that helping the poor is really hard.
So Scott endorses Murray's claims that... cognitive differences may have a hereditary component, that it might be hard to teach the average trucker and his kids to become programmers, and that we should probably implement a basic income so that these people will still have a reasonable income and don't need to starve. Also, the position that he ascribes to both himself and Murray is the attitude that we should do our best to help everyone, and that it's basically good for everyone to try to cooperate together. Not exactly ringing endorsements of white supremacy.
Also, one of the footnotes to "I don't have any better ideas" is "obviously invent genetic engineering and create a post-scarcity society, but until then we have to deal with this stuff", which again ties to the part where to the extent that Scott endorses eugenics, he endorses liberal eugenics.
Finally, you note that Scott identifies with the "hereditarian left". Let's look at the article that Scott links to when he says that this term "seems like as close to a useful self-identifier as I’m going to get". It contains an explicit discussion of how the possibility of cognitive differences between groups does not in any sense imply that one of the groups would have more value, morally or otherwise, than the other:
I also think it’s important to stress that contemporary behavioral genetic research is — with very, very few exceptions — almost entirely focused on explaining individual differences within ancestrally homogeneous groups. Race has a lot to do with how behavioral genetic research is perceived, but almost nothing to do with what behavioral geneticists are actually studying. There are good methodological reasons for this. Twin studies are, of course, using twins, who almost always self-identify as the same race. And genome-wide association studies (GWASs) typically use a very large group of people who all have the same self-identified race (usually White), and then rigorously control for genetic ancestry differences even within that already homogeneous group. I challenge anyone to read the methods section of a contemporary GWAS and persist in thinking that this line of research is really about race differences.
Despite all this, racists keep looking for “evidence” to support racism. The embrace of genetic research by racists reached its apotheosis, of course, in Nazism and the eugenics movements in the U.S. After all, eugenics means “good genes”– ascribing value and merit to genes themselves. Daniel Kevles’ In the Name of Eugenics: Genetics and the Uses of Human Heredity should be required reading for anyone interested in both the history of genetic science and in how this research has been (mis)used in the United States. This history makes clear that the eugenic idea of conceptualizing heredity in terms of inherent superiority was woven into the fabric of early genetic science (Galton and Pearson were not, by any stretch, egalitarians) and an idea that was deliberately propagated. The idea that genetic influence on intelligence should be interpreted to mean that some people are inherently superior to other people is itself a racist invention.
Fast-forward to 2017, and nearly everyone, even people who think that they are radical egalitarians who reject racism and white supremacy and eugenic ideology in all its forms, has internalized this “genes == inherent superiority” equation so completely that it’s nearly impossible to have any conversation about genetic research that’s not tainted by it. On both the right and the left, people assume that if you say, “Gene sequence differences between people statistically account for variation in abstract reasoning ability,” what you really mean is “Some people are inherently superior to other people.” Where people disagree, mostly, is in whether they think this conclusion is totally fine or absolutely repugnant. (For the record, and this should go without saying, but unfortunately needs to be said — I fall in the latter camp.) But very few people try to peel apart those ideas. (A recent exception is this series of blog posts by Fredrik deBoer.) The space between, which says, “Gene sequence differences between people statistically account for variation in abstract reasoning ability” but also says “This observation has no bearing on how we evaluate the inherent value or worth of people” is astoundingly small. [...]
But must genetic research necessarily be interpreted in terms of superiority and inferiority? Absolutely not. To get a flavor of other possible interpretations, we can just look at how people describe genetic research on nearly any other human trait.
Take, for example, weight. Here, is a New York Times article that quotes one researcher as saying, “It is more likely that people inherit a collection of genes, each of which predisposes them to a small weight gain in the right environment.” Substitute “slight increase in intelligence” for “small weight gain” in that sentence and – voila! You have the mainstream scientific consensus on genetic influences on IQ. But no one is writing furious think pieces in reaction to scientists working to understand genetic differences in obesity. According to the New York Times, the implications of this line of genetic research is … people shouldn’t blame themselves for a lack of self-control if they are heavy, and a “one size fits all” approach to weight loss won’t be effective.
As another example, think about depression. The headline of one New York Times article is “Hunting the Genetic Signs of Postpartum Depression with an iPhone App.” Pause for a moment and consider how differently the article would be received if the headline were “Hunting the Genetic Signs of Intelligence with an iPhone App.” Yet the research they describe – a genome-wide association study – is exactly the same methodology used in recent genetic research on intelligence and educational attainment. The science isn’t any different, but there’s no talk of identifying superior or inferior mothers. Rather, the research is justified as addressing the needs of “mothers and medical providers clamoring for answers about postpartum depression.” [...]
1. The idea that some people are inferior to other people is abhorrent.
2. The mainstream scientific consensus is that genetic differences between people (within ancestrally homogeneous populations) do predict individual differences in traits and outcomes (e.g., abstract reasoning, conscientiousness, academic achievement, job performance) that are highly valued in our post-industrial, capitalist society.
3. Acknowledging the evidence for #2 is perfectly compatible with belief #1.
4. The belief that one can and should assign merit and superiority on the basis of people’s genes grew out of racist and classist ideologies that were already sorting people as inferior and superior.
5. Instead of accepting the eugenic interpretation of what genetic research means, and then pushing back against the research itself, people – especially people with egalitarian and progressive values — should stop implicitly assuming that genes==inherent merit.
So you are arguing that Scott is a white supremacist, and your pieces of evidence include:
- A comment where Scott says that he doesn't want to have coercive eugenics
- An essay where Scott talks about the best ways of helping people who might be cognitively disadvantaged, and suggests that we should give them a basic income guarantee
- A post where Scott links to and endorses an article which focuses on arguing that considering some people as inferior to others is abhorrent, and that we should reject the racist idea of genetics research having any bearing on how inherently valuable people are
There's also the sleight of hand where the author implies that Scott is a white supremacist, and supports this not by referencing anything that Scott has said, but by referencing things said by unrelated people hanging out on the SSC subreddit, which Scott has never shown any signs of endorsing. If Scott himself had said anything that could be interpreted as an endorsement of white supremacy, surely it would have been mentioned in this post, so its absence is telling.
As Tom Chivers recently noted:
It’s part of the SSC ethos that “if you don’t understand how someone could possibly believe something as stupid as they do”, then you should consider the possibility that that’s because you don’t understand, rather than because they’re stupid; the “principle of charity”. So that means taking ideas seriously — even ones you’re uncomfortable with. And the blog and its associated subreddit have rules of debate: that you’re not allowed to shout things down, or tell people they’re racist; you have to politely and honestly argue the facts of the issue at hand. It means that the sites are homes for lively debate, rare on the modern internet, between people who actually disagree; Left and Right, Republican and Democrat, pro-life and pro-choice, gender-critical feminists and trans-activist, MRA and feminist.
And that makes them vulnerable. Because if you’re someone who wants to do a hatchet job on them, you can easily go through the comments and find something that someone somewhere will find appalling. That’s partly a product of the disagreement and partly a function of how the internet works: there’s an old law of the internet, the “1% rule”, which says that the large majority of online comments will come from a hyperactive 1% of the community. That was true when I used to work at Telegraph Blogs — you’d get tens of thousands of readers, but you’d see the same 100 or so names cropping up every time in the comment sections.
(Those names were often things like Aelfric225 or TheUnBrainWashed, and they were usually really unhappy about immigration.)
That’s why the rationalists are paranoid. They know that if someone from a mainstream media organisation wanted to, they could go through those comments, cherry-pick an unrepresentative few, and paint the entire community as racist and/or sexist, even though surveys of the rationalist community and SSC readership found they were much more left-wing and liberal on almost every issue than the median American or Briton. And they also knew that there were people on the internet who unambiguously want to destroy them because they think they’re white supremacists.
Not to be rude, but what context do you recommend would help for interpreting the statement, "I like both basic income guarantees and eugenics," or describing requiring poor people to be sterilized to receive basic income as "probably better than what we have right now?"
The part from the middle of that excerpt that you left out certainly seems like relevant context: "Even though I like both basic income guarantees and eugenics, I don’t think these are two things that go well together – making the income conditional upon sterilization is a little too close to coercion for my purposes. Still, probably better than what we have right now." (see my top-level comment)
Malevolent humans with access to advanced technology—such as whole brain emulation or other forms of transformative AI—could cause serious existential risks and suffering risks.
Possibly relevant: Machiavellians Approve of Mind Upload Technology Directly and Through Utilitarianism (Laakasuo et al. 2020), though it mainly tested whether machiavellians express moral condemnation of mind uploading, rather than their interest directly.
In this preregistered study, we have two novel findings: 1) Utilitarian moral preferences are strongly and psychopathy is mildly associated with positive approval of MindUpload; and 2) that Machiavellianism – essentially a calculative self-interest related trait – is strongly associated with positive approval of Mind Upload, even after controlling for Utilitarianism and the previously known predictor of Sexual Disgust (and conservatism). In our preregistration, we had assumed that the effect would be dependent on Psychopathy (another Dark Triad personality dimension), rather than Machiavellianism. However, given how closely related Machiavellianism and Psychopathy are, we argue that the results match our hypothesis closely. Our results suggest that the perceived risk of callous and selfish individuals preferring Mind Upload should be taken seriously, as previously speculated by Sotala & Yampolskiy (2015)
You seem to be working under the assumption that we have either emotional or logical motivations for doing something. I think that this is mistaken: logic is a tool for achieving our motivations, and all of our motivations ultimately ground in emotional reasons. In fact, it has been my experience that focusing too much on trying to find "logical" motivations for our actions may lead to paralysis, since absent an emotional motive, logic doesn't provide any persuasive reason to do one thing over another.
You said that people act altruistically because "ultimately they're doing it to not feel bad, to feel good, or to help a loved one". I interpret this to mean that these are all reasons which you think are coming from the heart. But can you think of any reason for doing anything which does *not* ultimately ground in something like these reasons?
I don't know you, so I don't want to suggest that I think that I know how your mind works... but reading what you've written, I can't help getting the feeling that the thought of doing something which is motivated in emotion rather than logic makes you feel bad, and that the reason why you don't want to do things which are motivated by emotion is that you have an emotional aversion to it. In my experience, it's very common for people to have an emotional aversion to what they think emotional reasoning is, causing them to convince themselves that they are making their decisions based on logic rather than emotion. If someone has a strong (emotional) conviction that logic is good and emotion is bad, then they will be strongly motivated to try to ground all of their actions in logical reasoning. All the while being unmotivated to notice the reason why they are so invested in logical reasoning. I used to do something like this, which is how I became convinced of the inadequacy of logical reasoning for resolving conflicts such as these. I tried and failed for a rather long time before switching tactics.
The upside of this is that you don't really need to find a logical reason for acting altruistically. Yes, many people who are driven by emotion end up acting selfishly rather than altruistically. But since everyone is ultimately driven by emotions, as long as you believe that there are people who act altruistically, that implies that it's possible to act altruistically while being motivated emotionally.
What I would suggest would be to embrace everything being driven by emotion, and then trying to find a solution which satisfies all of your emotional needs. You say that studying to get a PhD in machine learning would make you feel bad, and also that not doing it is also bad. I don't think that either of these feelings is going to just go away: if you just chose to do a machine learning PhD, or just chose to not do it, then the conflict would keep bothering you regardless, and you'd feel unhappy either way you chose. I'd recommend figuring out the reasons why you would hate the machine learning path, and also the conditions under which you feel bad about not doing enough altruistic work, and then figuring out a solution which would satisfy all of your emotional needs. (CFAR's workshops teach exactly this kind of thing.)
I should also remark that I was recently in a somewhat similar situation as you: I felt that the right thing to do would be to work on AI stuff, but also that I didn't want to. Eventually I came to the conclusion that the reason why I didn't want it was that a part of my mind was convinced that the kind of AI work that I could do, wouldn't actually be as impactful as other things that I could be doing - and this judgment has mostly held up under logical analysis. This is not to say that doing the ML PhD would genuinely be a bad idea for you as well, but I do think that it would be worth examining the reasons for why exactly you wouldn't want to do studies. Maybe your emotions are actually trying to tell you something important? (In my experience, they usually are, though of course it's also possible for them to be mistaken.)
One particular question that I would ask is: you say you would enjoy working in AI, but you wouldn't enjoy learning the stuff that you need to know in order to work in AI. This might make sense in a field where you are required to study something that's entirely unrelated to what's useful for your job. But particularly once you get around to doing your graduate studies, much of that stuff will be directly relevant for your work. If you think that you would hate to be in an environment where you get to spend most of your time learning about AI, why do you think that you would enjoy a research job, which also requires you to spend a lot of time learning about AI?
My perspective here is that many forms of fairness are inconsistent, and fall apart on significant moral introspection as you try to make your moral preferences consistent. I think the skin-color thing is one of them, which is really hard to maintain as something that you shouldn't pay attention to, as you realize that it can't be causally disentangled from other factors that you feel like you definitely should pay attention to (such as the person's physical strength, or their height, or the speed at which they can run).
I think that a sensible interpretation of "is the justice system (or society in general) fair" is "does the justice system (or society) reward behaviors that are good overall, and punish behaviors that are bad overall"; in other words, can you count on society to cooperate with you rather than defect on you if you cooperate with it. If you get jailed based (in part) on your skin color, then if you have the wrong skin color (which you can't affect), there's an increased probability of society defecting on you regardless of whether you cooperate or defect. This means that you have an extra incentive to defect since you might get defected on anyway. This feels like a sensible thing to try to avoid.
On the other hand, there are also arguments for why one should work to prevent extinction even if one did have the kind of suffering-focused view that you're arguing for; see e.g. this article. To briefly summarize some of its points:
If humanity doesn't go extinct, then it will eventually colonize space; if we don't colonize space, it may eventually be colonized by an alien species with even more cruelty than us.
Whether alternative civilizations would be more or less compassionate or cooperative than humans, we can only guess. We may however assume that our reflected preferences depend on some aspects of being human, such as human culture or the biological structure of the human brain[48]. Thus, our reflected preferences likely overlap more with a (post-)human civilization than alternative civilizations. As future agents will have powerful tools to shape the world according to their preferences, we should prefer (post-)human space colonization over space colonization by an alternative civilization.
A specific extinction risk is the creation of unaligned AI, which might first destroy humanity and then go on to colonize space; if it lacked empathy, it might create a civilization where none of the agents cared about the suffering of others, causing vastly more suffering to exist.
Space colonization by an AI might include (among other things of value/disvalue to us) the creation of many digital minds for instrumental purposes. If the AI is only driven by values orthogonal to ours, it would likely not care about the welfare of those digital minds. Whether we should expect space colonization by a human-made, misaligned AI to be morally worse than space colonization by future agents with (post-)human values has been discussed extensively elsewhere. Briefly, nearly all moral views would most likely rather have human value-inspired space colonization than space colonization by AI with arbitrary values, giving extra reason to work on AI alignment especially for future pessimists.
Trying to prevent extinction also helps avoid global catastrophic risks (GCRs); GCRs could set social progress back, causing much more violence and other kinds of suffering than we have today.
Global catastrophe here refers to a scenario of hundreds of millions of human deaths and resulting societal collapse. Many potential causes of human extinction, like a large scale epidemic, nuclear war, or runaway climate change, are far more likely to lead to a global catastrophe than to complete extinction. Thus, many efforts to reduce the risk of human extinction also reduce global catastrophic risk. In the following, we argue that this effect adds substantially to the EV of efforts to reduce extinction risk, even from the very-long term perspective of this article. This doesn’t hold for efforts to reduce risks that, like risks from misaligned AGI, are more likely to lead to complete extinction than to a global catastrophe. [...]
Can we expect the “new” value system emerging after a global catastrophe to be robustly worse than our current value system? While this issue is debated[60], Nick Beckstead gives a strand of arguments suggesting the “new” values would in expectation be worse. Compared to the rest of human history, we currently seem to be on an unusually promising trajectory of social progress. What exactly would happen if this period was interrupted by a global catastrophe is a difficult question, and any answer will involve many judgement calls about the contingency and convergence of human values. However, as we hardly understand the driving factors behind the current period of social progress, we cannot be confident it would recommence if interrupted by a global catastrophe. Thus, if one sees the current trajectory as broadly positive, one should expect this value to be partially lost if a global catastrophe occurs.
Efforts to reduce extinction risk often promote coordination, peace and stability, which can be useful for reducing the kinds of atrocities that you're talking about.
Taken together, efforts to reduce extinction risk also promote a more coordinated, peaceful and stable global society. Future agents in such a society will probably make wiser and more careful decisions, reducing the risk of unexpected negative trajectory changes in general. Safe development of AI will specifically depend on these factors. Therefore, efforts to reduce extinction risk may also steer the world away from some of the worst non-extinction outcomes, which likely involve war, violence and arms races.
Do you have a short summary of why he thinks that someone answering the question "would you have preferred to die right after childbirth?" with "No" is not strong evidence that they should have been born?
I don't know what Benatar's response to this is, but - consider this comment by Eliezer in a discussion of the Repugnant Conclusion:
“Barely worth living” can mean that, if you’re already alive and don’t want to die, your life is almost but not quite horrible enough that you would rather commit suicide than endure. But if you’re told that somebody like this exists, it is sad news that you want to hear as little as possible. You may not want to kill them, but you also wouldn’t have that child if you were told that was what your child’s life would be like.
As a more extreme version, suppose that we could create arbitrary minds, and chose to create one which, for its entire existence, experienced immense suffering which it wanted to stop. Say that it experienced the equivalent of being burned with a hot iron for every second of its existence, and never got used to it. Yet suppose we designed it in such a way that it considered death even worse, so that when asked whether it wanted to die, or whether it would have preferred to die right after it was born, it would respond "no". It seems obvious to me that its outputting this response is not a compelling reason to create such a mind.
If people already exist, then there are lots of strong reasons, such as respecting people's autonomy, for why we should respect their desire to continue existing. But if we're making the decision about what kinds of minds should come into existence, those reasons don't seem particularly compelling. Especially not since we can construct situations in which we could create a mind that preferred to exist, but where it nonetheless seems immoral to create it.
You can of course reasonably argue that whether a mind should exist depends on whether it would want to exist, plus some additional criteria about e.g. how happy it would be. Then, if we really could create arbitrary minds, we might as well (and should) create ones that were happy and preferred to exist, as opposed to ones that were unhappy yet still preferred to exist. But in that case we've already abandoned the simplicity of basing our judgment on just asking whether they're happy with having survived to their current age.
I surely prefer to exist and would be pretty sad about a world in which I wasn't born (in that I would be willing to endure significant additional suffering in order to cause a world in which I was born).
This doesn't seem coherent to me; once you exist, you can certainly prefer to continue existing, but I don't think it makes sense to say "if I didn't exist, I would prefer to exist". If we've assumed that you don't exist, then how can you have preferences about existing?
If I ask myself the question, "do I prefer a world where I hadn't been born versus a world where I had been born", and imagine that my existence would actually hinge on my answer, then that means that I will in effect die if I answer "I prefer not having been born". So then the question that I'm actually answering is "would I prefer to instantly commit a painless suicide which also reverses the effects of me having come into existence". So that's smuggling in a fair amount of "do I prefer to continue existing, given that I already exist". And that seems to me unavoidable - the only way we can get a mind to tell us whether or not it prefers to exist, is by instantiating it, and then it will answer from a point of view where it actually exists.
I feel like this makes the answer to the question "if a person doesn't exist, would they prefer to exist" either "undefined" or "no" ("no" as in "they lack an active desire to exist", though of course they also lack an active desire to not-exist). Which is probably for the better, given that there exist all kinds of possible minds that would probably be immoral to instantiate, even though once instantiated they'd prefer to exist.
In the past [EAF/FRI] have been rather negative utilitarian, which I have always viewed as an absurd and potentially dangerous doctrine. If you are interested in the subject I recommend Toby Ord’s piece on the subject. However, they have produced research on why it is good to cooperate with other value systems, making me somewhat less worried.
(I work for FRI.) EAF/FRI is generally "suffering-focused", which is an umbrella term covering a range of views; NU would be the most extreme form of that, and some of us do lean that way, but many disagree with it and hold some view which most people would consider much more plausible (see the link for discussion). Personally I used to lean more NU in the past, but have since shifted considerably in the direction of other (though still suffering-focused) views.
Besides the research about the value of cooperation that you noted, this article discusses reasons why the expected value of x-risk reduction could be positive even from a suffering-focused view; the paper of mine referenced in your post also discusses why suffering-focused views should care about AI alignment and cooperate with others in order to ensure that we get aligned AI.
And in general it's just straightforwardly better and (IMO) more moral to try to create a collaborative environment where people who care about the world can work together in support of their shared points of agreement, rather than trying to undercut each other. We are also aware of the unilateralist's curse, and do our best to discourage any other suffering-focused people from doing anything stupid.
The following is roughly how I think about it:
If I am in a situation where I need help, then for purely selfish reasons I would prefer people-who-are-capable-of-helping-me to act in whatever way has the highest probability of helping me, because I obviously want my probability of getting help to be as high as possible.
Let's suppose that, as in your original example, I am one of three people who need help, and someone is deciding whether to act in a way that helps one person or in a way that helps two people. If they act in a way that helps one person, then I have a 1/3 chance of being that person; if they act in a way that helps two people, then I have a 2/3 chance of being one of those two people. So I would prefer them to act in a way that helps as many people as possible.
I would guess that most people, if they need help and are willing to accept help, would also want potential helpers to act in such a way that maximizes their probability of getting help.
Thus, to me, reason and empathy say that the best way to respect the desires of people who want help is to maximize the number of people you are helping.
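As a sanity check on the arithmetic above, here's a minimal simulation sketch (my own illustration; it assumes the helper picks whom to help uniformly at random from among those in need, and the function name and numbers are just for demonstration):

```python
# Minimal sanity check of the argument above (my own illustration, not from the
# original comment): if a helper helps n_helped out of n_needy people, chosen
# without regard to who I am, then my chance of being helped is n_helped / n_needy.
import random

def chance_of_being_helped(n_needy, n_helped, trials=100_000):
    """Estimate P(I get helped) when n_helped of n_needy people are picked at random."""
    me = 0  # I'm one particular person among the n_needy
    hits = sum(me in random.sample(range(n_needy), n_helped) for _ in range(trials))
    return hits / trials

print(chance_of_being_helped(3, 1))  # ~0.33: the policy that helps one person
print(chance_of_being_helped(3, 2))  # ~0.67: the policy that helps two people
```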
Hi Daniel,
You argue in section 3.3 of your paper that nanoprobes are likely to be the only viable route to WBE, because of the difficulty of capturing all of the relevant information in a brain if an approach such as destructive scanning is used.
You don't, however, seem to discuss the alternative path of neuroprosthesis-driven uploading:
we propose to connect to the human brain an exocortex, a prosthetic extension of the biological brain which would integrate with the mind as seamlessly as parts of the biological brain integrate with each other. [...] we make three assumptions which will be further fleshed out in the following sections:
• There seems to be a relatively unified cortical algorithm which is capable of processing different types of information. Most, if not all, of the information processing in the brain of any given individual is carried out using variations of this basic algorithm. Therefore we do not need to study hundreds of different types of cortical algorithms before we can create the first version of an exocortex.
• We already have a fairly good understanding on how the cerebral cortex processes information and gives rise to the attentional processes underlying consciousness. We have a good reason to believe that an exocortex would be compatible with the existing cortex and would integrate with the mind.
• The cortical algorithm has an inbuilt ability to transfer information between cortical areas. Connecting the brain with an exocortex would therefore allow the exocortex to gradually take over or at least become an interface for other exocortices.
In addition to allowing for mind coalescence, the exocortex could also provide a route for uploading human minds. It has been suggested that an upload can be created by copying the brain layer-by-layer [Moravec, 1988] or by cutting a brain into small slices and scanning them [Sandberg & Bostrom, 2008]. However, given our current technological status and understanding of the brain, we suggest that the exocortex might be a likely intermediate step. As an exocortex-equipped brain aged, degenerated and eventually died, an exocortex could take over its functions, until finally the original person existed purely in the exocortex and could be copied or moved to a different substrate.
This seems to avoid the objection of it being too hard to scan the brain in all detail. If we can replicate the high-level functioning of the cortical algorithm, then we can do so in a way which doesn't need to be biologically realistic, but which will still allow us to implement the brain's essential functions in a neural prosthesis (here's some prior work that also replicates some aspect of the brain's functioning and re-implements it in a neuroprosthesis, without needing to capture all of the biological details). And if the cortical algorithm can be replicated in a way that allows the person's brain to gradually transfer over functions and memories as the biological brain accumulates damage, the same way that function in the biological brain gets reorganized and can remain intact even as it slowly accumulates massive damage, then that should allow the entirety of the person's cortical function to transfer over to the neuroprosthesis. (Of course, there are still the non-cortical parts of the brain that would need to be uploaded as well.)
A large challenge here is getting the required number of neural connections between the exocortex and the biological brain; but we are already getting relatively close, considering that the corpus callosum connecting the two hemispheres "only" has on the order of 100 million connections:
Earlier this year, the US Defense Advanced Research Projects Agency (DARPA) launched a project called Neural Engineering System Design. It aims to win approval from the US Food and Drug Administration within 4 years for a wireless human brain device that can monitor brain activity using 1 million electrodes simultaneously and selectively stimulate up to 100,000 neurons. (source)
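For a rough sense of scale, here's the back-of-the-envelope arithmetic behind "relatively close" (my own illustration; electrode counts and axon counts aren't directly interchangeable measures of bandwidth):

```python
# Orders-of-magnitude comparison only; the figures below come from the text above,
# and the comparison itself is my own rough illustration.
corpus_callosum_connections = 100_000_000  # ~1e8 connections between the hemispheres
nesd_recording_electrodes = 1_000_000      # DARPA NESD target: simultaneous monitoring
nesd_stimulation_neurons = 100_000         # DARPA NESD target: selective stimulation

print(corpus_callosum_connections / nesd_recording_electrodes)  # 100.0 -> ~2 orders of magnitude short
print(corpus_callosum_connections / nesd_stimulation_neurons)   # 1000.0 -> ~3 orders of magnitude short
```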
Also, one forthcoming paper of mine has been released as a preprint, and another paper that was originally published informally last year appeared in somewhat revised and peer-reviewed form this year:
- Sotala, Kaj (2018). Disjunctive Scenarios of Catastrophic AI Risk. AI Safety and Security (Roman Yampolskiy, ed.), CRC Press. Forthcoming.
- Sotala, Kaj (2017). How Feasible is the Rapid Development of Artificial Superintelligence? Physica Scripta 92 (11), 113001.
Both were done as part of my research for the Foundational Research Institute; maybe include us in your organizational comparison next year? :)
There seem to be a lot of leads that could help us figure out the high-value interventions, though:
i) knowledge about what causes it and what has contributed to changes in it over time;
ii) research directions that could help further improve our understanding of what does and doesn't cause it;
iii) various interventions which already seem to work in small-scale settings, though it's still unclear how they might be scaled up (e.g. something like Crucial Conversations is basically about increasing trust and safety in one-to-one and small-group conversations);
iv) and of course psychology in general is full of interesting ideas for improving mental health and well-being that haven't been rigorously tested, which also suggests that
v) any meta-work that would improve psychology's research practices would also be even more valuable than we previously thought.
As for the "pointing out a problem people have been aware of for millennia" part, well, people have been aware of global poverty for millennia too. Then we got science and randomized controlled trials and all the stuff that EAs like, and got better at fixing the problem. It's time to start looking at how we could apply our improved understanding of this old problem to fixing it.
Thanks for the reference! That sounds valuable.
I think whether suffering is a 'natural kind' is prior to this analysis: e.g., to precisely/objectively explain the functional role and source of something, it needs to have a precise/crisp/objective existence.
I take this as meaning that you agree that accepting functionalism is orthogonal to the question of whether suffering is "real" or not?
If it is a placeholder, then I think the question becomes, "what would 'something better' look like, and what would count as evidence that something is better?"
What something better would look like - if I knew that, I'd be busy writing a paper about it. :-) That seems to be a part of the problem - everyone (that I know of) agrees that functionalism is deeply unsatisfactory, but very few people seem to have any clue of what a better theory might look like. Off the top of my head, I'd like such a theory to at least be able to offer some insight into what exactly is conscious, and not have the issue where you can hypothesize all kinds of weird computations (like Aaronson did in your quote) and be left confused about which of them are conscious and which are not, and why. (roughly, my desiderata are similar to Luke Muehlhauser's)
That's super-neat! Thanks.
Wait, are you equating "functionalism" with "doesn't believe suffering can be meaningfully defined"? I thought your criticism was mostly about the latter; I don't think it's automatically implied by the former. If you had a precise enough theory about the functional role and source of suffering, then this would be a functionalist theory that specified objective criteria for the presence of suffering.
(You could reasonably argue that it doesn't look likely that functionalism will provide such a theory, but then I've always assumed that anyone who has thought seriously about philosophy of mind has acknowledged that functionalism has major deficiencies and is at best our "least wrong" placeholder theory until somebody comes up with something better.)