Comment by lukas_gloor on What we talk about when we talk about life satisfaction · 2019-02-05T20:24:26.868Z · score: 10 (4 votes) · EA · GW

I feel like the issues with "How satisfied are you with your life, on a scale of 0 to 10?" run even a bit deeper than indicated. The series "The Americans" is about a married couple of Soviet spies living a painful and difficult life in America during the Cold War. The wife still believes in the mission, her husband not so much (but continues doing the job for the sake of his wife). Let's say the wife rates her life satisfaction 9/10 because she convinced herself that she's bravely doing highly important work. The husband rates his life satisfaction 1/10. They'd both score about the same in terms of more objective metrics like socioeconomic status, how much time they spend on hobbies or with their kids, how much stress they have, etc. But they interpret things differently because the wife ascribes meaning to her hardships and is proud of her accomplishments, while the husband feels trapped and like he wasted his life and endangered his children for no good reason.

Comment by lukas_gloor on Why I expect successful (narrow) alignment · 2018-12-30T04:30:33.905Z · score: 6 (6 votes) · EA · GW

"Between 1 and 10%" also feels surprisingly low to me for general AI-related catastrophes. I at least would have thought that experts are less optimistic than that.

But pending clarification, I wouldn't put much weight on this estimate, given that the interviews mentioned in the 80k problem area profile you link to seem to have been about informing the entire problem profile rather than this estimate specifically. So it's not clear, e.g., whether the interviews included a question about the all-things-considered risk of AI-related catastrophe that was put to Nick Bostrom, an anonymous leading professor of computer science, Jaan Tallinn, Jan Leike, Miles Brundage, Nate Soares, and Daniel Dewey.

Comment by lukas_gloor on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-12-25T23:24:45.258Z · score: 6 (5 votes) · EA · GW

In addition to the counterpoints mentioned by Gregory Lewis, I think there is another reason why MCE seems less effective than more targeted interventions to improve the quality of the long-term future: Gains from trade between humans with different values become easier to implement as the reach of technology increases. As long as a non-trivial fraction of humans end up caring about animal wellbeing or digital minds, it seems likely that it would be cheap for other coalitions to offer trades. So whether 10% of future people end up with an expanded moral circle or 100% may not make much of a difference to the outcome: It will be reasonably good either way if people reap the gains from trade.

One might object that it is unlikely that humans would be able to cooperate efficiently, given that we don't see this type of cooperation happening today. However, I think it's reasonable to assume that staying in control of technological progress beyond the AGI transition requires a degree of wisdom and foresight that is very far away from where most societal groups are at today. And if humans do stay in control, then finding a good solution for value disagreements may be the easier problem, or at worst similarly hard. So it feels to me that, most likely, either we get a future that goes badly for reasons related to lack of coordination and sophistication in the pre-AGI stage, or we get a future where humans set things up wisely enough to actually design an outcome that is nice (or at least not amongst the 10% of worst outcomes) by the lights of nearly everyone.

Brian Tomasik made the point that conditional on human values staying in control, we may be very unlikely to get something like broad moral reflection. Instead, values could be determined by a very small group of individuals who happened to be in power by the time AGI arrives (as opposed to individuals ending up there because they were unusually foresighted and also morally motivated). This feels possible too, but it does not seem like the likely default to me, because I suspect that you'd necessarily need to increase your philosophical sophistication in order to stay in control of AGI, and that probably gives you more pleasant outcomes (a correlational claim). Iterated amplification, for instance, as an approach to AI alignment, involves humans in several roles: Humans are not only where the resulting values come from, but they're also in charge of keeping the bootstrapping process on track and corrigible. And as this post on factored cognition illustrates, this requires sophistication to set up. So if that's the bar that AGI creators need to pass before they can determine how "human values" are to be extrapolated, maybe we shouldn't be too pessimistic about the outcome. It seems kind of unlikely that someone would go through all of that only to be like "I'm going to implement my personal best guess about what matters to me, with little further reflection, and no other humans get a say here." Similarly, it also feels unlikely that people would go through with all that and not find a way to make parts of the population reasonably content about how sentient subroutines are going to be used.

Now, I feel a bit confused about the feasibility of AI alignment if you were to do it somewhat sloppily and with lower standards. I think that there's a spectrum from "it just wouldn't work at all and not be competitive" (and then people would have to try some other approach) to "it would produce a capable AGI, but one vulnerable to failure modes like adversarial exploits or optimization daemons, so it would end up with something other than human values". These failure modes, to the very small degree I currently understand them, sound like they would not be sensitive to whether the human whose approval you tried to approximate had an expanded moral circle or not. I might be wrong about that. If people mostly want sophisticated alignment procedures because they care about preserving the option for philosophical reflection, rather than because they also think that you simply run into large failure modes otherwise, then it seems like (conditional on some kind of value alignment) whether we get an outcome with broad moral reflection is not so clear. If it's technically easier to build value-aligned AI with very parochial values, then MCE could make a relevant difference to these non-reflection outcomes.

But all in all, my argument is that it's somewhat strange to assume that a group of people could succeed at building an AGI optimized for its creators' values without having to put in so much thinking about how to get this outcome right that they almost can't help but become reasonably philosophically sophisticated in the process. And sure, philosophically sophisticated people can still have fairly strange values by your own lights, but it seems like there's more convergence. Plus, I'd at least be optimistic about their propensity to strive towards positive-sum outcomes, given how little scarcity you'd have if the transition does go well.

Of course, maybe value-alignment is going to work very differently from what people currently think. The main way I'd criticize my above points is that they're based on heavy-handed inside-view thinking about how difficult I (and others I'm updating towards) expect the AGI transition to be. If AGI will be more like the Industrial Revolution rather than something that is even more difficult to stay remotely in control of, or if some other technology proves to be more consequential than AGI, then my argument has less force. I mainly see this as yet another reason to caveat that the ex ante plausible-seeming position that MCE can have a strong impact on AGI outcomes starts to feel more and more conjunctive the more you zoom in and try to identify concrete pathways.

Comment by lukas_gloor on Should donor lottery winners write reports? · 2018-12-23T00:41:20.838Z · score: 5 (4 votes) · EA · GW

Other reasons why someone competent at picking grants may not feel comfortable with the thought of having to write a report might be that writing specifically isn't their strength, or that exposing their thinking to public scrutiny is anxiety-inducing.

Comment by lukas_gloor on Why I'm focusing on invertebrate sentience · 2018-12-12T13:41:42.751Z · score: 5 (4 votes) · EA · GW

There's some related discussion here.

Lock-in can also apply to "value-precursors" that determine how one goes about moral reflection, or which types of appeals one ends up finding convincing. I think these would get locked in to some degree (because something has to be fixed for it to be meaningful to talk about goalposts at all), and by affecting the precursors, moral or meta-philosophical reflection before aligned AGI can plausibly affect the outcomes post-AGI. It's not very clear, however, whether that's important, and from whose perspective it is important, because some of the things that masquerade as moral uncertainty might be humans having underdetermined values.

Comment by lukas_gloor on Launching the EAF Fund · 2018-11-29T00:18:47.355Z · score: 7 (5 votes) · EA · GW

Yeah. I put it the following way in another post:

Especially when it comes to the prevention of s-risks affecting futures that otherwise contain a lot of happiness, it matters a great deal how the risk in question is being prevented. For instance, if we envision a future that is utopian in many respects except for a small portion of the population suffering because of problem x, it is in the interest of virtually all value systems to solve problem x in highly targeted ways that move probability mass towards even better futures. By contrast, only few value systems (ones that are strongly or exclusively about reducing suffering/bad things) would consider it overall good if problem x was “solved” in a way that not only prevented the suffering due to problem x, but also prevented all the happiness from the future scenario this suffering was embedded in.

So it'd be totally fine to address all sources of unnecessary suffering (and even "small" s-risks embedded in an otherwise positive future) if there are targeted ways to bring about uncontroversial improvements. :) In practice, it's sometimes hard to find interventions that are targeted enough because affecting the future is very very difficult and we only have crude levers. Having said that, I think many things that we're going to support with the fund are actually quite positive for positive-future-oriented value systems as well. So there certainly are some more targeted levers.

There are instances where it does feel justified to me to also move some probability mass away from s-risks towards extinction (or paperclip scenarios), but that should be reserved either for uncontroversially terrible futures, or for those futures where most of the disvalue for downside-focused value systems comes from. I doubt that this includes futures where 10^10x more people are happy than unhappy.

And of course positive-future-oriented EAs face analogous tradeoffs of cooperation with other value systems.

Comment by lukas_gloor on Launching the EAF Fund · 2018-11-28T23:13:14.529Z · score: 15 (8 votes) · EA · GW

At least for me, this would be a pretty amazing outcome, and not something which should be prevented.

Yeah, we're going to change the part that equates "worst case" with "s-risks". Your view is common and reflects many ethical perspectives.

We were already thinking about changing the definition of "s-risk" based on similar feedback, to make it more intuitive and cooperative in the way you describe. It probably makes more sense to have it refer to only the few % of scenarios where most of the future's expected suffering comes from (assuming s-risks are indeed heavy-tailed). These actual worst cases are what we want to focus on with the fund.

do I interpret it correctly that in the ethical system held by the fund human extinction is comparatively benign outcome in comparison with risks like creation of 10^25 unhappy minds even if they are offset by much larger [say 10^10x larger] number of happy minds?

No, that's incorrect. Insofar as some fund managers hold this view personally (e.g., I do, while Jonas would agree with you that the latter outcome is vastly better), it won't affect decisions because in any case, we want to avoid doing things that are weakly positive on some plausible moral views and very negative on others. But I can see why you were concerned, and thanks for raising this issue!

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-06-05T18:27:00.504Z · score: 1 (1 votes) · EA · GW

To turn the tables a bit here, I would say that to reject moral realism, on my account, one would need to say that there is no genuine normative force or property in, say, a state of extreme suffering (consider being fried in a brazen bull for concreteness).

Cool! I think the closest I'll come to discussing this view is in footnote 18. I plan to have a post on moral realism via introspection about the intrinsic goodness (or badness) of certain conscious states.

I agree with reductionism about personal identity, and I also find this to be one of the most persuasive arguments in favor of altruistic life goals. I would not call myself an open individualist, though, because I'm not sure what the position is exactly saying. For instance, I don't understand how it differs from empty individualism. I'd understand if these are different framings or different metaphors, but if we assume that we're talking about positions that can be true or false, I don't understand what we're arguing about when asking whether open individualism is true, or when discussing open vs. empty individualism.
Also, I think it's perfectly coherent to have egoistic goals even under a reductionist view of personal identity. (It just turns out that egoism is not a well-defined concept either, and one has to make some judgment calls if one ever expects to encounter edge-cases for which our intuitions give no obvious answers about whether something is still "me.")

With respect to the One Compelling Axiology you mention, Lukas, I am not sure why you would set the bar so high in terms of specificity in order to accept a realist view. I mean, if “all philosophers or philosophically-inclined reasoners” found plausible a simple, yet inexhaustive principle like “reduce unnecessary suffering” why would that not be good enough to demonstrate its "realism" (on your account) when a more specific one would? It is unclear to me why greater specificity should be important, especially since even such an unspecific principle still would have plenty of practical relevance (many people can admit that they are not living in accordance with this principle, even as they do accept it).

Yeah, fair point. I mean, even Railton's own view has plenty of practical relevance in the sense that it highlights that certain societal arrangements lead to more overall well-being or life satisfaction than others. (That's also a point that Sam Harris makes.) But if that's all we mean by "moral realism" then it would be rather trivial. Maybe my criteria are a bit too strict, and I would indeed already regard it as extremely surprising if you get something like One Compelling Axiology that agrees on population ethics while leaving a few other things underdetermined.

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-31T09:17:49.058Z · score: 1 (1 votes) · EA · GW

Inspired by another message of yours, there's at least one important link here that I failed to mention: If moral discourse is about a, b, and c, and philosophers then say they want to make it about q and argue for realism about q, we can object that whatever they may have shown us regarding realism about q, it's certainly not moral realism. And it looks like the Loeb paper also argues that if moral discourse is about mutually incompatible things, that looks quite bad for moral realism? Those are good points!

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-31T08:53:53.532Z · score: 1 (1 votes) · EA · GW

Do you think your argument also works against Railton's moral naturalism, or does my One Compelling Axiology (OCA) proposal introduce something that breaks the idea? The way I meant it, OCA is just a more extreme version of Railton's view.

I think I can see what you're pointing to though. I wrote:

Note that this proposal makes no claims about the linguistic level: I’m not saying that ordinary moral discourse lets us define morality as convergence in people’s moral views after philosophical reflection under ideal conditions. (This would be a circular definition.) Instead, I am focusing on the aspect that such convergence would be practically relevant: [...]

So yes, this would be a bad proposal for what moral discourse is about. But it's meant like this: Railton claims that morality is about doing things that are "good for others from an impartial perspective." I like this and wanted to work with that, so I adopt this assumption, and further add that I only want to call a view moral realism if "doing what is good for others from an impartial perspective" is well-specified. Then I give some account of what it would mean for it to be well-specified.

In my proposal, moral facts are not defined as that which people arrive at after reflection. Moral facts are still defined as the same thing Railton means. I'm just adding that maybe there are no moral facts in the way Railton means if we introduce the additional requirement that (strong) underdetermination is not allowed.

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-25T16:18:40.566Z · score: 1 (1 votes) · EA · GW

Probably intuitions about this issue depend on which type of moral or religious discourse one is used to. As someone who spent a year at a Christian private school in Texas where creationism was taught in Biology class and God and Jesus were a very tangible part of at least some people's lives, I definitely got a strong sense that the metaphysical questions are extremely important.

By contrast, if the only type of religious claims I'd ever come into contact with had been moderate (picture the average level of religiosity of a person in, say, Zurich), then one may even consider it a bit of a strawman to assume that religious claims are to be taken literally.

I think this concern is somewhat relevant to the broader discussion, too, because you seem to imply that we can't (or even shouldn't?) make any advances on non-metaphysical claims before we've figured out the metaphysical ones.

Just to be clear, all I'm saying is that I think it's going to be less useful to discuss "what are moral claims usually about." What we should do instead is what Chalmers describes (see the quote in footnote 4). Discussing what moral claims are usually about is not the same as making up one's mind about normative ethics. I think it's very useful to discuss normative ethics, and I'd even say that discussing whether anti-realism or realism is true might be slightly less important than making up one's mind about normative ethics. Sure, it informs to some extent how to reason about morality, but as has been pointed out, you can make some progress on moral questions even from a stance of agnosticism about realism vs. anti-realism.

To go back to the religion analogy, what I'm recommending is to first figure out whether you believe in a God or an afterlife that would relevantly influence your priorities now, and not worry much about whether religious claims are "usually" or "best" to be taken literally or taken metaphorically(?).

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-23T13:13:58.022Z · score: 3 (3 votes) · EA · GW

Why should we care whether or not moral realism is true?

I plan to address this more in a future post, but the short answer is that for some ways in which moral realism has been defined, it really doesn't matter (much). But there are some versions of moral realism that would "change the game" for those people who currently reject them. And vice versa, if one currently endorses a view that corresponds to one of the two versions of "strong moral realism" described in the last section of my post, one's priorities could change noticeably if one changes one's mind towards anti-realism.

What do you think are the implications of moral anti-realism for choosing altruistic activities?

It's hard to summarize this succinctly because for most of the things that are straightforwardly important under moral realism (such as moral uncertainty or deferring judgment to future people who are more knowledgeable about morality), you can also make good arguments in favor of them going from anti-realist premises. Some quick thoughts:

– The main difference is that things become more "messy" with anti-realism.

– I think anti-realists should, all else equal, be more reluctant to engage in "bullet biting" where you abandon some of your moral intuitions in favor of making your moral view "simpler" or "more elegant." The simplicity/elegance appeal is that if you have a view with many parameters that are fine-tuned for your personal intuitions, it seems extremely unlikely that other people would come up with the same parameters if they only thought about morality more. Moral realists may think that the correct answer to morality is one that everyone who is knowledgeable enough would endorse, whereas anti-realists may consider this a potentially impossible demand and therefore place more weight on finding something that feels very intuitively compelling on the individual level. Having said that, I think there are a number of arguments why even an anti-realist might want to adopt moral views that are "simple and elegant." For instance, people may care about doing something meaningful that is "greater than their own petty little intuitions" – I think this is an intuition that we can try to cash out somehow even if moral realism turns out to be false (it's just that it can be cashed out in different ways).

– "Moral uncertainty" works differently under anti-realism, because you have to say what you are uncertain about (it cannot be the one true morality because anti-realism says there is no such thing). One can be uncertain about what one would value after moral reflection under ideal conditions. This kind of "valuing moral reflection" seems like a very useful anti-realist alternative to moral uncertainty. The difference is that "valuing reflection" may be underdefined, so anti-realists have to think about how to distinguish having underdefined values from being uncertain about their values. This part can get tricky.

– There was recently a discussion about "goal drift" in the EA forum. I think it's a bigger problem with anti-realism all else equal (unless one's anti-realist moral view is egoism-related.) But again, there are considerations that go into both directions. :)

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-22T22:49:57.809Z · score: 1 (1 votes) · EA · GW

The idea is that morality is the set of rules that impartial, rational people would advocate as a public system.

Yes, this sounds like constructivism. I think this is definitely a useful framework for thinking about some moral/morality-related questions. I don't think all of moral discourse is best construed as being about this type of hypothetical rule-making, but like I say in the post, I don't think interpreting moral discourse should be the primary focus.

Rationality is understood, roughly speaking, as the set of things that virtually all rational agents would be averse to. This ends up being a list of basic harms--things like pain, death, disability, injury, loss of freedom, loss of pleasure.

Hm, this sounds like you're talking about a substantive concept of rationality, as opposed to a merely "procedural" or "instrumental" concept of rationality (as is common on LessWrong and with anti-realist philosophers like Bernard Williams). Substantive concepts of rationality always fall under moral non-naturalism, I think.

My post is a little confusing with respect to the distinction here, because you can be a constructivist in two different ways: Primarily as an intersubjectivist metaethical position, and "secondarily" as a form of non-naturalism. (See my comments on Thomas Sittler's chart.)

People can be incorrect about whether a thing is harmful, just as they can be incorrect about whether a flower is red. But there's nothing much more objective or "facty" about whether the plant is red than that ordinary human language users on earth are disposed to see and label it as red.

Yeah, it should be noted that "strong" versions of moral realism are not committed to silly views such as morality existing in some kind of supernatural realm. I often find it difficult to explain moral non-naturalism in a way that makes it sound as non-weird as when actual moral non-naturalists write about it, so I have to be careful to not strawman these positions. But what you describe may still qualify as "strong" because you're talking about rationality as a substantive concept. (Classifying something as a "harm" is one thing if done in a descriptive sense, but probably you're talking about classifying things as a harm in a sense that has moral connotations – and that gets into more controversial territory.)

The book title "normative bedrock" also sounds relevant because my next post will talk about "bedrock concepts" (Chalmers) at length, and specifically about "irreducible normativity" as a bedrock concept, which I think makes up the core of moral non-naturalism.

Comment by lukas_gloor on 1. What Is Moral Realism? · 2018-05-22T22:10:31.972Z · score: 4 (4 votes) · EA · GW

Cool! I think this is helpful both in itself as well as here as a complement to my post. I also thought about making a chart but was too lazy in the end. If I may, I'll add some comments about how this chart relates to the distinctions I made/kept/found:

"Judgment-dependent cognitivism" corresponds to what I labelled subjectivism and intersubjectivism, and "judgment-_in_dependent cognitivism" corresponds to "objectivism." (Terminology adopted from the Sayre-McCord essay; but see the Scanlon essay for the other terminology.)

I'm guessing "Kantian rationalism" refers to views such as Scanlon's view. I'm didn't go into detail in my post with explaining the difference between constructivism as an intersubjectivist position and constructivism as a version of non-naturalism. I tried to say something about that in footnote 7 but I fear it'll only become more clear in my next post. Tl;dr is that non-naturalists think that we can have externalist reasons for doing something, reasons we cannot "shake off" by lacking internal buy-in or internal motivation. By contrast, someone who merely endorses constructivism as an intersubjectivist (or "judgment-dependent") view, such as Korsgaard for instance, would reject these externalist reasons.

I agree with the way you draw the lines between the realist and the anti-realist camp. The only thing I don't like about this (and this is a criticism not about your chart, but about the way philosophers have drawn these categories in the first place) is that it makes it seem as though we have to choose exactly one view. But by removing the entire discussion from the "linguistic level" (taking a stance on how we interpret moral discourse), we can acknowledge e.g. that subjectivism or intersubjectivism represent useful frameworks for thinking about morality-related questions, whether moral discourse is completely subjectivist or intersubjectivist in nature or not. And even if moral discourse was all subjectivist (which seems clearly wrong to me but let's say it's a hypothetical), for all we'd know that could still allow for the possibility that an objectivist moral reality exists in a meaningful and possibly action-relevant sense. I like Luke's framing of "pluralistic moral reductionism" because it makes clear that there is more than one option.

1. What Is Moral Realism?

2018-05-22T15:49:52.516Z · score: 19 (18 votes)
Comment by lukas_gloor on “EA” doesn’t have a talent gap. Different causes have different gaps. · 2018-05-21T11:03:54.279Z · score: 2 (2 votes) · EA · GW

Talent gap - Middle (~50 people)

If the AI safety/alignment community is altogether around 50 people, that's a large relative gap. Depending on how you count it might be bigger than 50 people, but the talent gap seems large in relative terms in either case. :)

Comment by lukas_gloor on Empirical data on value drift · 2018-04-24T12:04:10.282Z · score: 3 (3 votes) · EA · GW

It is much less clear whether this person would think they've made a mistake in allocating more of themselves away from EA, either at t2-now (they don't regret they now have a family which takes their attention away from EA things), or at t1-past (if their previous EA-self could forecast them being in this situation, they would not be disappointed in themselves). If so, these would not be options that their t1-self should be trying to shut off, as (all things considered) the option might be on balance good. I am sure there are cases where 'life gets in the way' in a manner it is reasonable to regret. But I would be chary [...]

You discuss a case where there is regret from the perspective of both t1 and t2, and a case where there is regret from neither perspective. These are both plausible accounts. But there's also a third option that I think happens a lot in practice: Regret at t1 about the projected future in question, and less/no regret at t2. So the t2-self may talk about "having become more wise" or "having learned something about myself," while the t1-self would not be on board with this description and consider the future in question to be an unfortunate turn of events. (Or the t2-self could even acknowledge that some decisions in the sequence were not rational, but that from their current perspective, they really like the way things are.)

The distinction between moral insight and failure of goal preservation is fuzzy. Taking precautions against goal drift is a form of fanaticism and commonsense heuristics speak against that. OTOH, not taking precautions seems like not taking the things you currently care about seriously (at least insofar as there are things you care about that go beyond aspects related to your personal development).

Unfortunately, I don't think there is a safe default. Not taking precautions is tantamount to making the decision to be okay with potential value drift. And we cannot just say we are uncertain about our values, because that could result in mistaking uncertainty for underdetermination. There are meaningful ways of valuing further reflection about one's own values, but those types of "indirect" values, where one values further reflection, can also suffer from (more subtle) forms of goal drift.

Comment by lukas_gloor on Cause prioritization for downside-focused value systems · 2018-03-16T18:45:58.101Z · score: 1 (1 votes) · EA · GW

There isn't much more except that I got the impression that people in EA who have thought about this a lot think recovery is very likely, and I'm mostly deferring to them. The section about extinction risk is the part of my post where I feel the least knowledgeable. As for additional object-level arguments, I initially wasn't aware of points such as crops and animals already being cultivated/domesticated, metals already being mined, and there being alternatives to rapid growth induced by fossil fuels, one of which is slow but steady growth over longer time periods. The way cultural evolution works is that slight improvements from innovations (which are allowed to be disjunctive rather than having to rely on developing a very specific technology) spread everywhere, which makes me think that large populations + a lot of time should eventually get far enough. Note also that if all-out extinction is simply very unlikely to ever happen, then you have several attempts left to reach technological maturity again.

Comment by lukas_gloor on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-26T07:23:25.883Z · score: 4 (4 votes) · EA · GW

I think that there's an inevitable tradeoff between wanting a reflection process to have certain properties and worries about this violating goal preservation for at least some people. This blogpost is not about MCE directly, but if you think of "BAAN thought experiment" as "we do moral reflection and the outcome is such a wide circle that most people think it is extremely counterintuitive" then the reasoning in large parts of the blogpost should apply perfectly to the discussion here.

That is not to say that trying to fine tune reflection processes is pointless: I think it's very important to think about what our desiderata should be for a CEV-like reflection process. I'm just saying that there will be tradeoffs between certain commonly mentioned desiderata that people don't realize are there because they think there is such a thing as "genuinely free and open-ended deliberation."

Comment by lukas_gloor on Announcing Effective Altruism Community Building Grants · 2018-02-25T12:55:52.574Z · score: 3 (3 votes) · EA · GW

EA London estimated with it's first year of a paid staff it had about 50% of the impact of a more established EA organisation such as GWWC or 80K per £ invested.

Are they mostly counting impact on Givewell-recommended charities? I'd imagine that for donors who are mostly interested in the long-term cause area, there'd be a perceived large difference between GWWC and 80k, which is why this sounds like a weird reference class to me. (Though maybe the difference is not huge because GWWC has become more cause neutral over the years?)

Comment by lukas_gloor on Cause prioritization for downside-focused value systems · 2018-02-02T15:43:57.257Z · score: 8 (8 votes) · EA · GW

If I understand it, the project is something like "how do your priorities differ if you focus on reducing bad things over promoting good things?"

This sounds accurate, but I was thinking of it with empirical cause prioritization already factored in. For instance, while a view like classical utilitarianism can be called "symmetrical" when it comes to normatively prioritizing good things and bad things (always with some element of arbitrariness because there are no "proper units" of happiness and suffering), in practice the view turns out to be upside-focused because, given our empirical situation, there is more room for creating happiness/good things than there is future expected suffering left to prevent. (Cf. the astronomical waste argument.)

This would go the other way if we had good reason to believe that the future will be very bad, but I think the classical utilitarians who are optimistic about the future (given their values) are right to be optimistic: If you count the creation of extreme happiness as not-a-lot-less important than the prevention of extreme suffering, then the future will in expectation be very valuable according to your values (see footnote [3]).

but I don't see how you can on to draw anything conclusions about that because downside (as well as upside) morality covers so many different things.

My thinking is that when it comes to interventions that affect the long-term future, different normative views tend to converge roughly into two large clusters for the object-level interventions they recommend. If the future will be good for your value system, reducing extinction risks and existential risk related to "not realizing full potential" will be most important. If your value system makes it harder to attain vast amounts of positive value through bringing about large (in terms of time and/or space) utopian futures, then you want to focus specifically on (cooperative ways of) reducing suffering risks or downside risks generally. The cut-off point is determined by what the epistemically proper degree of optimism or pessimism is with regard to the quality of the long-term future, and to what extent we can have an impact on that. Meaning, if we had reason to believe that the future will be very negative and that efforts to make the future contain vast amounts of happiness are very very very unlikely to ever work, then even classical utilitarianism would count as "downside-focused" according to my classification.

Some normative views simply don't place much importance on creating new happy people, in which case they kind of come out as downside-focused by default (except for the consideration I mention in footnote 2). (If these views give a lot of weight to currently existing people, then they can be both downside-focused and give high priority to averting extinction risks, which is something I pointed out in the third-last paragraph in the section on extinction risks.)

Out of the five examples you mentioned, I'd say they fall into the two clusters as follows: Downside-focused: absolute NU, lexical NU, lexical threshold NU and a "negative-leaning" utilitarianism that is sufficiently negative-leaning to counteract our empirical assessment of how much easier it will be to create happiness than to prevent suffering. The rest is upside-focused (maybe with some stuck at "could go either way"). How much is "sufficiently negative-leaning"? It becomes tricky because there are not really any "proper units" of happiness and suffering, so we have to first specify what we are comparing. See footnote 3: My own view is that the cut-off is maybe very roughly at around 100, but I mentioned "100 or maybe 1,000" to be on the conservative side. And these refer to comparing extreme happiness to extreme suffering. Needless to say, it is hard to predict the future and we should take such numbers with a lot of caution, and it seems legitimate for people to disagree. Though I should qualify that a bit: Say, if someone thinks that classical utilitarians should not work on extinction risk reduction because the future is too negative, or if someone thinks even strongly negative-leaning consequentialists should have the same ranking of priorities as classical utilitarians because the future is so very positive, then both of these have to explain away strong expert disagreement (at least within EA; I think outside of EA, people's predictions are all over the place, with economists generally being more optimistic).
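
To make the cut-off idea concrete, here is a minimal toy sketch of how the classification flips depending on the weight a view gives to extreme suffering relative to extreme happiness. The function name and all numbers below (including the assumed 1,000:1 happiness-to-suffering ratio for the expected future) are illustrative assumptions, not estimates from this discussion:

```python
# Toy sketch (illustrative assumptions only): a value system comes out
# "upside-focused" if the future's expected happiness outweighs its expected
# suffering once suffering is weighted by the view's exchange rate, and
# "downside-focused" otherwise.

def classify(expected_happiness, expected_suffering, suffering_weight):
    """Classify a view that weights one unit of extreme suffering
    `suffering_weight` times as heavily as one unit of extreme happiness."""
    net_value = expected_happiness - suffering_weight * expected_suffering
    return "upside-focused" if net_value > 0 else "downside-focused"

# Hypothetical future: 1,000 units of extreme happiness for every unit of
# extreme suffering that could still be prevented.
print(classify(1000, 1, 1))       # weight 1 (classical utilitarianism): upside-focused
print(classify(1000, 1, 100))     # weight 100 (negative-leaning): still upside-focused
print(classify(1000, 1, 10_000))  # weight 10,000 (strongly negative-leaning): downside-focused
```

With these assumed numbers, the classification flips exactly at a weight equal to the assumed happiness-to-suffering ratio (here 1,000), which mirrors the rough "100 or maybe 1,000" cut-off mentioned above.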

Lastly, I don't think proponents of any value system should start to sabotage other people's efforts, especially not since there are other ways to create value according to one's own value system that are altogether much more positive-sum. Note that this – the dangers of naive/Machiavellian consequentialism – is a very general problem that reaches far deeper than just value differences. Say you have two EAs who both think creating happiness is 1/10th as important as reducing suffering. One is optimistic about the future, the other has become more pessimistic after reading about some new arguments. They try to talk out the disagreement, but do not reach agreement. Should the second EA now start to sabotage the efforts of the first one, or vice versa? That seems ill-advised; no good can come from going down that path.

Cause prioritization for downside-focused value systems

2018-01-31T14:47:11.961Z · score: 29 (32 votes)
Comment by lukas_gloor on Multiverse-wide cooperation in a nutshell · 2017-11-04T17:11:43.021Z · score: 4 (4 votes) · EA · GW

For every conceivable value system, there is an exactly opposing value system, so that there is no room for gains from trade between the systems (e.g. suffering maximizers vs suffering minimizers).

There is an intuition that "disorderly" worlds with improbable histories must somehow "matter less," but it's very hard to cash out what this could mean. See this post or this proposal. I'm not sure these issues are solved yet (probably not). (I'm assuming that suffering maximizers or other really weird value systems would only evolve, or be generated when lightning hits someone's brain or whatever, in very improbable instances.)

Sidenote: If you assume decision algorithm and values to be orthogonal, why do you suggest to "adjust [the values to cooperate with] by the degree their proponents are receptive to MSR ideas"?

Good point; this shows that I'm skeptical about a strong version of independence where values and decision algorithms are completely uncorrelated. E.g., I find it less likely that deep ecologists would change their actions based on MSR than people with more EA(-typical) value systems. It is open to discussion whether (or how strongly) this has to be corrected for historical path dependencies and founder effects: If Eliezer had not been really into acausal decision theory, perhaps the EA movement would think somewhat differently about the topic. If we could replay history many times over, how often would EA be more or less sympathetic to superrationality than it is currently?

Multiverse-wide cooperation in a nutshell

2017-11-02T10:17:14.386Z · score: 15 (17 votes)
Comment by lukas_gloor on Cognitive Science/Psychology As a Neglected Approach to AI Safety · 2017-08-31T14:04:10.127Z · score: 0 (0 votes) · EA · GW

I totally sympathize with your sentiment and feel the same way about incorporating other people's values in a superintelligent AI. If I just went with my own wish list for what the future should look like, I would not care about most other people's wishes. I feel as though many other people are not even trying to be altruistic in the relevant sense in which I want to be altruistic, and I don't experience a lot of moral motivation to help accomplish people's weird notions of altruistic goals, let alone any goals that are clearly non-altruistically motivated. In the same way (admittedly even less so), I'd feel no strong motivation to help make the dreams of baby-eating aliens come true.

Having said that, I am confident that it would screw things up for everyone if I followed a decision policy that does not give weight to other people's strongly held moral beliefs. It is already hard enough to not mess up AI alignment in a way that makes things worse for everyone, and it would become much harder still if we had half a dozen or more competing teams who each wanted to get their idiosyncratic view of the future installed.

BTW, note that value differences are not the only thing that can get you into trouble. If you hold an important empirical belief that others do not share, and you cannot convince them of it, then it may appear to you as though you're justified in doing something radical about it, but that's even more likely to be a bad idea, because the reasons for taking peer disagreement seriously are stronger in empirical domains of dispute than in normative ones.

There is a sea of considerations from Kantianism, contractualism, norms for stable/civil societies, and advanced decision theory that, while each line of argument seems tentative on its own and open to skepticism, all taken together point very strongly in the same direction, namely that things will be horrible if we fail to cooperate with each other and that cooperating is often the truly rational thing to do. You're probably already familiar with a lot of this, but for general reference, see also this recent paper that makes a particularly interesting case for particularly strong cooperation, as well as other work on the topic, e.g. here and here.

This is why I believe that people interested in any particular version of utilitronium should not override AI alignment procedures at the last minute just to get an extra-large share of the cosmic stakes for their own value system, and why I believe that people like me, who care primarily about reducing suffering, should not increase existential risk. Of course, all of this means that people who want to benefit human values in general should take particular care to make sure that idiosyncratic value systems that may diverge from them also receive consideration and gains from trade.

This piece I wrote recently is relevant to cooperation and the question of whether values are subjective or not, and how much convergence we should expect and to what extent value extrapolation procedures bake in certain (potentially unilateral) assumptions.

Comment by Lukas_Gloor on [deleted post] 2017-08-11T00:10:01.839Z

This blogpost seems relevant. Admittedly it's labelled 'speculative' by the author, but I find the concerns plausible.

Comment by lukas_gloor on Why I think the Foundational Research Institute should rethink its approach · 2017-07-22T10:49:58.300Z · score: 1 (1 votes) · EA · GW

That makes sense. I do think that, as a general policy, valuing reflection is more positive-sum, and if one does not feel like much is "locked in" yet, then it becomes very natural, too. I'm not saying that people who value reflection more than I do are doing it wrong; I think I would even argue for reflection being very important and recommend it to new people, if I felt more comfortable that they'd end up pursuing things that are beneficial from all/most plausible perspectives. Though what I find regrettable is that the "default" interventions that are said to be good from as many perspectives as possible oftentimes do not seem great from a suffering-focused perspective.

Comment by lukas_gloor on Why I think the Foundational Research Institute should rethink its approach · 2017-07-22T00:33:58.055Z · score: 2 (2 votes) · EA · GW

I agree with this.

Comment by lukas_gloor on Why I think the Foundational Research Institute should rethink its approach · 2017-07-21T23:16:58.135Z · score: 10 (10 votes) · EA · GW

Is it just something like "preventing suffering is the most important thing to work on (and the disjunction of assumptions that can lead to this conclusion)"?

This sounds right. Before 2016, I would have said that rough value alignment (normatively "suffering-focused") is very close to necessary, but we updated away from this condition and have for quite some time now held the view that it is not essential if people are otherwise a good fit. We still have an expectation that researchers think about research-relevant background assumptions in ways that are not completely different from ours on every issue, but single disagreements are practically never a dealbreaker. We've had qualia realists both on the team (part-time) and as interns, and some team members now don't hold strong views on the issue one way or the other. Brian especially is a really strong advocate of epistemic diversity and goes much further with it than I feel most people would go.

People who are not so sure about consciousness anti-realism tend to be less certain about their values as a result, and hence don't focus on suffering as much.

Hm, this does not fit my observations. We had and still have people on our team who don't have strong confidence in either view, and there exists also a sizeable cluster of people who seem highly confident in both qualia realism and morality being about reducing suffering, the most notable example being David Pearce.

The one view that seems unusually prevalent within FRI, apart from people self-identifying with suffering-focused values, is a particular anti-realist perspective on morality and moral reasoning where valuing open-ended moral reflection is not always regarded as the default "prudent" thing to do. This is far from a consensus, and many team members value moral reflection a great deal, but many of us expect less “work” to be done by value-reflection procedures than others in the EA movement seem to expect. Perhaps this is due to different ways of thinking about extrapolation procedures, or perhaps it’s due to us having made stronger lock-ins to certain aspects of our moral self-image.

Paul Christiano’s indirect normativity write-up, for instance, deals with the "Is “Passing the Buck” Problematic?” objection in a way I find unsatisfying. Working towards a situation where everyone has much more time to think about their values is more promising the more likely it is that there is “much to be gained,” normatively. But this somewhat begs the question. If one finds suffering-focused views very appealing, other interventions become more promising. There seems to be high value of information in narrowing down one’s moral uncertainty in this domain (much more so, arguably, than with questions of consciousness or which computations to morally care about). One way to attempt to reduce one’s moral uncertainty and capitalize on the value of information is by thinking more about the object-level arguments in population ethics; another way is by thinking more about the value of moral reflection: how much it depends on intuition or self-image-based "lock-ins" vs. how much it (either in general or in one's personal case) is based on other things that are more receptive to information gains or intelligence gains.

Personally, I would be totally eager to place the fate of “Which computations count as suffering?” into the hands of some in-advance-specified reflection process, even when I feel like I don’t understand the way moral reflection will work out in the details of this complex algorithm. I’d be less confident in my current understanding of consciousness than in my ability to pick a reassuring-seeming way of delegating the decision-making to smarter advisors. However, I get the opposite feeling when it comes to questions of population ethics. There, I feel like I have thought about the issue a lot, and I experience it as easier and more straightforward to think about than consciousness and whether I care about insects or electrons or Jupiter brains. I have some strong intuitions and aspects of my self-identity tied to the matter, and I am unsure in which legitimate ways (as opposed to failures of goal preservation) I could gain evidence that would strongly change my mind. It would feel wrong to me to place the fate of my values into some in-advance-specified, open-ended deliberation algorithm where I won’t really understand how it will play out and what initial settings make which kind of difference to the end result (and why). I'd be fine with quite "conservative" reflection procedures that I could be confident would likely output something not too far away from my current thinking, but I would be gradually more worried about more open-ended ones.

Comment by lukas_gloor on Why I think the Foundational Research Institute should rethink its approach · 2017-07-21T08:57:37.469Z · score: 11 (11 votes) · EA · GW

Brian's view is maybe best described as eliminativism about consciousness (which may already seem counterintuitive to many) plus a counterintuitive way to draw boundaries in concept space. Luke Muehlhauser said about Brian's way of assigning non-zero moral relevance to any process that remotely resembles aspects of our concept of consciousness:

"Mr. Tomasik’s view [...] amounts to pansychism about consciousness as an uninformative special case of “pan-everythingism about everything."

See this conversation.

So the disagreement there does not appear to be about questions such as "What produces people's impression of there being a hard problem of consciousness?," but rather whether anything that is "non-infinitely separated in multi-dimensional concept space" still deserves some (tiny) recognition as fitting into the definition. As Luke says here, the concept "consciousness" works more like "life" (= fuzzy) and less like "water" (= H2O), and so if one shares this view, it becomes non-trivial to come up with an all-encompassing definition.

While most researchers at FRI (my impression, anyway, as someone who works there) place highest credence on functionalism and eliminativism, there is more skepticism about Brian's inclination to never draw hard boundaries in concept space.

Comment by lukas_gloor on Hi, I'm Luke Muehlhauser. AMA about Open Philanthropy's new report on consciousness and moral patienthood · 2017-06-28T20:01:30.702Z · score: 0 (0 votes) · EA · GW

I was thinking about a secondary layer that is hidden as well.

E.g. would a mere competition among neural signals count? Or would it have to be something more "sophisticated," in a certain way?

Hard to say. On Brian's perspective with similarities in multi-dimensional concept space, the competition among neural signals may already qualify to an interesting degree. But let's say we are interested in something slightly more sophisticated, but not sophisticated enough that we're inclined to look at it as "not hidden." (Maybe it would qualify if the hidden nociceptive signals alter subconscious dispositions in interesting ways, though it depends on what that would look like and how it compares to what is going on introspectively with suffering that we have conscious access to.)

Comment by lukas_gloor on Hi, I'm Luke Muehlhauser. AMA about Open Philanthropy's new report on consciousness and moral patienthood · 2017-06-28T17:35:09.228Z · score: 3 (3 votes) · EA · GW

One thing I found extremely nice about your report is that it could serve EAs (and people in general) as a basis for shared terminology in discussions! If two people from different backgrounds wanted to have a discussion about philosophy of mind or animal consciousness, which texts would you recommend they both read in order to prepare themselves? (Not so much in terms of familiarity with popular terminology, but rather useful terminology.) Can you think of anything really good that is shorter than this report?

Comment by lukas_gloor on Hi, I'm Luke Muehlhauser. AMA about Open Philanthropy's new report on consciousness and moral patienthood · 2017-06-28T17:29:28.612Z · score: 0 (0 votes) · EA · GW

Are you aware of any "hidden" (nociception-related?) cognitive processes that could be described as "two systems in conflict?" I find the hidden qualia view very plausible, but I also find it plausible that I might settle on a view on moral relevance where what matters about pain is not the "raw feel" (or "intrinsic undesirability" in Drescher's words), but a kind of secondary layer of "judgment" in the sense of "wanting things to change/be different" or "not accepting some mental component/input." I'm wondering whether most of the processes that would constitute hidden qualia are too simple to fit this phenomenological description or not...

Comment by lukas_gloor on Hi, I'm Luke Muehlhauser. AMA about Open Philanthropy's new report on consciousness and moral patienthood · 2017-06-28T17:14:25.514Z · score: 0 (0 votes) · EA · GW

Did you always find illusionism plausible, or was there a moment where it “clicked,” or was it just a gradual progression? Do you think reading more about neuroscience makes people more sympathetic to it?

Do you think the p-zombie thought experiment can be helpful to explain the difference between illusionism and realism (“classic qualia” mapping onto the position “p-zombies are conceivable"), or do you find that it is unfair or often leads discussions astray?

Comment by lukas_gloor on What does Trump mean for EA? · 2016-11-10T23:16:30.694Z · score: 14 (16 votes) · EA · GW

I definitely became less interested in politics ever since identifying as an EA or utilitarian. But then Switzerland passed some ridiculous xenophobic propositions, and Brexit happened, and now Trump. And every time I had this worry in the back of my mind that we're doing something wrong.

Carl mentioned "Misallocating a huge mass of idealists' human capital to donation for easily measurable things and away from more effective things elsewhere, sabotages more effective do-gooding for a net worsening of the world" here. This point doesn't just apply to money, but also very much to attention and activism. And the bias may not just be towards things that are easily measurable, but there may also be a bias away from "current" or "urgent" events. These events shape public discourse, which could have important flow-through effects. What's the effect if altruistic and driven people disproportionately stop caring about current events and the discussions that surround them?

Perhaps it's negligible, but it's certainly worth thinking about more. And I was glad to see how much attention the recent votes got within EA.

Comment by lukas_gloor on Moral anti-realists don't have to bite bullets · 2016-01-08T17:08:45.007Z · score: 0 (0 votes) · EA · GW

Sorry for the late reply. Good question. I would be more inclined to call it a "mechanism" rather than a (meta-)value. You're right, there has to be something that isn't chosen. Introspectively, it feels to me as though I'm concerned about my self-image as a moral/altruistic person, which is what drove me to hold the values I have. This is highly speculative, but perhaps "having a self-image as x" is what could be responsible for how people pick consequentialist goals?

Comment by lukas_gloor on Moral anti-realists don't have to bite bullets · 2015-12-28T22:04:32.191Z · score: 5 (7 votes) · EA · GW

One question is what we want "morality" to refer to under anti-realism. For me, what seems important and action-guiding is what I want to do in life, so personally I think of normative ethics as "What is my goal?".

Under this interpretation, the difference between biting bullets and not biting them comes down to how much people care about their theories being elegant, simple, and parsimonious, versus how much they care about tracking their intuitions as closely as possible. You mention two good reasons for favoring a more intuition-tracking approach.

Alternatively, why might some people still want to bite bullets? Firstly, no one wants to accept a view that seems unacceptable. Introspectively, biting a bullet can feel "right" if I am convinced that the alternatives feel worse and if I realize that the aversion-generating intuitions are not intuitions that my rational self-image would endorse. For instance, I might feel quite uncomfortable with the thought of sending all my money to people far away while neglecting poor people in my community. I can accept this feeling as a sign that community matters intrinsically to me, i.e. that I care (somewhat) more strongly about the people close to me. Or I could bite the bullet and label the "preference for in-group" a "moral bias" – biased in relation to what I want my life-goals to be about. Perhaps, upon reflection, I decide that some moral intuitions matter more fundamentally to me, for instance because I want to live for something that is "altruistic"/"universalizable" from a perspective like Harsanyi's Veil of Ignorance. Given this fundamental assumption, I'll be happy to ignore agent-relative moral intuitions. Of course, it isn't wrong to end up with a mix of both ideas if the intuition "people in my community really matter more to me!" is just as strong as the intuition that you want your goal to work from behind a veil of ignorance.

On LessWrong, people often point out that human values are complex, and that those who bite too many bullets are making a mistake. I disagree. What is complex are human moral intuitions. Values, by which I mean "goals" or "terminal values", are chosen, not discovered. (Consequentialist goals are new and weird and hard for humans to have, so why would they be discoverable in a straightforward manner from all the stuff we start out with?) And just because our intuitions are complex – and sometimes totally contradict each other – doesn't mean that we're forced to choose goals that look the same. Likewise, I think people who think some form of utilitarianism must be the thing are making a mistake as well.

Comment by lukas_gloor on The big problem with how we do outreach · 2015-12-28T21:31:28.724Z · score: 1 (1 votes) · EA · GW

Good points. I agree that EA's message is often framed in a way that can seem alienating to people who don't share all its assumptions. And I agree that the people who don't share all the assumptions are not necessarily being irrational.

Some people might argue that non-utilitarians will become utilitarian if they become more rational.

FWIW, I think there's indeed a trend. Teaching rationality can be kind of a dick move (only in a specific sense) because it forces you to think about consequentialist goals and opportunity costs, which is not necessarily good for your self-image if you can't point to huge accomplishments or promising future prospects. As long as your self-image as a "morally good person" is tied to common-sense morality, you can do well by just not being an asshole to the people around you. And where common-sense morality is called into question, you can always rationalize as long as you're not yet forced to look too closely. So people will say things like "I'm an emotional person" in order to be able to ignore all the arguments these "rationalists" are making, which usually end with "This is why you should change your life and donate". Or they adopt a self-image as someone who is "just not into that philosophy stuff" and thus won't bother to think about it anymore once the discussions get too far.

LW or EA discourse breaks down these alternatives. Once it's too late – once your brain spots its own blatant attempts at rationalizing – people are forced to either self-identify as (effective) altruists or not, or at least state what percentage of their utility function corresponds to which. And self-identifying as someone who really doesn't care about people far away, as opposed to someone who still cares but "community comes first" and "money often doesn't reach its destination anyway" and "isn't it all so uncertain, and stop with these unrealistic thought experiments already!" and "why are these EAs so dogmatic?", is usually much harder. (At least for those who are empathetic/social/altruistic, or those who are in search of moral meaning in their lives.)

I suspect that this is why rationality doesn't correlate with making people happier. It's easier to be happy if your goal is to do alright in life and not be an asshole. It gets harder if your goal is to help fix this whole mess that includes wild animals suffering and worries about the fate of the galaxy.

Arguably, people are being quite rational, on an intuitive level, when they can't tell you what their precise consequentialist goal is. They're satisficing, and it's all good for them, so why make things more complicated? A heretic could ask: Why create a billion new ways in which they can fail to reach their goals? Maybe the best thing is to just never think about goals that are hard to reach. Edit: Just to be clear, I'm not saying people shouldn't have consequentialist goals; I'm just pointing out that the picture as I understand it is kind of messy.

Handling the inherent demandingness of consequentialist goals is a big challenge imo, for EAs themselves as well as for making the movement more broadly appealing. I have written some thoughts on this here.

Comment by lukas_gloor on How important is marginal earning to give? · 2015-05-20T13:19:30.036Z · score: 4 (4 votes) · EA · GW

It's correct that the Swiss EA organizations are currently funding-constrained. We haven't pitched any projects to the international community yet, but we're considering it if an opportunity arises where this makes sense.

I also think that funding is going to be less of an issue once more people in the movement transition from being students to earning to give.

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-04-20T21:06:15.097Z · score: 0 (0 votes) · EA · GW

Just read this, it expresses well what I meant by "humans are not designed to pursue a single goal": http://lesswrong.com/lw/2p5/humans_are_not_automatically_strategic/

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-04-12T12:41:49.672Z · score: 0 (0 votes) · EA · GW

If the hostage crisis is a good analogy for your internal self, what is to stop "system 1" from breaking its promises or being clever?

There are problems with every approach. Talk about your commitment with others, and they will remind you of it. I'm not saying this whole strategy always works, but I'm quite sure there are many people for whom it is the best thing to try.

Regarding the "utility quota", what I mean by "personal moral expecations": Basically, this just makes the point that it is useless to beat yourself up over things you cannot change. And yet we often do this, feel sad about things we probably couldn't have done differently. (One interesting hypothesis for this reaction is described here.)

People are great at creating disagreements over nothing, and ethics is complex enough to be opaque, so we would expect moral disagreement in both worlds with a single coherent morality for humanity and worlds without one.

Note that if this were true, you would still need reasons why you expect there to be just one human morality. I know what EY wrote on the topic, and I find it question-begging and unconvincing. What EY is saying is that human utility-function_1s are complex and similar. What I'm interested in, and what I think you and EY should also be interested in, are utility-function_2s. But that's another discussion; I've been meaning to write up my views on metaethics and goal-uncertainty, but I expect it'll take me at least a few months to get around to it.

This doesn't really prove my case by itself, but it's an awesome quote nevertheless, so I'm including it here (David Hume, Enquiry):

“It might reasonably be expected in questions which have been canvassed and disputed with great eagerness, since the first origin of science and philosophy, that the meaning of all the terms, at least, should have been agreed upon among the disputants; and our enquiries, in the course of two thousand years, been able to pass from words to the true and real subject of the controversy. For how easy may it seem to give exact definitions of the terms employed in reasoning, and make these definitions, not the mere sound of words, the object of future scrutiny and examination? But if we consider the matter more narrowly, we shall be apt to draw a quite opposite conclusion. From this circumstance alone, that a controversy has been long kept on foot, and remains still undecided, we may presume that there is some ambiguity in the expression, and that the disputants affix different ideas to the terms employed in the controversy.”

Why not do this publicly? Why not address the thought experiment I proposed?

Lack of time, given that I've already written a lot of text on the topic. And because I'm considering publishing some of it at some point in the future, I'm wary of posting long excerpts of it online.

Those things are less straightforward than string theory, in the sense of Kolmogorov complexity. The fact that we can compress those queries into sentences which are simpler to introduce one to than algebra is testimony to how similar humans are.

Isn't the whole point of string theory that it is pretty simple (in terms of Kolmogorov complexity, that is, not in whether I can understand it)? If anything, this would be testimony to how good humans are at natural speech as opposed to math. Although humans aren't that good at natural speech, because they often don't notice when they're being confused or talking past each other. But this is getting too metaphorical.

I don't really understand your point here. Aren't you presupposing that there is one answer people will converge on in these cases? I've talked to very intelligent people about these sorts of questions, and we've narrowed down all the factual disagreements we could think of. Certainly it is possible that I and the people I was talking to (who disagreed with my views) were missing something. But it seems more probable that answers just don't always converge.

And all your experience has had the constant factor of being pitched by you, someone who believes "optimising for EA" being tiring and draining is all part of the plan.

What? I thought you were the one saying that, at least more so than I am.

You're telling people not to try to optimise their full lives to EA right now. If that is what they were trying before, then you are arguing for people to stop trying, QED.

We should differentiate between 1) "ways of trying to accomplish a goal, e.g. in terms of decisional algorithms or habits" and 2) "pursuing a goal by whichever means are most effective". I did not try to discourage anyone from 2), and that's clearly what is relevant. I'm encouraging people to stop trying a particular variant of 1), because I believe that particular variant works for some people (to some extent), but not for all of them. It's a spectrum of course, not just two distinct modes of going about the problem.

Considering you use a gendered pronoun to refer to unspecified people of any gender as well ("she"), I'm confused why you would wrongly 'correct' someone out like that.

Let's not get into this; this discussion is already long enough. I do see the point of your last remark: I did not mean to imply that Williams himself, given the time the text was written, was being sexist.

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-04-12T11:59:00.446Z · score: 0 (0 votes) · EA · GW

It seems obvious to me that we're talking past each other, meaning we're in many cases trying to accomplish different things with our models/explanations. The fact that this doesn't seem obvious to you suggests to me that I'm either bad at explaining, or that you might be interpreting my comments uncharitably. I agree with your tl;dr, btw!

This matches the data quite nicely, methinks. Better than "irrationality", anyway.

You're presupposing that the agent would not modify any of its emotional links if it had the means to do so. This assumption might apply in some cases, but it seems obviously wrong as a generalization. Therefore, your model is incomplete. Reread what I wrote in the part "On goals". I'm making a distinction between "utility-function_1", which reflects all the decisions/actions an agent will make in all possible situations, and "utility-function_2", which reflects all the decisions/actions an agent would want to make in all possible situations. You're focusing on "utility-function_1", and what you're saying is entirely accurate – I like your model in regard to what it is trying to do. However, I find "utility-function_2s" much more interesting and relevant, which is why I'm focusing on them. Why don't you find them interesting?

Agenty/rational behaviour isn't exclusive to system 2. How does system 1 decide when to trigger this coping mechanism?

Again, we have different understandings of rationality. Given the way I defined "goals" in the section "On goals", it is only system 2 that defines what is rational, and system-1 heuristics can be "rational" if they are calibrated in a way that produces outcomes that are good with regard to the system-2 goals, given the most probable environment the agent will encounter. This seems to be standard usage, in fact.

Side note: Your theory of rationality is quite Panglossian. It is certainly possible to interpret all of human behavior as "rational" (as e.g. Gigerenzer does), but that would strike me as a weird/pointless thing to do.

Was that the evidence you have for the claim that humans aren't designed to efficiently pursue a single goal? Or do you have more evidence?

This claim strikes me as really obvious, so I'm wondering whether you might be misunderstanding what I mean. Have you never noticed just how bad people are at consequentialism? When people first hear of utilitarianism/EA, they think it implies giving away all their wealth until they are penniless, instead of considering future earning prospects. Or they think having children and indoctrinating them with utilitarian beliefs is an effective way to do good, ignoring that you could just use the money to teach other people's children more effectively. The point EY makes in the Godshatter article is all about how evolution programmed many goals into humans even though evolution only (metaphorically) has one goal.

What I'm saying is the following: Humans are bad at being consequentialists; they are bad at orienting their entire lives – jobs, relationships, free time, self-improvement, and so on – towards one well-defined aim. Why? For one thing, if you ask most people "what is your goal?", they'll have a hard time even answering! In addition, evolution programmed us to value many things, so someone whose goal is solely "reduce as much suffering as possible" cannot count on evolutionarily adaptive heuristics, and thus biases against efficient behavioral consequentialism are to be expected.

It is trivially true that a utility function-based agent exists (in a mathematical sense) which perfectly models someone's behaviour. It may not be the simplest, but it must exist.

I know; what I was trying to say with "A neuroscientist of the future, when the remaining mysteries of the human brain will be solved, will not be able to look at people's brains and read out a clear utility-function" – emphasis on CLEAR – was that there is no representation in the brain where you look at it and see "ah, that's where the utility function is". Instead (and I tried to make this clear with my very next sentence, which you quoted), you have to look at the whole agent, where eventually all the situational heuristics, the emotional weights you talked about, and the agent's beliefs together imply a utility-function in the utility-function_1 sense. Contrast this with a possible AI: It should be possible to construct an AI with a more clearly represented utility function, or a "neutral" prototype-AI where scientists need to fill in the utility-function part with whatever they care to fill in. When we speak of "agents", we have in mind an entity that knows what its goals are. When we picture an FAI or a paperclip maximizer, we assume this entity knows what its goal is. However, most humans do not know what their goals are. This is interesting, isn't it? Paperclippers would not engage in discussions on moral philosophy, because they know what their goals are. Humans are often confused about their goals (and about morality, which some take to imply more than just goals). So your model of human utility functions should incorporate this curious fact of confusion. I was very confused by it myself, but I now think I understand what is going on.

In my opinion, there is always cognitive dissonance in this entire paradigm of utility quotas. You're making yourself act like two agents with two different moralities who share the same body but get control at different times. There is cognitive dissonance between those two agents. Even if you try to always have one agent in charge, there's cognitive dissonance with the part you're denying.

I find this quite a good description, actually. One is your system 2, "what you would want to do under reflection"; the other is the "primal, animal-like brain", to use outdated language. I wouldn't necessarily call the result "cognitive dissonance". If you rationally understand what is going on, and realize that rebelling against your instincts/intuitions/emotional set-up is going to lead to a worse outcome than trying to reconcile your conflicting motivations, then the latter is literally the most sensible thing for your system 2 to attempt.

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-03-30T12:50:03.586Z · score: 1 (1 votes) · EA · GW

Someone who is so emotionally affected by EA that they give up is definitely someone who 'merely tried' to affect the world, because you can't just give up if you care in an agentic sense.

I strongly disagree. Why would people be so deeply affected if they didn't truly care? The way I see it, when you give up EA because it's causing you too much stress, what happens constitutes a failure of goal-preservation, which is irrational, but after you've given up, you've become a different sort of agent. Just because you don't care/try anymore does not mean that caring/trying in the earlier stages was somehow fake.

Giving up is not a rational decision made by your system-2*; it's a coping mechanism triggered by your system-1 feeling miserable, which then creates changes/rationalizations in system-2 that could become permanent. As I said before (and you expressed skepticism), humans are not designed to efficiently pursue a single goal. A neuroscientist of the future, when the remaining mysteries of the human brain have been solved, will not be able to look at people's brains and read out a clear utility-function. Instead, what you have is a web of situational heuristics (system-1), combined with some explicitly or implicitly represented beliefs and goals (system-2), which can well be contradictory. There is often no clear way to get out a utility-function. Of course, people can decide to do what they can to self-modify towards becoming more agenty, and some succeed quite well despite all the messy obstacles your brain throws at you. But if your ideal self-image and system-2 goals are too far removed from your system-1 intuitions and generally the way your mind works, then this will create a tension that leads to unhappiness and quite likely cognitive dissonance somewhere. If you keep going without changing anything, the outcome won't be good for either you or your goals.

You mentioned in your earlier comment that lowering your expectations is exactly what evading cognitive dissonance is. Indeed! But look at the alternatives: If your expectations are impossible for you to fulfill, then you cannot reduce cognitive dissonance by improving your behavior. So either you lower your expectations (which preserves your EA goal!), or you don't, in which case the only way to reduce the cognitive dissonance is by rationalizing and changing your goal. By choosing strategies like "Avoiding daily dilemmas", you're not changing your goals; you're only changing the expectations you set for yourself in regard to these goals.

[…] to me at least it isn't obvious if you meant that people should still try their hardest to be highly agentic but merely not beat themselves up over falling short.

Have you considered that for some people, the most agenty thing to do would be to change their decision-procedure so it becomes less "agenty"?

An analogy (adapted from Parfit): You have a combination safe at home and get robbed; the robbers want the combination from you. They threaten to kill your family if you don't comply. The safe contains something that is extremely valuable to you, e.g. you and your family would be in gigantic trouble if it got stolen. You realize that the robbers are probably going to kill you and your family anyway after they've got the safe open, because you have all seen their faces. What do you do? Now, imagine you had a magic pill that temporarily turns you, a rational, agenty person, into an irrational person whose actions don't make sense. Imagine that this state is transparent to the robbers, i.e. they know with certainty that you're not faking it and therefore realize that they can't get the combination out of you. Should you take the pill, or would you say "I'm rational, so I can never decide to try to become less rational in the future"? Of course you'd take the pill, because the rational action for your present self here is rendering your future self irrational.

Likewise: If you notice that trying to be more agenty is counterproductive in important regards, the right/rational/agenty thing for you to do would be to try to become a bit less agenty in the future. The role of the robbers in the EA example is played by your system-1 and personality, set against your idealized self-image/system-2/goal. With EA being too demanding, you don't even have to change your goals; it suffices to adjust the expectations you set for yourself. Both would have the same desired effect, the main difference being that, when you don't change your goals, you would want to become more agenty again if you discovered a means to overcome your obstacles in a different way.

Whose "right" are we talking about, here? If it's "right" according to effective altruism, that is obviously false: someone who discovers they like murdering is wrong by EA standards (as well as those of the general population).

We were talking about whether and to what extent people's goals contain EA components. If parts of people's goals contradict EA tenets, then of course they cannot be (fully) EA. I do agree with your implicit point that "right according to x" is a meaningful notion if "x" is sufficiently clearly defined.

"Careful reflection" also isn't enough for humans to converge on an answer for themselves. If it was, tens of thousands of philosophers should have managed to map out morality, and we wouldn't need the likes of MIRI.

Are you equating "morality" with "figuring out an answer for one's goals that converges for all humans"? If yes, then I suspect that the reference of "morality" fails because goals probably don't converge (completely). Why is there so much disagreement in moral philosophy? To a large extent, people seem to be trying to answer different questions. In addition, some people are certainly being irrational at what they're trying to do, e.g. they fail to distinguish between things that they care about terminally and things they only care about instrumentally, or they might fail to even ask fundamental questions.

People's goals can be changed and/or people can be wrong about their goals, depending on what you consider proper "goals".

I agree; see the second footnote in my original post. The point where we disagree is whether you can infer from an existing disagreement about goals that at least one participant is necessarily being irrational/wrong about her goals. I'm saying that's not the case.

I'm sufficiently confident that I'm either misunderstanding you or that you're wrong about your morality [...]

I have probably thought about my values more than most EAs and have gone to unusual lengths to lay out my arguments and reasons. If you want to try to find mistakes, inconsistencies or thought experiments that would make me change them, feel free to send me a PM here or on FB.

Humans are sort of like agents and we're all sort of similar, so our moralities tend to always be sort of the same.

With lots of caveats, e.g. people will be more or less altruistic, and that's part of your "morality axis" as well if morality=your_goals. In addition, people will disagree about the specifics of even such "straightforward" things as what "altruism" implies. Is it altruistic to give someone a sleeping pill against their will if they plan to engage in some activity you consider bad for them? Is it altruistic to turn rocks into happy people? People will disagree about what they would choose here, and it's entirely possible that they are not making any meaningful sort of mistake in the process of disagreeing.

That isn't trivial. If 1 out of X miserable people manages to find a way to make things work eventually they could be more productive than Y people who chose to give up on levelling up and to be 'regular' EAs instead, with Y greater than X, and in that case we should advice people to keep trying even if they're depressed and miserable.

OK, but even so, I would in such a case at least be right about the theoretical possibility of there being people to whom my advice applies correctly. For what it's worth, I consider it a danger that EA will become associated with a lot of "bad press" if people drop out due to it being too stressful. All my experience with pitching EA so far indicates that it's bad to be too demanding. Sure, you can say you should only avoid being demanding towards newcomers, not established EAs, but you won't be able to keep up a memetic barrier there.

But more importantly, it's a false choice: it should be possible to have people be less miserable but still to continue trying, and you could give advice on how to do that, if you know it.

As a general point, I object to your choice of words: I don't think my posts ever argued for people to stop trying. I'm putting a lot of emphasis on getting right the things you do do, like splitting up separate motivations for donating so people don't end up donating only to fuzzies due to rationalizing or actually giving up. I agree with the sentiment that advice is better the more it helps people stay well and still remain super-productive, and I tried to give some advice that goes in that direction; e.g. "Viewing life as a game" is also useful for when you're thinking about EA all the time. But of course I don't have all the advice either; I welcome more contributions, and from what I've heard, CFAR has helped a lot of people in these regards.

  • I'm using terms like "system-2" in a non-technical sense here.

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-03-27T17:32:09.526Z · score: 2 (2 votes) · EA · GW

Thanks for this feedback! You bring up a very important point with the danger of things turning into merely "pretending to try". I see this problem, but at the same time I think many people are far closer to the other end of the spectrum.

I'm part of the target audience, I think, but this post isn't very helpful to me. Mistrust of arguments which tell me to calm down may be a part of it, but it seems like you're looking for reasons to excuse caring for other things than effective altruism, rather than weighing the evidence for what works better for getting EA results.

I suspect that many people don't really get involved in EA in the first place because they're on some level afraid that it will all become too much for them. And I know of cases where people gave up on EA at least partially because of these problems. To me, this is enough evidence that there are people who are putting too much pressure on themselves and would benefit from easing off. Of course, there is a possibility that a post like this one does more harm because it provides others with "ammunition" to rationalize more, but I doubt this would make much of a difference – it's unfortunately easy to rationalize in general, and you don't need that much "ammunition" for it.

Your "two considerations", look like a two-tiered defence against EA pressures rather than convergence on a single right answer on how to consider your goals.

That's what they are. I think there's no criterion that makes your goals the "right" ones other than that you would in fact choose these goals upon careful reflection.

Maybe you mean that some people are 'partial EAs' and others are 'full EAs (who are far from highly productive EA work in willpower -space)', but it isn't very clear.

Yes, that's what I meant. And I agree it's unclear, because it's confusing that I'm talking only about 2) in all of what follows; I'll try to make this clearer. So to clarify: most of my post addresses 2), "full EAs (who are far from highly productive in willpower-space)", and 1) is another option that I mention and then don't explore further because the consequences are straightforward. I think there's absolutely nothing wrong with 1); if your goals are different from mine, that doesn't necessarily mean you're making a mistake about your goals. I personally focus on suffering and don't care about preventing death, all else being equal, but I don't (necessarily) consider you irrational if you do.

Now, on 'partial EAs': If you agree that effective altruism = good (if you don't, adjust your EA accordingly, IMO), then agency attached to something with different goals is bad compared to agency towards EA. Even if those goals can't be changed right now, they would still be worse, just like death is bad even if we can't change it (yet (except maybe with cryonics)).

I'm arguing within a framework of moral anti-realism. I basically don't understand what people mean by the term "good" that could do the philosophical work they expect it to do. A partial EA is someone who would refuse to self-modify to become more altruistic IF this conflicts with other goals (like personal happiness, specific commitments/relationships, etc.). I don't think there's any meaningful and fruitful sense in which these people are doing something bad or making some sort of mistake; all you can say is that they're being less altruistic than someone with a 100%-EA goal, and they would reply: "Yes."

Concerning 'full EAs who are far from being very effective EAs in willpower -space", this triggers many alarm bells in my mind, warning of the risk of it turning into an excuse to merely try. You reduce effective effective altruists' productivity to a personality trait (and 'skills' which in context sound unlearnable), which doesn't match 80,000hours' conclusion that people can't estimate well how good they are at things or how much they'll enjoy things before they've tried.

I accept that there's a danger that my post can be read that way, but it's not what I'm saying. Not all skills are learnable to the same extent, but of course how much people try also plays a role! And I would second the advice that it's important to try things even if they seem very hard at first. But the thing is, some people have tried and failed and feel miserable about it, or even the thought of trying makes them feel miserable, and that certainly cannot be ideal, because these people aren't being productive at that point.

Your statement on compartmentalisation (and Ben Kuhn's original post) both just seem to assume that because denying yourself social contact because you could be making money itself is bad, therefore compartmentalisation is good. But the reasoning for this compartmentalisation - it causes happiness, which causes productivity - isn't (necessarily) compartmentalised, so why compartmentalise at all? Your choice isn't just between a delicious candy bar and deworming someone, it's between a delicious candy bar which empowers you to work to deworm two people and deworming one person. This choice isn't removed when you use the compartmentalisation heuristic, it's just hidden. You're "freeing your mind from the moral dilemma", but that is exactly what evading cognitive dissonance is.

Human brains are not designed to optimize towards a single goal. It can drive you crazy. For some, it works; for others, it probably does not. I'm not saying "if you're stressed sometimes, do less EA stuff". Maybe being stressed is the lesser problem. My point is: "If you're stressed to the point that the status quo is not sustainable, then change something and don't feel bad about it."

To sum up, I'm aware that rationalizing is a huge danger – it always amazes me just how irrational people can become when they are protecting a cherished belief – but I think that there are certain people who really aren't in danger of setting their expectations too low, because they have a problem with doing the opposite.

Comment by lukas_gloor on Room for Other Things: How to adjust if EA seems overwhelming · 2015-03-26T21:29:16.140Z · score: 2 (2 votes) · EA · GW

Thanks, great points! I got some help and managed to fix the layout.

Room for Other Things: How to adjust if EA seems overwhelming

2015-03-26T14:10:52.928Z · score: 17 (17 votes)
Comment by lukas_gloor on We might be getting a lot of new EAs. What are we going to do when they arrive? · 2015-03-26T09:37:55.646Z · score: 2 (2 votes) · EA · GW

I agree with this and would add that even just monthly meetings could be very valuable if they're combined with good online follow-up. When you know someone from a meeting first, you're probably more likely to ask them lots of questions than if they were just recommended to you over the internet. Nevertheless, I also think it could be promising to invest in personalized online guidance. Michelle mentioned below that the buddy system wasn't used much, but maybe something similar could work well, especially if the demand from new people increases as expected.

Comment by lukas_gloor on Common Misconceptions about Effective Altruism · 2015-03-24T17:02:06.686Z · score: 2 (2 votes) · EA · GW

How do you define EA in this case? If you include e.g. all 17k TLYCS pledgers, then I'd probably agree with your statement, but if we take people in the EA Facebook group (minus the spam accounts or accidentally added ones) or people who self-report as EA, then it seems more likely to me that Telofy is right.

Comment by lukas_gloor on Non-English language effective altruism (including a list of venues) · 2015-03-23T18:24:25.467Z · score: 3 (3 votes) · EA · GW

(My impression is German, especially in Switzerland, and following that Dutch, Norwegian and Swedish. It's interesting that north European and historically protestant countries are particularly strong; I believe these have especially strong ethoses of private charity, in particular the UK - think the Victorian culture of charity.)

These countries also have better education systems than the average European country; see for instance here (I'm surprised Sweden is not in that global top 20, but it's probably close). Edit: Norway is 21, Sweden 24. My guess is that this is more relevant. In fact, I'd be hesitant to even bet on a correlation between a country's tradition of charity and proneness to EA once you control for things like GDP. EAs tend not to be very influenced by tradition, plus quite a few EAs I know recount that they weren't involved in traditional do-gooding activities at all before they encountered EA. Maybe things change once EA becomes more mainstream, but especially in the early phase where EA is something novel, I would not expect there to be much of a correlation.

Comment by lukas_gloor on Should altruism be selfless? · 2015-03-23T08:13:02.145Z · score: 2 (2 votes) · EA · GW

If EA were perceived as something people do for primarily selfless reasons, this would probably make it harder to persuade new people to join. Implicitly, the presence of an EA would be calling into question the moral character of the non-EA, which tends to make people defensive and prone to rationalizing; see for instance this effect.

As to what is really going on with EAs: I think that the abstract philosophical motivation is indeed often selfless, because even if you're an EA in order to feel good and find your life meaningful, this trick only works if you manage to fully convince yourself of the EA goal. However, when it comes to having the motivation to overcome akrasia and put things into practice, I think that selfish reasons play a large role in how far people will go. This is supported e.g. by the influence of social involvement on people becoming active.

Comment by lukas_gloor on Effective Altruism and Utilitarianism · 2015-01-31T16:40:28.731Z · score: 2 (2 votes) · EA · GW

This. Another core EA tenet might be that non-human animals count (if they are sentient).

Kantianism has positive duties, and Kant's "realm of ends" sounds to me very much like taking into account "the instrumental rationality of charitable giving". Kant himself didn't grant babies or non-human animals intrinsic moral status, but some Kantian philosophers, most notably Korsgaard, have given good arguments as to why sentientism should follow from the categorical imperative.

Virtue ethics can be made to fit almost anything, so it seems easy to argue for the basic tenets of EA within that framework.

Some forms of contractualism do not have positive rights, so these forms would be in conflict with EA. But if you ground contractualism in reasoning from behind the veil of ignorance, as did Rawls, then EA principles, perhaps in more modest application (even though it is unclear to me why the veil of ignorance approach shouldn't output utilitarianism), will definitely follow from the theory. Contractualism that puts weight on reciprocity would not take non-human animals into consideration, but there, too, you have contractualists arguing in favor of sentientism, e.g. Mark Rowlands.

Comment by lukas_gloor on Effective Altruism and Utilitarianism · 2015-01-30T21:19:58.104Z · score: 6 (8 votes) · EA · GW

I agree that the core EA tenets also make sense according to most non-consequentialist views. But consequentialism might be better at activating people because it has very concrete implications. It seems to me that non-consequentialist positions are often vague when it comes to practical application, which makes it easy for adherents not to really do much. In addition, adherence to consequentialism correlates with analytical thinking skills and mindware such as expected utility theory, which is central to understanding/internalizing the EA concept of cost-effectiveness. Finally, there's a tension between agent-relative positions and cause neutrality, so consequentialism selects for people who are more likely to be on board with that.