Reducing long-term risks from malevolent actors 2020-04-29T08:55:38.809Z · score: 224 (93 votes)
Thoughts on electoral reform 2020-02-18T16:23:27.829Z · score: 69 (35 votes)
Space governance is important, tractable and neglected 2020-01-07T11:24:38.136Z · score: 50 (25 votes)
How can we influence the long-term future? 2019-03-06T15:31:43.683Z · score: 9 (11 votes)
Risk factors for s-risks 2019-02-13T17:51:37.632Z · score: 31 (12 votes)
Why I expect successful (narrow) alignment 2018-12-29T15:46:04.947Z · score: 18 (17 votes)
A typology of s-risks 2018-12-21T18:23:05.249Z · score: 15 (11 votes)
Thoughts on short timelines 2018-10-23T15:59:41.415Z · score: 22 (24 votes)
S-risk FAQ 2017-09-18T08:05:39.850Z · score: 15 (16 votes)
Strategic implications of AI scenarios 2017-06-29T07:31:27.891Z · score: 6 (6 votes)


Comment by tobias_baumann on How Much Leverage Should Altruists Use? · 2020-05-23T20:42:34.972Z · score: 1 (1 votes) · EA · GW

The drawdowns of major ETFs on this (e.g. EMB / JNK) during the corona crash or 2008 are roughly 2/3 to 3/4 of how much stocks (the S&P 500) went down. So I agree the diversification benefit is limited. The question, bracketing the point on leverage extra cost, is whether the positive EV of emerging markets bonds / high yield bonds is more or less than 2/3 to 3/4 of the positive EV of stocks. That's pretty hard to say - there's a lot of uncertainty on both sides. But if that is the case and one can borrow at very good rates (e.g. through futures or box spread financing) then the best portfolio should be a levered up combination of bonds & stocks rather than just stocks.

FWIW, I'm in a similar position regarding my personal portfolio; I've so far not invested in these asset classes but am actively considering it.

Comment by tobias_baumann on How Much Leverage Should Altruists Use? · 2020-05-18T08:57:18.207Z · score: 1 (1 votes) · EA · GW

What are your thoughts on high-yield corporate bonds or emerging markets bonds? This kind of bond offers non-zero interest rates but of course also entails higher risk. Also, these markets aren't (to my knowledge) distorted by the Fed buying huge amounts of bonds.

Theoretically, there should be some diversification benefit from adding this kind of bond, though it's all positively correlated. But unfortunately, ETFs on these kinds of bonds have much higher fees.

Comment by tobias_baumann on How should longtermists think about eating meat? · 2020-05-17T10:29:58.725Z · score: 33 (22 votes) · EA · GW

Peter's point is that it makes a lot of sense to have certain norms about not causing serious direct harm, and one should arguably follow such norms rather than expecting some complex longtermist cost-benefit analysis.

Put differently, I think it is very important, from a longtermist perspective, to advance the idea that animals matter and that we consequently should not harm them (particularly for reasons as frivolous as eating meat).

Comment by tobias_baumann on Helping wild animals through vaccination: could this happen for coronaviruses like SARS-CoV-2? · 2020-05-13T11:13:37.980Z · score: 2 (2 votes) · EA · GW

Great post, thanks for writing this up!

Comment by tobias_baumann on Reducing long-term risks from malevolent actors · 2020-05-07T07:49:40.329Z · score: 2 (2 votes) · EA · GW

Thanks for commenting!

I agree that early detection in children is an interesting idea. If certain childhood behaviours can be shown to reliably predict malevolence, then this could be part of a manipulation-proof test. However, as you say, there are many pitfalls to be avoided.

I am not well versed in the literature but my impression is that things like torturing animals, bullying, general violence, or callous-unemotional personality traits (as assessed by others) are somewhat predictive of malevolence. But the problem is that you'll probably also get many false positives from those indicators.

Regarding environmental or developmental interventions, we write this in Appendix B:

Malevolent personality traits are plausibly exacerbated by adverse (childhood) environments—e.g. ones rife with abuse, bullying, violence or poverty (cf. Walsh & Wu, 2008). Thus, research to identify interventions to improve such environmental factors could be valuable. (However, the relevant areas appear to be very crowded. Also, the shared environment appears to have a rather small effect on personality, including personality disorders (Knopik et al., 2018, ch. 16; Johnson et al., 2008; Plomin, 2019; Torgersen, 2009).)

Perhaps improving parenting standards and childhood environments could actually be a fairly promising EA cause. For instance, early advocacy against hitting children may have been a pretty effective lever to make society more civilised and less violent in general.

Comment by tobias_baumann on Reducing long-term risks from malevolent actors · 2020-05-02T16:14:08.885Z · score: 8 (5 votes) · EA · GW

Thanks for the comment!

I would guess that having better tests of malevolence, or even just a better understanding of it, may help with this problem. Perhaps a takeaway is that we should not just raise awareness (which can backfire via “witch hunts”), but instead try to improve our scientific understanding and communicate that to the public, which hopefully makes it harder to falsely accuse people.

In general, I don’t know what can be done about people using any means necessary to smear political opponents. It seems that the way to address this is to have good norms favoring “clean” political discourse, and good processes to find out whether allegations are true; but it’s not clear what can be done to establish such norms.

Comment by tobias_baumann on What is a good donor advised fund for small UK donors? · 2020-04-29T14:11:22.008Z · score: 11 (6 votes) · EA · GW

See here for a very similar question (and answers):

Comment by tobias_baumann on Adapting the ITN framework for political interventions & analysis of political polarisation · 2020-04-28T10:41:39.387Z · score: 18 (9 votes) · EA · GW

Great work, thanks for sharing! It's great to see this getting more attention in EA.

Just for those deciding whether to read the full thesis: it analyses four possible interventions to reduce polarisation: (1) switching from FPTP to proportional representation, (2) making voting compulsory, (3) increasing the presence of public service broadcasting, and (4) creating deliberative citizens' assemblies. Olaf's takeaway (as far as I understand it) is that those interventions seem compelling and fairly tractable but the evidence of possible impacts is often not very strong.

Comment by tobias_baumann on My thoughts on Toby Ord’s existential risk estimates · 2020-04-15T21:33:33.600Z · score: 5 (4 votes) · EA · GW

Well, historically, there have been quite a few pandemics that killed more than 10% of people, e.g. the Black Death or Plague of Justinian. There's been no pandemic that killed everyone.

Is your point that it's different for anthropogenic risks? Then I guess we could look at wars for historic examples. Indeed, there have been wars that killed something on the order of 10% of people, at least in the warring nations, and IMO that is a good argument to take the risk of a major war quite seriously.

But there have been far more wars that killed fewer people, and none that caused extinction. The literature usually models the number of casualties as a Pareto distribution, which means that the probability density is monotonically decreasing in the number of deaths. (For a broader reference class of atrocities, genocides, civil wars etc., I think the picture is similar.)
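The shape of that claim can be sketched numerically. The following is only an illustration, assuming a Pareto distribution with a hypothetical minimum size `X_MIN` and tail exponent `ALPHA`; neither value comes from the literature mentioned above, and the function names are made up for this sketch.

```python
# Illustrative sketch of a Pareto model of casualties.
# X_MIN and ALPHA are hypothetical values chosen for illustration only.
X_MIN = 1_000.0   # assumed smallest event size covered by the model
ALPHA = 0.5       # assumed tail exponent (smaller = heavier tail)


def pareto_pdf(x: float, x_min: float = X_MIN, alpha: float = ALPHA) -> float:
    """Density alpha * x_min**alpha / x**(alpha + 1) for x >= x_min, else 0."""
    if x < x_min:
        return 0.0
    return alpha * x_min ** alpha / x ** (alpha + 1)


def pareto_tail(x: float, x_min: float = X_MIN, alpha: float = ALPHA) -> float:
    """Probability that an event is at least as large as x."""
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha


sizes = [1e3, 1e4, 1e5, 1e6, 1e7]
densities = [pareto_pdf(x) for x in sizes]
# How much rarer is an event 100 times larger, under these assumptions?
tail_ratio = pareto_tail(1e7) / pareto_tail(1e5)

for size, density in zip(sizes, densities):
    print(f"deaths >= {size:>12,.0f}: density {density:.3e}, tail {pareto_tail(size):.4f}")
```

Under these assumed parameters the density falls at every step, so larger events are always rarer, while the tail shrinks slowly enough that very large events keep a non-negligible probability.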

But we don't in fact see lots of unknown risks killing even 0.1% of the population.

Smoking, lack of exercise, and unhealthy diets each kill more than 0.1% of the population each year. Coronavirus may kill 0.1% in some countries. The advent of cars in the 20th century resulted in 60 million road deaths, which is maybe 0.5% of everyone alive over that time (I haven't checked this in detail). That can be seen as an unknown from the perspective of someone in 1900. Granted, some of those are more gradual than the sort of catastrophe people have in mind - but actually I'm not sure why that matters.

Looking at individual nations, I'm sure you can find many examples of civil wars, famines, etc. killing 0.1% of the population of a certain country, but far fewer examples killing 10% (though there are some). I'm not claiming the latter is 100x less likely but it is clearly much less likely.

You could have made the exact same argument in 1917, in 1944, etc. and you would have been wildly wrong.

I don't understand this. What do you think the exact same argument would have been, and why was that wildly wrong?

Comment by tobias_baumann on Coronavirus and non-humans: How is the pandemic affecting animals used for human consumption? · 2020-04-08T20:59:29.491Z · score: 6 (2 votes) · EA · GW

Interesting, thanks!

However, I disagree with the idea that coronavirus doesn't have anything to do with animal farming.

Yeah, I wrote this based on having read that the origins of coronavirus involved bats. After reading more, it seems not that simple because farmed animals may have enabled the virus to spread between species.

Comment by tobias_baumann on My thoughts on Toby Ord’s existential risk estimates · 2020-04-07T13:26:28.610Z · score: 5 (9 votes) · EA · GW

I haven't looked at this in much detail, but Ord's estimates seem too high to me. It seems really hard for humanity to go extinct, considering that there are people in remote villages, people in submarines, people in mid-flight at the time a disaster strikes, and even people on the International Space Station. (And yes, there are women on the ISS, I looked that up.) I just don't see how e.g. a pandemic would plausibly kill all those people.

Also, if engineered pandemics, or "unforeseen" and "other" anthropogenic risks have a chance of 3% each of causing extinction, wouldn't you expect to see smaller versions of these risks (that kill, say, 10% of people, but don't result in extinction) much more frequently? But we don't observe that.

(I haven't read Ord's book so I don't know if he addresses these points.)

Comment by tobias_baumann on Coronavirus and non-humans: How is the pandemic affecting animals used for human consumption? · 2020-04-07T11:01:45.280Z · score: 6 (4 votes) · EA · GW

Great work, thanks for writing this up!

I'm wondering how this might affect the public debate on factory farming. Animal advocates sometimes argue that factory farms contribute to antibiotic resistance, and this point may carry much more force in the future. So perhaps one key conclusion is that advocates should emphasise this angle more in the future. (That said, AFAIK the coronavirus doesn't have anything to do with farmed animals, and my impression from a quick Google search is that the issue of antibiotic resistance is manageable with the right regulations.)

Comment by tobias_baumann on Effective Altruism and Free Riding · 2020-03-29T22:33:20.592Z · score: 20 (8 votes) · EA · GW

Interesting, thanks for writing this up!

In practice, and for the EA community in particular, I think there are some reasons why the collective action problem isn't quite as bad as it may seem. For instance, with diminishing marginal returns on causes, the most efficient allocation will be a portfolio of interventions with weights roughly proportional to how much people care on average. But something quite similar can also happen in the non-cooperative equilibrium for some diversity of actors who all support the cause they're most excited about. (Maybe this is similar to case D in your analysis.)

Can you point to examples of concrete EA causes that you think get too much or too little resources due to these collective action problems?

Comment by tobias_baumann on AMA: Leah Edgerton, Executive Director of Animal Charity Evaluators · 2020-03-18T09:46:21.759Z · score: 12 (5 votes) · EA · GW

How many resources do you think the EAA movement (and ACE in particular) should invest in animal causes that are less "mainstream", such as invertebrate welfare or wild animal suffering?

What would convince you that it should be more (or less) of a focus?

Comment by tobias_baumann on Harsanyi's simple “proof” of utilitarianism · 2020-02-21T13:14:41.810Z · score: 2 (2 votes) · EA · GW

You're right; I meant to refer to the violation of individual rationality. Thanks!

Comment by tobias_baumann on Harsanyi's simple “proof” of utilitarianism · 2020-02-20T17:03:36.630Z · score: 10 (6 votes) · EA · GW

Thanks for writing this up! I agree that this result is interesting, but I find it unpersuasive as a normative argument. Why should morality be based on group decision-making principles? Why should I care about VNM rationality of the group?

Also, you suggest that this result lends support to common EA beliefs. I'm not so sure about that. First, it leads to preference utilitarianism, not hedonic utilitarianism. Second, EAs tend to value animals and future people, but they would arguably not count as part of the "group" in this framework(?). Third, I'm not sure what this tells you about the creation or non-creation of possible beings (cf. the asymmetry in population ethics).

Finally, it's worth pointing out that you could also start with different assumptions and get very different results. For instance, rather than demanding that the group is VNM rational, one could consider rational individuals in a group who bargain over what to do, and then look at bargaining solutions. And it turns out that the utilitarian approach of adding up utilities is *not* a bargaining solution, because it violates Pareto-optimality in some cases. Does that "disprove" total utilitarianism?

(Using e.g. the Nash bargaining solution with many participants probably leads to some form of prioritarianism or egalitarianism, because you'd have to ensure that everyone benefits.)

Comment by tobias_baumann on Thoughts on electoral reform · 2020-02-19T10:28:30.224Z · score: 9 (6 votes) · EA · GW

I'm not entirely convinced that VSE is the right approach. It's theoretically appealing, but practical considerations, like perceptions of the voting process and public acceptance / "legitimacy" of the result, might be more important. Voters aren't utilitarian robots.

I was aware of the simulations you mentioned but I didn't check them in detail. I suspect that these results are very sensitive to model assumptions, such as tactical voting behaviour. But it would be interesting to see more work on VSE.

What EAs definitely shouldn't do, in my opinion, is to spend considerable resources discrediting those alternatives to one's own preferred system, as FairVote has repeatedly done with respect to approval voting. Much more is gained by displacing plurality than is lost by replacing it with a suboptimal alternative (for all reasonable alternatives to plurality).

Strongly agree with this!

Comment by tobias_baumann on Should Longtermists Mostly Think About Animals? · 2020-02-04T11:14:56.915Z · score: 11 (8 votes) · EA · GW

If you think animals on average have net-negative lives, the primary value in preventing x-risks might not be ensuring human existence for humans’ sake, but rather ensuring that humans exist into the long-term future to steward animal welfare, to reduce animal suffering, and to move all animals toward having net-positive lives.

This assumes that (future) humans will do more to help animals than to harm them. I think many would dispute that, considering how humans usually treat animals (in the past and now). It is surely possible that future humans would be much more compassionate and act to reduce animal suffering, but it's far from clear, and it's also quite possible that there will be something like factory farming on an even larger scale.

Comment by Tobias_Baumann on [deleted post] 2020-01-31T11:14:22.461Z

I don't think you've established that the 'technological transformation' is essential. If one believes that something like AI is unlikely in the foreseeable future, one can still try to shape the long-term future through other means, such as moral circle expansion, improving international cooperation, improving political processes (e.g. trying to empower future people, voting reform, reducing polarisation), and so on.

You may believe that shaping AI / the technological transformation would offer far more leverage than other interventions, but some will disagree with that, which is a strong reason to not include this in the definition.

Also, while many longtermist EAs believe that AI / a technological transformation is likely to happen this century, there are still some who don't. I for one am quite unsure about this.

Comment by tobias_baumann on UK donor-advised funds · 2020-01-22T15:12:34.138Z · score: 9 (3 votes) · EA · GW

I looked into this a while ago and ended up with a similar conclusion. The main options (to my knowledge) are NPT-UK, Prism the Gift Fund, and CAF's giving account.

Their fees all seemed too high for me to actually open a DAF (although sometimes it's not transparent and you're just supposed to get in touch). In particular, yearly fees eat up a significant fraction of the money if you leave it in for decades, so it seems unsuitable for such a plan. It's probably so expensive because there are relatively few people who are interested in such accounts, and there is a lot of administrative work done by the fund (Gift Aid etc.).

Comment by tobias_baumann on Improving Pest Management for Wild Insect Welfare · 2019-12-26T20:26:47.450Z · score: 5 (4 votes) · EA · GW

Great work, thanks for writing this up!

Comment by tobias_baumann on Next Steps in Invertebrate Welfare, Part 3: Understanding Attitudes and Possibilities · 2019-11-19T12:24:45.958Z · score: 2 (2 votes) · EA · GW

Thanks for writing this up!

In this regard, Michael Greger (of Nutrition Facts) argues forcefully that anti-honey advocacy hurts the vegan movement. Many people apparently have trouble ascribing morally valuable states to cows and pigs. The idea that bees might suffer (and that we should care about their suffering) strikes these people as crazy. If an average person thinks that a small part of vegan ‘ideology’ is crazy, motivated reasoning will easily allow this thought to infect their perception of the rest of the vegan worldview. Hence, the knowledge that vegans care about bees may lead many people to show less compassion toward cows and pigs than they otherwise would[5].

Is there evidence that this is a significant effect? There are many lines of motivated reasoning, and if you avoid this one, perhaps people will just find another. My impression is that people who reject an idea or ideology because of some association with something 'crazy' are actually often just opposed to the idea/ideology in general, and would still be opposed if the 'crazy' thing wasn't around.

Also, there is an effect in the opposite direction from moving the Overton window, or making others look more moderate. (Cf. )

In sum, even if invertebrate welfare is a worthwhile cause, several factors may prevent us from considering this issue properly. Additionally, there is the worry that rushing into a direct advocacy campaign may create hard-to-reverse lock-in effects. If the initial message is suboptimal, these lock-in effects can impose substantial costs. Hence, directly advocating for invertebrate welfare at this time might be actively counterproductive, both to the invertebrate welfare cause area and effective altruism more generally[11].

While I agree that we should be very careful about publicity at this point, I feel like there might still be opportunities for thoughtful advocacy. It seems not implausible that we could find angles that are mainstream-compatible and begin to normalise concern for invertebrates - e.g. extending welfare laws to lobsters.

Comment by tobias_baumann on Next Steps in Invertebrate Welfare, Part 2: Possible Interventions · 2019-11-18T13:08:46.238Z · score: 8 (4 votes) · EA · GW

Great work - thanks for writing this up!

Comment by tobias_baumann on Institutions for Future Generations · 2019-11-12T16:07:06.017Z · score: 20 (17 votes) · EA · GW

Here's another proposal:

We give every contemporary citizen shares in a newly created security. This security settles in, say, 100 years (in 2119), and its settlement value will be based on the degree to which 2119 people approve of the actions of people in the 2019-2119 timespan, as determined by a standardised survey - say, on a scale from 0 to 10.

This gives contemporary people a direct financial incentive to do what future people would approve of, and uses market mechanisms to generate accurate judgments.

(One might think that this doesn't work because people will go "I'll be dead before this settles", but I think this isn't really a problem - there is also an Austrian bond that settles in 100 years, and that doesn't seem to be a problem.)

Comment by tobias_baumann on Next Steps in Invertebrate Welfare, Part 1: Fundamental Research · 2019-11-12T15:39:49.582Z · score: 3 (3 votes) · EA · GW
A reason why it is not necessarily true that there is net suffering in nature is the hypothesis that small individuals–as invertebrates–may have less intense sentient experiences. In that scenario, small animals would experience relatively less suffering and more enjoyment than larger ones.

I don't understand how this follows. Wouldn't less intense experiences affect both suffering and pleasure equally?

Comment by tobias_baumann on Next Steps in Invertebrate Welfare, Part 1: Fundamental Research · 2019-11-12T15:37:59.700Z · score: 11 (5 votes) · EA · GW

Great work - thanks for writing this up!

The question of invertebrate sentience is surely important, but I'm not sure if further research on this is a top priority. Some relevant uncertainties:

  • Would further research significantly reduce uncertainty about invertebrate sentience? It seems that most people who thought about this have settled on something like "there is a significant chance that many invertebrate taxa are sentient, but we don't know for sure".
  • To what extent is society's lack of moral concern for invertebrates due to the belief that invertebrates are not sentient, rather than other factors (e.g. disgust reaction towards many invertebrates, or the difficulty of avoiding harm to insects in everyday life)?

Comment by tobias_baumann on EA Handbook 3.0: What content should I include? · 2019-10-01T09:02:26.169Z · score: 10 (11 votes) · EA · GW

I'd like to suggest including an article on reducing s-risks (e.g. or as another possible perspective on longtermism, in addition to AI alignment and x-risk reduction.

Comment by tobias_baumann on Are we living at the most influential time in history? · 2019-09-10T10:16:08.654Z · score: 2 (2 votes) · EA · GW

I don't understand this. Your last comment suggests that there may be several key events (some of which may be in the past), but I read your top-level comment as assuming that there is only one, which precludes all future key events (i.e. something like lock-in or extinction). I would have interpreted your initial post as follows:

Suppose we observe 20 past centuries during which no key event happens. By Laplace's Law of Succession, we now think that the odds are 1/22 in each century. So you could say that the odds that a key event "would have occurred" over the course of 20 centuries is 1 - (1-1/22)^20 = 60.6%. However, we just said that we observed no key event, and that's what our "hazard rate" is based on, so it is moot to ask what could have been. The probability is 0.
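The arithmetic in this reading can be checked with a short sketch. It assumes the standard form of Laplace's rule of succession, (s + 1) / (n + 2), with s = 0 key events observed in n = 20 centuries; the function names are hypothetical and only serve the illustration.

```python
from fractions import Fraction


def laplace_rule(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (s + 1) / (n + 2)."""
    return Fraction(successes + 1, trials + 2)


def prob_at_least_one(p: Fraction, periods: int) -> float:
    """Chance of at least one event in `periods` independent periods."""
    return float(1 - (1 - p) ** periods)


# No key event observed in 20 past centuries.
per_century = laplace_rule(successes=0, trials=20)
over_twenty = prob_at_least_one(per_century, periods=20)

print(per_century)            # 1/22
print(round(over_twenty, 3))  # 0.606
```

This only reproduces the 1/22 and 60.6% figures. It does not settle the objection that the estimate is conditioned on having observed no key event.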

This seems off, and I think the problem is equating "no key event" with "not hingy", which is too simple because one can potentially also influence key events in the distant future. (Or perhaps there aren't even any key events, or there are other ways to have a lasting impact.)

Comment by tobias_baumann on How do most utilitarians feel about "replacement" thought experiments? · 2019-09-07T20:31:33.923Z · score: 7 (6 votes) · EA · GW

I don't understand why this question has been downvoted by some people? It is a perfectly reasonable and interesting question. (The same holds for comments by Simon Knutsson and Magnus Vinding, which to me seem informative and helpful but have been downvoted.)

Comment by tobias_baumann on Are we living at the most influential time in history? · 2019-09-07T11:02:00.992Z · score: 3 (5 votes) · EA · GW

The following is yet another perspective on which prior to use, which questions whether we should assume some kind of uniformity principle:

As has been discussed in other comments and the initial text, there are some reasons to expect later times to be hingier (e.g. better knowledge) and there are some reasons to expect earlier times to be hingier (e.g. because of smaller populations). It is plausible that these reasons skew one way or another, and this effect might outweigh other sources of variance in hinginess.

That means that the hingiest times are disproportionately likely to be either a) the earliest generation (e.g. humans in pre-historic population bottlenecks) or b) the last generation (i.e. the time just before some lock-in happens). Our time is very unlikely to be the hingiest in this perspective (unless you think that lock-in happens very soon). So this suggests a low prior for HoH; however, what matters is arguably comparing present hinginess to the future, rather than to the past. And in this perspective it would be not-very-unlikely that our time is hingier than all future times.

In other words, rather than there being anything special about our time, it could just be the case that a) hinginess generally decreases over time and b) this effect is stronger than other sources of variance in hinginess. I'm fairly agnostic about both of these claims, and Will argued against a), but it's surely likelier than 1 in 100000 (in the absence of further evidence), and arguably likelier even than 5%. (This isn't exactly HoH because past times would be even hingier.)

Comment by tobias_baumann on Are we living at the most influential time in history? · 2019-09-05T11:04:43.795Z · score: 6 (5 votes) · EA · GW
inverse relationship between population size and hingeyness

Maybe it's a nitpick but I don't think this is always right. For instance, suppose that from now on, population size declines by 20% each century (indefinitely). I don't think that would mean that later generations are more hingy? Or, imagine a counterfactual where population levels are divided by 10 across all generations – that would mean that one controls a larger fraction of resources but can also affect fewer beings, which prima facie cancels out.

It seems to me that the relevant question is whether the present population size is small compared to the future, i.e. whether the present generation is a "population bottleneck". (Cf. Max Daniel's comment.) That's arguably true for our time (especially if space colonisation becomes feasible at some point) and also in the rebuilding scenario you mentioned.

Comment by tobias_baumann on Are we living at the most influential time in history? · 2019-09-04T14:50:07.082Z · score: 2 (2 votes) · EA · GW

Do you think that this effect only happens in very small populations settling new territory, or is it generally the case that a smaller population means more hinginess? If the latter, then that suggests that, all else equal, the present is hingier than the future (though the past is even hingier), if we assume that future populations are bigger (possibly by a large factor). While the current population is not small in absolute terms, it could plausibly be considered a population bottleneck relative to a future cosmic civilisation (if space colonisation becomes feasible).

Comment by tobias_baumann on Are we living at the most influential time in history? · 2019-09-03T12:02:36.463Z · score: 11 (11 votes) · EA · GW

Great post! It's great to see more thought going into these issues. Personally, I'm quite sceptical about claims that our time is especially influential, and I don't have a strong view on whether our time is more or less hingy than other times. Some additional thoughts:

I got the impression that you assume that some time (or times) are particularly hingy (and then go on to ask whether it's our time). But it is also perfectly possible that no time is hingy, so I feel that this assumption needs to be justified. Of course, there is some variation and therefore there is inevitably a most influential time, but the crux of the matter is whether there are differences by a large factor (not just 1.5x). And that is not obvious; for instance, if we look at how people in the past could have shaped 21st century societies, it is not clear to me whether any time was especially important.

I think a key question for longtermism is whether the evolution of values and power will eventually settle in some steady state (i.e. the end of history). It is plausible that hinginess increases as one gets closer to this point. (But it's not obvious, e.g. there could just be a slow convergence to a world government without any pivotal events.) By contrast, if values and influence drift indefinitely, as they have so far in human history, then I don't see strong reasons to expect certain times to be particularly hingy. So it is crucial to ask whether a (non-extinction) steady state will happen, and how far away we are from it. (See also this related post of mine.)

"I suggest that in the past, we have seen hinginess increase. I think that most longtermists I know would prefer that someone living in 1600 passed resources onto us, today, rather than attempting direct longtermist influence."

Does this take into account that there were fewer people around in 1600, and many ways to have an influence were far less competitive? I feel that a person in 1600 could have had a significant impact, e.g. via advocacy for the "right" moral views (e.g. publishing good arguments for consequentialism, antispeciesism, etc.) or by pushing for general improvements like reducing violence and increasing cooperation. So I don't quite agree with your take on this, though I wouldn't claim the opposite either – it is not obvious to me whether hinginess increased or decreased. (By your inductive argument, that suggests that it's not clear whether the future will be more or less hingy than the present.)

"A related, but more general, argument, is that the most pivotal point in time is when we develop techniques for engineering the motivations and values of the subsequent generation (such as through AI, but also perhaps through other technology, such as genetic engineering or advanced brainwashing technology), and that we’re close to that point."

Similar to your recent point about how creating smarter-than-human intelligence has long been feasible, I'd guess that, given strong enough motivation, a lock-in would already be feasible via brainwashing, propaganda, and sufficiently ruthless oppression of opposition. (We've had these "technologies" for a long time.) The reason why this doesn't quite work in totalitarian states is that a) what you want to lock in is usually the power of an individual dictator or some group of humans, but there's no way to prevent death, and b) people are not fully aligned with the dictator even at the beginning, which limits what you can do (principal-agent problems etc.). The reason we don't do it in liberal democracies is that a) we strongly disapprove of the necessary methods, b) we value free speech and personal autonomy, and c) most people don't really mind moderate forms of value drift. So it's to a large extent a question of motivation and taboos, and it is quite possible that people will reject the use of future lock-in technologies for similar reasons.

Comment by tobias_baumann on Ask Me Anything! · 2019-08-30T10:11:10.228Z · score: 4 (4 votes) · EA · GW
There’s a lot of debate about the causes of the industrial revolution. Very few commentators point to some technological breakthrough as the cause, so it's striking that people are inclined to point to a technological breakthrough in AI as the cause of the next growth mode transition. Instead, leading theories point to some resource overhang (‘colonies and coal’), or some innovation or change in institutions (more liberal laws and norms in England, or higher wages incentivising automation) or in culture. So perhaps there’s some novel governance system that could drive a higher growth mode, and that'll be the decisive thing.

Strongly agree. I think it's helpful to think about it in terms of the degree to which social and economic structures optimise for growth and innovation. Our modern systems (capitalism, liberal democracy) do reward innovation - and maybe that's what caused the growth mode change - but we're far away from strongly optimising for it. We care about lots of other things, and whenever there are constraints, we don't sacrifice everything on the altar of productivity / growth / innovation. And, while you can make money by innovating, the incentive is more about innovations that are marketable in the near term, rather than maximising long-term technological progress. (Compare e.g. an app that lets you book taxis in a more convenient way vs. foundational neuroscience research.)

So, a growth mode change could be triggered by any social change (culture, governance, or something else) that results in significantly stronger optimisation pressures for long-term innovation.

That said, I don't really see concrete ways in which this could happen and current trends do not seem to point in this direction. (I'm also not saying this would necessarily be a good thing.)

Comment by tobias_baumann on Ask Me Anything! · 2019-08-22T09:16:31.366Z · score: 20 (15 votes) · EA · GW

I disagree with your implicit claim that Will's views (which I mostly agree with) constitute an extreme degree of confidence. I think it's a mistake to approach these questions with a 50-50 prior. Instead, we should consider the base rate for "events that are at least as transformative as the industrial revolution".

That base rate seems pretty low. And that's not actually what we're talking about - we're talking about AGI, a specific future technology. In the absence of further evidence, a prior of <10% on "AGI takeoff this century" seems not unreasonable to me. (You could, of course, believe that there is concrete evidence on AGI to justify different credences.)

On a different note, I sometimes find the terminology of "no x-risk", "going well" etc. unhelpful. It seems more useful to me to talk about concrete outcomes and separate this from normative judgments. For instance, I believe that extinction through AI misalignment is very unlikely. However, I'm quite uncertain about whether people in 2019, if you handed them a crystal ball that shows what will happen (regarding AI), would generally think that things are "going well", e.g. because people might disapprove of value drift or influence drift. (The future will plausibly be quite alien to us in many ways.) And finally, in terms of my personal values, the top priority is to avoid risks of astronomical suffering (s-risks), which is another matter altogether. But I wouldn't equate this with things "going well", as that's a normative judgment and I think EA should be as inclusive as possible towards different moral perspectives.

Comment by tobias_baumann on Ask Me Anything! · 2019-08-20T14:21:37.585Z · score: 32 (13 votes) · EA · GW

Very interesting points! I largely agree with your (new) views. Some thoughts:

  • If you think that extinction risk this century is less than 1%, then in particular, you think that extinction risk from transformative AI is less than 1%. So, for this to be consistent, you have to believe either
    • a) that it's unlikely that transformative AI will be developed at all this century, or
    • b) that transformative AI is unlikely to lead to extinction when it is developed, e.g. because it will very likely be aligned in at least a narrow sense. (I wrote up some arguments for this a while ago.)
  • To what extent do you believe each of the two? For instance, if you put 10% on transformative AI this century – which is significantly more conservative than "median EA beliefs" – then you’d have to believe that the conditional probability of extinction is less than 10%. (I’m not saying I disagree – in fact, I believe something along these lines myself.)
  • What do you think about the possibility of a growth mode change (i.e. much faster pace of economic growth and probably also social change, comparable to the industrial revolution) for reasons other than AI? I feel that this is somewhat neglected in EA – would you agree with that?
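
The arithmetic behind the 10% example above can be written out as a small sketch. It assumes only that the probability of extinction via transformative AI factors as P(TAI) × P(extinction | TAI); the function name and the sample inputs are illustrative and not taken from the original discussion.

```python
def max_conditional_extinction_risk(p_tai: float, risk_cap: float = 0.01) -> float:
    """Upper bound on P(extinction | TAI) implied by P(TAI) * P(extinction | TAI) < risk_cap."""
    if not 0.0 < p_tai <= 1.0:
        raise ValueError("p_tai must be in (0, 1]")
    # A probability can never exceed 1, so cap the bound there.
    return min(1.0, risk_cap / p_tai)


if __name__ == "__main__":
    # Hypothetical credences in transformative AI this century.
    for p_tai in (0.1, 0.5, 0.9):
        bound = max_conditional_extinction_risk(p_tai)
        print(f"P(TAI) = {p_tai:.0%}: P(extinction | TAI) must be below {bound:.1%}")
```

The higher the credence in transformative AI this century, the more confident one has to be in option b) for the overall estimate of less than 1% to hold.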


I’d also be interested in more details on what these beliefs imply in terms of how we can improve the long-term future. I suppose you are now more sceptical about work on AI safety as the “default” long-termist intervention. But what is the alternative? Do you think we should focus on broad improvements to civilisation, such as better governance, working towards compromise and cooperation rather than conflict / war, or generally trying to make humanity more thoughtful and cautious about new technologies and the long-term future? These are uncontroversially good but not very neglected, and it seems hard to get a lot of leverage in this way. (Then again, maybe there is no way to get extraordinary leverage over the long-term future.)

Also, if we aren't at a particularly influential point in time regarding AI, then I think that expanding the moral circle, or otherwise advocating for "better" values, may be among the best things we can do. What are your thoughts on that?

Comment by tobias_baumann on Invertebrate Welfare Cause Profile · 2019-07-30T10:43:49.848Z · score: 3 (3 votes) · EA · GW

Thanks Jason – I'm excited to see more research on this!

What do you make of the possibility of flow-through effects on long-term attitudes towards insects / invertebrates? For instance, one could argue that entomophagy is particularly relevant because it involves a lot of people directly harming insects – which might, similar to meat consumption, bias people against giving moral weight to insects. (On the other hand, we already engage in many other everyday practices that harm insects or invertebrates – even just walking around outside will squash some bugs.)

Perhaps it would be interesting to study how the salience of causing direct harm to insects / invertebrates affects people's attitudes?

Comment by tobias_baumann on Invertebrate Welfare Cause Profile · 2019-07-29T18:08:49.853Z · score: 6 (3 votes) · EA · GW

Excellent work!

Re: entomophagy, I think the problem isn't just direct consumption, but also the use of insects as animal feed – see e.g. this article. Unlike directly eating insects, this doesn't evoke a strong disgust reaction.

Comment by tobias_baumann on Effective animal advocacy movement building: a neglected opportunity? · 2019-06-12T09:42:26.181Z · score: 4 (4 votes) · EA · GW

Great post – thanks for writing this up!

Comment by tobias_baumann on Risk factors for s-risks · 2019-02-17T21:18:51.280Z · score: 1 (1 votes) · EA · GW

Thank you – great to hear that you've found it useful!

Comment by tobias_baumann on Why I expect successful (narrow) alignment · 2019-01-02T12:31:36.379Z · score: 1 (1 votes) · EA · GW

Thanks for the detailed comments!

(Also, BTW, I would have preferred the word "narrow" or something like it in the post title, because some people use "alignment" in a broad sense and as a result may misinterpret you as being more optimistic than you actually are.)

Good point – changed the title.

Also, distributed emergence of AI is likely not safer than centralized AI, because an "economy" of AIs would be even harder to control and harness towards human values than a single or small number of AI agents.

As long as we consider only narrow alignment, it does seem safer to me in that local misalignment or safety issues in individual systems would not immediately cause everything to break down, because such a system would (arguably) not be able to obtain a decisive strategic advantage and take over the world. So there'd be time to react.

But I agree with you that an economy-like scenario entails other safety issues, and aligning the entire "economy" with human (compromise) values might be very difficult. So I don't think this is safer overall, or at least it's not obvious. (From my suffering-focused perspective, distributed emergence of AI actually seems worse than a scenario of the form "a single system quickly takes over and forms a singleton", as the latter seems less likely to lead to conflict-related disvalue.)

This assumes that alignment work is highly parallelizable. If it's not, then doing more alignment work now can shift the whole alignment timeline forward, instead of just adding to the total amount of alignment work in a marginal way.

Yeah, I do think that alignment work is fairly parallelizable, and future work also has a (potentially very big) information advantage over current work because future researchers will know more about what AI techniques look like. Is there any precedent of a new technology where work on safety issues was highly serial and where it was therefore crucial to start working on safety a long time in advance?

This only applies to short-term "alignment" and not to long-term / scalable alignment. That is, I have an economic incentive to build an AI that I can harness to give me short-term profits, even if that's at the expense of the long term value of the universe to humanity or human values. This could be done for example by creating an AI that is not at all aligned with my values and just giving it rewards/punishments so that it has a near-term instrumental reason to help me (similar to how other humans are useful to us even if they are not value aligned to us).

I think there are two different cases:

  • If the human actually cares only about short-term selfish gain, possibly at the expense of others, then this isn't a narrow alignment failure, it's a cooperation problem. (But I agree that it could be a serious issue.)
  • If the human actually cares about the long term, then it appears that she's making a mistake by buying an AI system that is only aligned in the short term. So it comes down to human inadequacy – given sufficient information she'd buy a long-term aligned AI system instead, and AI companies would have an incentive to provide long-term aligned AI systems. Though of course the "sufficient information" part is crucial, and is a fairly strong assumption as it may be hard to distinguish between "short-term alignment" and "real" alignment. I agree that this is another potentially serious problem.

I think we ourselves don't know how to reliably distinguish between "attempts to manipulate" and "attempts to help" so it would be hard to AIs to learn this. One problem is, our own manipulate/help classifier was trained on a narrow set of inputs (i.e., of other humans manipulating/helping) and will likely fail when applied to AIs due to distributional shift.

Interesting point. I think I still have an intuition that there's a fairly simple core to it, but I'm not sure how to best articulate this intuition.

Comment by tobias_baumann on Why I expect successful (narrow) alignment · 2018-12-30T12:14:02.976Z · score: 2 (2 votes) · EA · GW

Working on these problems makes a lot of sense, and I'm not saying that the philosophical issues around what "human values" means will likely be solved by default.

I think increasing philosophical sophistication (or "moral uncertainty expansion") is a very good idea from many perspectives. (A direct comparison to moral circle expansion would also need to take relative tractability and importance into account, which seems unclear to me.)

Comment by tobias_baumann on Why I expect successful (narrow) alignment · 2018-12-30T11:51:46.672Z · score: 3 (3 votes) · EA · GW

Great point – I agree that it would be valuable to have a common scale.

I'm a bit surprised by the 1-10% estimate. This seems very low, especially given that "serious catastrophe caused by machine intelligence" is broader than narrow alignment failure. If we include possibilities like serious value drift as new technologies emerge, or difficult AI-related cooperation and security problems, or economic dynamics riding roughshod over human values, then I'd put much more than 10% (plausibly more than 50%) on something not going well.

Regarding the "other thoughtful people" in my 80% estimate: I think it's very unclear who exactly one should update towards. What I had in mind is that many EAs who have thought about this appear to not have high confidence in successful narrow alignment (not clear if the median is >50%?), judging based on my impressions from interacting with people (which is obviously not representative). I felt that my opinion is quite contrarian relative to this, which is why I felt that I should be less confident than the inside view suggests, although as you say it's quite hard to grasp what people's opinions actually are.

On the other hand, one possible interpretation (but not the only one) of the relatively low level of concern for AI risk among the larger AI community and societal elites is that people are quite optimistic that "we'll know how to cross that bridge once we get to it".

Comment by tobias_baumann on Problems with EA representativeness and how to solve it · 2018-08-06T08:53:16.535Z · score: 7 (9 votes) · EA · GW

Agreed. As someone who prioritises s-risk reduction, I find it odd that long-termism is sometimes considered equivalent to x-risk reduction. It is legitimate if people think that x-risk reduction is the best way to improve the long-term future, but it should be made clear that this is based on additional beliefs about ethics (rejecting suffering-focused views and not being very concerned about value drift), about how likely x-risks in this century are, and about how tractable it is to reduce them, relative to other ways of improving the long-term future. I for one think that none of these points is obvious.

So I feel that there is a representativeness problem between x-risk reduction and other ways of improving the long-term future (not necessarily only s-risk reduction), in addition to an underrepresentation of near-term causes.

Comment by tobias_baumann on Multiverse-wide cooperation in a nutshell · 2017-11-02T16:25:27.427Z · score: 4 (4 votes) · EA · GW

Thanks for writing this up!

I think the idea is intriguing, and I agree that this is possible in principle, but I'm not convinced of your take on its practical implications. Apart from heuristic reasons to be sceptical of a new idea on this level of abstractness and speculativeness, my main objection is that a high degree of similarity with respect to reasoning (which is required for the decisions to be entangled) probably goes along with at least some degree of similarity with respect to values. (And if the values of the agents that correlate with me are similar to mine, then the result of taking them into account is also closer to my own values than the compromise value system of all agents.)

You write:

Superrationality only motivates cooperation if one has good reason to believe that another party’s decision algorithm is indeed extremely similar to one’s own. Human reasoning processes differ in many ways, and sympathy towards superrationality represents only one small dimension of one’s reasoning process. It may very well be extremely rare that two people’s reasoning is sufficiently similar that, having common knowledge of this similarity, they should rationally cooperate in a prisoner’s dilemma.

Conditional on this extremely high degree of similarity to me, isn't it also more likely that their values are similar to mine? For instance, if my reasoning is shaped by the experiences I've had, my genetic makeup, or the set of all ideas I've read about over the course of my life, then an agent with identical or highly similar reasoning would also share a lot of these characteristics. But of course, my experiences, genes, etc. also determine my values, so similarity with respect to these factors implies similarity with respect to values.

This is not the same as claiming that a given characteristic X that's relevant to decision-making is generally linked to values, in the sense that people with X have systematically different values. It's a subtle difference: I'm not saying that certain aspects of reasoning generally go along with certain values across the entire population; I'm saying that a high degree of similarity regarding reasoning goes along with similarity regarding values.

Comment by tobias_baumann on An Argument for Why the Future May Be Good · 2017-07-20T08:40:43.714Z · score: 12 (18 votes) · EA · GW

Thanks for writing this up! I agree that this is a relevant argument, even though many steps of the argument are (as you say yourself) not airtight. For example, consciousness or suffering may be related to learning, in which case point 3) is much less clear.

Also, the future may contain vastly larger populations (e.g. because of space colonization), which, all else being equal, may imply (vastly) more suffering. Even if your argument is valid and the fraction of suffering decreases, it's not clear whether the absolute amount will be higher or lower (as you claim in 7.).

Finally, I would argue we should focus on the bad scenarios anyway – given sufficient uncertainty – because there's not much to do if the future will "automatically" be good. If s-risks are likely, my actions matter much more.

(This is from a suffering-focused perspective. Other value systems may arrive at different conclusions.)

Comment by tobias_baumann on My current thoughts on MIRI's "highly reliable agent design" work · 2017-07-08T08:31:50.583Z · score: 1 (1 votes) · EA · GW

Do you mean more promising than other technical safety research (e.g. concrete problems, Paul's directions, MIRI's non-HRAD research)?

Yeah, and also (differentially) more promising than AI strategy or AI policy work. But I'm not sure how strong the effect is.

If so, I'd be interested in hearing why you think hard / unexpected takeoff differentially favors HRAD.

In a hard / unexpected takeoff scenario, it's more plausible that we need to get everything more or less exactly right to ensure alignment, and that we have only one shot at it. This might favor HRAD because a less principled approach makes it comparatively unlikely that we get all the fundamentals right when we build the first advanced AI system.

In contrast, if we think there's no such discontinuity and AI development will be gradual, then AI control may be at least somewhat more similar (but surely not entirely comparable) to how we "align" contemporary software systems. That is, it would be more plausible that we could test advanced AI systems extensively without risking catastrophic failure or that we could iteratively try a variety of safety approaches to see what works best.

It would also be more likely that we'd get warning signs of potential failure modes, so that it's comparatively more viable to work on concrete problems whenever they arise, or to focus on making the solutions to such problems scalable – which, to my understanding, is a key component of Paul's approach. In this picture, successful alignment without understanding the theoretical fundamentals is more likely, which makes non-HRAD approaches more promising.

My personal view is that a hard and unexpected takeoff is unlikely, and I accordingly favor approaches other than HRAD, but of course I can't justify high confidence in this given expert disagreement. Similarly, I'm not highly confident that the above distinction is actually meaningful.

I'd be interested in hearing your thoughts on this!

Comment by tobias_baumann on My current thoughts on MIRI's "highly reliable agent design" work · 2017-07-07T14:49:05.074Z · score: 1 (1 votes) · EA · GW

Great post! I agree with your overall assessment that other approaches may be more promising than HRAD.

I'd like to add that this may (in part) depend on our outlook on which AI scenarios are likely. Conditional on MIRI's view that a hard or unexpected takeoff is likely, HRAD may be more promising (though it's still unclear). If the takeoff is soft or AI will be more like the economy, then I personally think HRAD is unlikely to be the best way to shape advanced AI.

(I wrote a related piece on strategic implications of AI scenarios.)

Comment by tobias_baumann on The asymmetry and the far future · 2017-03-10T10:00:15.600Z · score: 12 (12 votes) · EA · GW

Thanks for your post! I agree that work on preventing risks of future suffering is highly valuable.

It’s tempting to say that it implies that the expected value of a miniscule increase in existential risk to all sentient life is astronomical.

Even if the future is negative according to your values, there are strong reasons not to increase existential risk. This would be extremely uncooperative towards other value systems, and there are many good reasons to be nice to other value systems. It is better to pull the rope sideways by working to improve the future (i.e. reducing risks of astronomical suffering) conditional on there being a future.

In addition, I think it makes sense for utilitarians to adopt a quasi-deontological rule against using violence, regardless of whether one is a classical utilitarian or suffering-focused. This obviously prohibits something like increasing risks of extinction.

Comment by tobias_baumann on Students for High Impact Charity: Review and $10K Grant · 2016-10-17T00:35:42.334Z · score: 3 (3 votes) · EA · GW

Thanks a lot, Peter, for taking the time to evaluate SHIC! I agree that their work seems to be very promising.

In particular, it seems that students and future leaders are one of the most important target groups of effective altruism.