Pascal's Mugging and abandoning credences
post by AndreaSR
This is a question post.
What are the theoretical obstacles to abandoning expected utility calculations regarding extreme outcomes like x-risk from a rogue AI system, in order to avoid biting the bullet on Pascal's Mugging? Does Bayesian epistemology really require that we assign a credence to every proposition, and if so, shouldn't we reject this framework in order to avoid fanaticism? It does not seem rational to me that we should assign credences to, e.g., the success of specific x-risk mitigation interventions when there are so many unknown unknowns governing the eventual outcome.
I hope you can help me sort out this confusion.
answer by djbinder
Attempts to reject fanaticism necessarily lead to major theoretical problems, as described for instance here and here.
However, questions about fanaticism are not that relevant for most questions about x-risk. The x-risks of greatest concern to most long-termists (AI risk, bioweapons, nuclear weapons, climate change) all have reasonable odds of occurring within the next century or so, and even if we care only about humans living in the next century or so, we would find that these are valuable to prevent. This is mostly a consequence of the huge number of people alive today.
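The near-term arithmetic behind this point can be sketched as follows. All numbers here are illustrative assumptions (a rough present population, a hypothetical catastrophe probability, and a hypothetical intervention effect), not claims from the comment itself:

```python
# Rough sketch of why near-term stakes alone can make x-risk reduction valuable,
# without appealing to astronomical numbers of future people.
# Every number below is an illustrative assumption.

population = 8e9        # roughly the number of people alive today
p_catastrophe = 0.01    # assumed chance of an extinction-level event this century
risk_reduction = 0.10   # assumed fraction of that risk an intervention removes

# Expected lives saved, counting only people alive today:
expected_lives_saved = population * p_catastrophe * risk_reduction  # ≈ 8 million
```

Even under these deliberately modest assumptions, the expected near-term benefit is large, which is why the answer says fanaticism-style reasoning about 10^52 future lives isn't needed to justify the interventions.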
↑ comment by MichaelStJules · 2021-07-09T23:13:03.296Z
I think timidity, as described in your first link, e.g. with a bounded social welfare function, is basically okay, but it's a matter of intuition (similarly, discomfort with Pascalian problems is a matter of intuition). However, it does mean giving up separability in probabilistic cases, and it may instead support x-risk reduction (depending on the details).
I would also recommend
Also, questions of fanaticism may be relevant for these x-risks, since it's not the probability of the risks that matters, but the difference you can make. There's also ambiguity, since it's possible to do more harm than good, by increasing the risk instead or by increasing other risks (e.g. reducing extinction risks may increase s-risks, and you may be morally uncertain about how to weigh these).
↑ comment by AndreaSR · 2021-07-09T17:01:28.992Z
Thanks for your answer. I don't think I understand what you're saying, though. As I understand it, it makes a huge difference to the resource distribution that longtermism recommends, because if you allow for e.g. Bostrom's 10^52 happy lives to be the baseline utility, avoiding x-risk becomes vastly more important than if you just consider the 10^10 people alive today. Right?
↑ comment by djbinder · 2021-07-09T18:52:28.135Z
In principle I agree, although in practice there are other mitigating factors which mean it doesn't seem to be that relevant.
This is partly because the 10^52 number is not very robust. In particular, once you start postulating such large numbers of future people I think you have to take the simulation hypothesis much more seriously, so that the large size of the far future may in fact be illusory. But even on a more mundane level we should probably worry that achieving 10^52 happy lives might be much harder than it looks.
It is partly also because at a practical level the interventions long-termists consider don't rely on the possibility of 10^52 future lives, but are good even over just the next few hundred years. I am not aware of many things that have smaller impacts and yet still remain robustly positive, such that we would only pursue them due to the 10^52 future lives. This is essentially for the reasons that asolomonr gives in their comment.
answer by Harrison Durland (Harrison D)
- Bayesianism is largely about how to assign probabilities to things; it is not an ethical/normative doctrine like utilitarianism that tells you how you should prioritize your time. And as a (non-naïve) utilitarian will emphasize, when doing so-called “utilitarian calculus” (and related forms of analysis) is less efficient/effective than using intuition, you should rely on intuition.
- Especially when dealing with facially implausible/far-fetched claims about extremely high risk, I think it’s helpful to fight dubious fire with similarly dubious fire and then trim off the ashes: if someone says “there’s a slight (0.001%) chance that this (weird/dubious) intervention Y could prevent extinction, and that’s extremely important,” you might be able to argue that it is equally or even more likely that doing Y backfires or that doing Y prevents you from doing intervention Z which plausibly has a similar (unlikely) chance of preventing extinction. (See longer illustration block below)
In the end, these two points are not the only things to consider, but I think they tend to be the most neglected/overlooked whereas the complementary concepts are decently understood (although I might be forgetting something else).
Regarding 2 in more detail: take for example classic Pascal's mugging-type situations, like "A strange-looking man in a suit walks up to you and says that he will warp up to his spaceship and detonate a super-mega nuke that will eradicate all life on earth if and only if you do not give him $50 (which you have in your wallet), but he will give you $3^^^3 tomorrow if and only if you give him $50." You could technically/formally suppose the chance he is being truthful is nonzero (e.g., 0.0000000001%), yet still abide by expected value theory if you suppose that there are indistinguishably likely cases that produce the opposite expected value -- for example, the possibility that he is telling you the exact opposite of what he will do if you give him the money (for comparison, see the philosopher-God response to Pascal's wager), or the possibility that the "true" mega-punisher/rewarder is actually just a block down the street, and if you give your money to this random lunatic you won't have the $50 to give to the true one (for comparison, see the "other religions" response to the narrow/Christianity-specific Pascal's wager). More realistically, that $50 might be better donated to an x-risk charity. Add in the fact that stopping and thinking through this entire situation would be a waste of time that you could perhaps be using to help avert catastrophes in some other way (e.g., making money to donate to x-risk charities), and you've got a pretty strong case for not even entertaining the fantasy for a few seconds, and thus not getting paralyzed by naive application of expected value theory.
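The cancellation argument above can be written out as a toy expected-value calculation. The probabilities and payoff below are stand-ins I've chosen for illustration (a real 3^^^3 won't fit in a float), not figures from the answer:

```python
# Toy sketch of "fighting dubious fire with dubious fire": if the mugger's
# promise and the equally plausible opposite scenario get indistinguishable
# tiny credences, the exotic terms cancel and only the mundane $50 loss remains.
# All numbers are illustrative assumptions.

p = 1e-10        # credence that the mugger pays out as promised
payout = 3**7    # stand-in for the astronomical reward (3^^^3 is unrepresentable)

ev_exotic_upside = p * payout     # scenario: he is truthful
ev_exotic_downside = -p * payout  # equally plausible: he does the exact opposite

# Expected value of handing over the $50: the symmetric exotic terms cancel,
# leaving the certain mundane cost.
ev_give = ev_exotic_upside + ev_exotic_downside - 50  # -50.0
```

The point isn't that the cancellation is exactly symmetric in reality, but that once opposite-sign scenarios are admitted at similar credences, the decision is dominated by the ordinary, well-understood costs.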
↑ comment by AndreaSR · 2021-07-09T16:53:48.686Z
Thanks for your reply. A follow-up question: when I see the 'cancelling out'-argument, I always wonder why it doesn't apply to the x-risk case itself. It seems to me that you could just as easily argue that halting biotech research in order to enter the Long Reflection might backfire in some unpredictable way, or that aiming at Bostrom's utopia would ruin the chances of ending up in a vastly better state that we had never even dreamt of - and so on and so forth.
Isn't the whole case for longtermism so empirically uncertain as to be open to the 'cancelling out'-argument as well?
I hope what I'm saying makes sense.
↑ comment by Harrison Durland (Harrison D) · 2021-07-09T20:27:00.500Z
I do understand what you are saying, but my response (albeit as someone who is not steeped in longtermist/x-risk thought) would be "not necessarily (and almost certainly not entirely)." The tl;dr version is: "there are lots of claims about x-risks and interventions to reduce x-risks that are reasonably more plausible than their reverse claim." E.g., there are decent reasons to believe that certain forms of pandemic preparation reduce x-risk more than they increase it. I can't (yet) give full, formalistic rules for how I apply the trimming heuristic, but some of the major points are discussed in the blocks below.
One key to using/understanding the trimming heuristic is that it is not meant to directly maximize the accuracy of your beliefs; rather, it's meant to improve the effectiveness of your overall decision-making *in light of constraints on your time/cognitive resources*. If we had infinite time to evaluate everything--even possibilities that seem like red herrings--it would probably (usually) be optimal to do so, but we don't have infinite time, so we have to make decisions as to what to spend our time analyzing and what to accept as "best-guesstimates" for particularly fuzzy questions. Here, intuition (including "when should we rely on various levels of intuition/analysis") can be far more effective than formalistic rules.
I think another key is to understand the distinction between risk and uncertainty: (to heavily simplify) risk refers to confidently verifiable/specific probabilities (e.g., a 1/20 chance of rolling a 1 on a standard 20-sided die) whereas uncertainty refers to when we don't confidently know the specific degree of risk (e.g., the chance of rolling a 1 on a confusingly-shaped 20-sided die which has never rolled a 1 yet, but perhaps might eventually).
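The die example above can be made concrete with a small sketch. The candidate biases and the credences over them are hypothetical numbers I've made up to illustrate second-order uncertainty:

```python
# Risk vs. uncertainty, per the comment's die example.

# Risk: the probability itself is known (a fair d20).
p_known = 1 / 20  # P(roll a 1) = 0.05

# Uncertainty: the confusingly-shaped die's bias is unknown. One common way to
# model this is a credence distribution over plausible probabilities.
# Both lists below are illustrative assumptions.
plausible_ps = [0.0, 0.01, 0.05, 0.20]  # hypothetical candidate biases
credences =    [0.4, 0.30, 0.20, 0.10]  # hypothetical weights on each candidate

# Best-guess probability: average over the candidates.
p_uncertain = sum(p * w for p, w in zip(plausible_ps, credences))  # ≈ 0.033
```

Both situations yield a single number you could plug into an EV calculation, but in the uncertain case that number summarizes a wide spread of possibilities, which is exactly why the trimming heuristic treats such estimates with more suspicion.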
In the end, I think my three-to-four conditions, or at least factors, for using the trimming heuristic are:
- There is a high degree of uncertainty associated with the claim (e.g., it is not a well-established fact that there is a +0.01% chance of extinction upon enacting this policy).
- The claim seems rather implausible/exaggerated on its face, but would require a non-trivial amount of time to clearly explain why (since it gets increasingly difficult to show why you ought to increase the number of zeros after a decimal point).
- You can quickly fight fire with fire (e.g., think of opposite-outcome claims like I described).
- There are other, more-realistic arguments to consider, and your time is limited.
comment by asolomonr · 2021-07-09T11:55:43.888Z
(I'm putting this as a comment and not an answer to reflect that I have a few tentative thoughts here but they're not well developed)
A really useful source that explains a Bayesian method of avoiding Pascal's mugging is this GiveWell post. TL;DR: much of the variation in EV estimates for situations that we know very little about comes from "estimate error", so we should have very low credence in these estimates. Even if the most likely EV estimate for an action seems very positive, if there's extremely high variance due to having very little evidence on which to base that estimate, then we wouldn't be very surprised if the actual value is net zero or even negative. The post argues that we should also incorporate some sort of prior about the probability distribution of impacts that we can expect from actions. This basically makes us more skeptical the more outlandish the claim is. As a result, we're actually less persuaded to take an action motivated by an extremely high but unfounded EV estimate than by an equally unfounded but less extreme EV estimate, which falls closer to our prior about what is generally plausible. This seems to avoid Pascal's mugging. (This was my read of the post; it's completely possible that I misunderstood something, or that persuasive critiques of this reasoning exist and I haven't encountered them so far.)
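One standard way to formalize this kind of Bayesian adjustment is a precision-weighted average of a skeptical prior and the noisy estimate (a normal-normal model). This is a minimal sketch in that spirit, not the GiveWell post's exact model, and every number below is an illustrative assumption:

```python
def bayesian_adjusted_ev(prior_mean, prior_var, estimate, estimate_var):
    """Posterior mean under a normal prior and a normally-distributed noisy
    estimate: a precision-weighted average of the two."""
    precision_prior = 1 / prior_var
    precision_est = 1 / estimate_var
    return (prior_mean * precision_prior + estimate * precision_est) / (
        precision_prior + precision_est
    )

# Two EV estimates that are "equally unfounded" (same huge estimate variance),
# one modest and one outlandish, against the same skeptical prior:
modest = bayesian_adjusted_ev(prior_mean=10, prior_var=100,
                              estimate=50, estimate_var=10_000)
outlandish = bayesian_adjusted_ev(prior_mean=10, prior_var=100,
                                  estimate=1_000_000, estimate_var=10_000)
```

The outlandish estimate ends up discounted far more heavily relative to its face value: the posterior retains only a small fraction of the claimed 1,000,000, while the modest estimate keeps a much larger share of its claimed 50. That is the mechanism by which wild, weakly-evidenced EV claims lose most of their pull.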
I think that another point here is whether the very promising but difficult-to-verify claims that you're talking about are made with consideration of a representative spectrum of the possible outcomes for an action. As a bit of a toy example (and I'm not criticizing any actual viewpoint here, this is just hopefully rhetorically illustrative): if you think that improving institutional decision-making is really positive, your basic reasoning might look like: taking some action to teach decision-makers about rationality has x small probability of reaching a person who now has y small probability of being in a position to decide something that has z hugely positive impact if decided with consideration of rational principles. Therefore the EV of me taking this action is xy * z = really big positive number. This only considers the most positive direction in which this action could unfold, since it's assumed that within the much bigger 1-xy probability there are only basically neutral outcomes. It's at least plausible, however, that some of those outcomes are actually quite bad (say that you teach a decision-maker an incorrect principle, or that you present the idea badly and so through idea inoculation you dissuade someone from becoming more rational, and this leads to some significant negative outcome). The likelihood of doing something bad is probably not that high, but say there's k chance that your action leads to m very bad outcome; then the actual EV is (xy * z) - (k * m), which might be much lower than if we only considered the positive outcomes of this action. This might suggest that the EV estimates of the types of x-risk mitigation actions you're expressing some skepticism about could be forgetting to account for the possibility that they have a negative impact, which could meaningfully lower their EV. Although people may already be factoring such considerations in and just not making that explicit.
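The (xy * z) - (k * m) arithmetic in the toy example can be worked through with made-up numbers. Every value below is an illustrative assumption chosen only to show how a small downside term can materially shrink the EV:

```python
# Toy EV calculation from the comment: upside-only vs. upside-minus-downside.
# All numbers are illustrative assumptions.

x = 0.01   # chance the action reaches a decision-maker
y = 0.001  # chance that person is in a position to decide something big
z = 1e9    # value of the hugely positive outcome
k = 5e-6   # chance the action backfires (e.g., idea inoculation)
m = 1e9    # magnitude of the very bad outcome

ev_upside_only = x * y * z            # xy * z, ≈ 10,000
ev_with_downside = x * y * z - k * m  # (xy * z) - (k * m), ≈ 5,000
```

Here an unlikely downside of comparable magnitude halves the naive estimate; with slightly different assumed values of k and m, it could flip the sign entirely, which is the comment's point about representative outcome spectra.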