Posts

The cost of slow growth chickens 2019-09-12T17:46:14.376Z · score: 17 (10 votes)
Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses 2019-08-31T23:33:58.276Z · score: 57 (21 votes)
Consumer preferences for labgrown and plant-based meat 2019-08-08T18:45:05.581Z · score: 18 (7 votes)
Realizing the Mass Public Benefit of Evidence-Based Psychological Therapies: The IAPT Program 2019-07-16T00:25:07.010Z · score: 8 (5 votes)
"Moral Bias and Corrective Practices" and the possibility of an ongoing moral catastrophe 2019-06-24T22:38:15.036Z · score: 10 (3 votes)
Summary of Cartwright and Hardie's "Evidence-based Policy: A practical guide to doing it better" 2019-06-17T21:25:25.006Z · score: 17 (9 votes)
Doning with the devil 2018-06-15T15:51:06.030Z · score: 3 (3 votes)

Comments

Comment by cole_haus on [deleted post] 2019-09-13T20:27:41.473Z

I mentioned it in my comment elsewhere, but—from a quick look at the paper and the supplementary material—I don't think it's much like any of these. They don't make any special mention that I could find of trying to translate purely economic measures into welfare. The only mention I could find about income adjustment is "rich/poor specifications" which appears to be about splitting the formula for growth of damages into one of two forms depending on whether the country is rich or poor.

More plainly, I think the final number should be interpreted as yet another estimate in the long line of social cost of carbon estimates. It seems to be measuring the same thing as all the others (i.e. not utility) and I don't know where the idea of income-adjustment in this post is coming from.

Comment by cole_haus on [deleted post] 2019-09-10T21:03:16.594Z

Thanks, this is interesting! I quickly read through the core paper and am a bit confused.

It seems like you're understanding income adjustment to be one of the main additions in the paper. Where are you seeing that? The title/abstract/etc. seem to be pitching greater spatial resolution as the main contribution. Greater spatial resolution helps with income adjustment but isn't sufficient. As far as I can tell the paper primarily uses regular old GDP per capita (with the de rigueur acknowledgement that GDP isn't a great welfare measure). The only income adjustment I see is a couple of mentions of rich/poor specifications and the supplementary information suggests that this is just splitting the formula for the growth of damages based on whether a country falls into the rich bin or poor bin.

They explain the increased cost not as due to income adjustment (as I understand things) but because:

The median estimates of the GSCC (Fig. 1) are significantly higher than the Inter-agency Working Group estimates, primarily due to the higher damages associated with the empirical macroeconomic production function

All that said, I wish it did use logarithmic utility because that seems like an important improvement!

(FYI: All the inline footnotes still link to a Google doc.)

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-08T23:49:23.503Z · score: 1 (1 votes) · EA · GW

(I think my other two recent comments sort of answer each of your questions.)

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-08T23:32:26.470Z · score: 2 (2 votes) · EA · GW

I'm a bit confused. In the GiveDirectly case for 'value of increasing consumption', you're still holding the discount rate constant, right?

Nope, it varies. One way you can check this intuitively is: if the discount rate and all other parameters were held constant, we'd have a proper function and our scatter plot would show at most one output value for each input.

taking GiveWell's point estimate as the prior mean, how do the cost-effectiveness estimates (and their uncertainty) change as we vary our uncertainty over the input parameters.

There are (at least) two versions I can think of:

  1. Adjust all the input uncertainties in concert. That is, spread all the point estimates by ±20% or all by ±30%, etc. This would be computationally tractable, but I'm not sure it would get us too much extra. I think the key problem with the current approach which would remain is that we're radically more uncertain about some of the inputs than the others.

  2. Adjust all the input uncertainties individually. That is, spread point estimate 1 by ±20%, point estimate 2 by ±10%, etc. Then, spread point estimate 1 by ±10%, spread point estimate 2 by ±20%, etc. Repeat for all combinations of spreads and inputs. This would actually give us somewhat useful information, but would be computationally intractable given the number of input parameters.
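
To make the tractability gap between the two versions concrete, here is a minimal sketch that counts how many model runs each would need. The three spread levels and the figure of 50 inputs are illustrative assumptions, not numbers from the analysis, and `count_joint_spreads` and `enumerate_joint_spreads` are hypothetical helper names.

```python
from itertools import product


def count_joint_spreads(n_inputs, spreads):
    """Number of ways to assign one spread to each input independently."""
    return len(spreads) ** n_inputs


def enumerate_joint_spreads(n_inputs, spreads):
    """Lazily yield every assignment of spreads to inputs (version 2)."""
    return product(spreads, repeat=n_inputs)


# Hypothetical spread levels: +/-10%, +/-20%, +/-30%.
spreads = [0.10, 0.20, 0.30]

# Version 1: all inputs share a spread, so one run per spread level.
concerted_runs = len(spreads)

# Version 2: every combination of spreads across inputs.
individual_runs_small = count_joint_spreads(3, spreads)
individual_runs_large = count_joint_spreads(50, spreads)  # 50 inputs is an assumed count

print(concerted_runs, individual_runs_small, individual_runs_large)
```

The count for version 2 grows exponentially in the number of inputs, which is why it stops being feasible long before the input count reaches that of a real cost-effectiveness model.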

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-08T23:23:06.087Z · score: 2 (2 votes) · EA · GW

Short version:

Do the expected values of the output probability distributions equal the point estimates that GiveWell gets from their non-probabilistic estimates?

No, but they're close.

More generally, are there any good write-ups about when and how the expected value of a model with multiple random variables differs from the same model filled out with the expected value of each of its random variables?

Don't know of any write-ups unfortunately, but the linearity of expectation means that the two are equal if and (generally?) only if the model is linear.

Long version:

When I run the Python versions of the models with point estimates, I get:

Charity             Value/$
GiveDirectly        0.0038
END                 0.0211
DTW                 0.0733
SCI                 0.0370
Sightsavers         0.0394
Malaria Consortium  0.0316
HKI                 0.0219
AMF                 0.0240

The (mostly minor) deviations from the official GiveWell numbers are due to:

  1. Different handling of floating point numbers between Google Sheets and Python
  2. Rounded/truncated inputs
  3. A couple models calculated the net present value of an annuity based on payments at the end of each period instead of the beginning--I never got around to implementing this
  4. Unknown errors

When I calculate the expected values of the probability distributions given the uniform input uncertainty, I get:

Charity             Value/$
GiveDirectly        0.0038
END                 0.0204
DTW                 0.0715
SCI                 0.0354
Sightsavers         0.0383
Malaria Consortium  0.0300
HKI                 0.0230
AMF                 0.0231

I would generally call these values pretty close.

It's worth noting though that the procedure I used to add uncertainty to inputs doesn't produce input distributions that have the original point estimate as their expected value. By creating a 90% CI at ±20% of the original value, the CI is centered around the point estimate but since log-normal distributions aren't symmetric, the expected value is not precisely at the point estimate. That explains some of the discrepancy.
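
A minimal sketch of that effect, assuming the 90% CI endpoints are matched to the 5th and 95th percentiles of a log-normal distribution. That matching rule is my assumption for illustration and the notebook may construct its distributions differently; `lognormal_from_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def lognormal_from_ci(point_estimate, spread=0.20, coverage=0.90):
    """Fit log-normal parameters so the central `coverage` interval
    runs from point_estimate*(1-spread) to point_estimate*(1+spread)."""
    lower = point_estimate * (1 - spread)
    upper = point_estimate * (1 + spread)
    # z-score of the upper percentile (about 1.645 for a 90% interval).
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    mu = (math.log(lower) + math.log(upper)) / 2
    sigma = (math.log(upper) - math.log(lower)) / (2 * z)
    return mu, sigma


mu, sigma = lognormal_from_ci(1.0)
median = math.exp(mu)                  # median of the log-normal
mean = math.exp(mu + sigma ** 2 / 2)   # expected value of the log-normal

print(f"median={median:.4f} mean={mean:.4f}")
```

Under this assumed construction, both the median and the mean land slightly below a point estimate of 1.0, even though the interval itself is arithmetically centered on it.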

The rest of the discrepancy is presumably from the non-linearity of the models (e.g. there are some logarithms in the models). In general, the linearity of expectation means that the expected value of a linear model of multiple random variables is exactly equal to the linear model of the expected values. For non-linear models, no such rule holds. (The relatively modest discrepancy between the point estimates and the expected values suggests that the models are "mostly" linear.)
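
The linear versus non-linear distinction can be checked with a small simulation. This is only a sketch: both toy models and the uniform input distributions are invented for illustration and have nothing to do with the actual GiveWell models.

```python
import math
import random

random.seed(0)
n = 10_000
# Two hypothetical uncertain inputs, each uniform on [0.5, 1.5].
xs = [random.uniform(0.5, 1.5) for _ in range(n)]
ys = [random.uniform(0.5, 1.5) for _ in range(n)]


def mean(values):
    return sum(values) / len(values)


def linear_model(x, y):
    return 3 * x + 2 * y + 1


def nonlinear_model(x, y):
    return math.log(x) + y


# Gap = (mean of model outputs) - (model evaluated at mean inputs).
linear_gap = (
    mean([linear_model(x, y) for x, y in zip(xs, ys)])
    - linear_model(mean(xs), mean(ys))
)
nonlinear_gap = (
    mean([nonlinear_model(x, y) for x, y in zip(xs, ys)])
    - nonlinear_model(mean(xs), mean(ys))
)

print(f"linear gap: {linear_gap:.2e}, non-linear gap: {nonlinear_gap:.4f}")
```

The linear gap is zero up to floating-point error, while the logarithm pulls the mean output below the output at the mean inputs (Jensen's inequality for a concave function).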

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-02T17:31:21.629Z · score: 1 (1 votes) · EA · GW

Oh, very cool! I like the idea of sampling from different GiveWell staffers' values (though I couldn't do that here since I regarded essentially all input parameters as uncertain instead of just the highlighted ones).

I hadn't thought about the MPT connection. I'll think about that more.

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-02T17:29:32.993Z · score: 13 (6 votes) · EA · GW

Thanks for your thoughts.

the thrust of what you're saying is "we should do uncertainty analysis (use Monte Carlo simulations instead of point-estimates) as our cost-effectiveness might be sensitive to it"

Yup, this is the thrust of it.

But you haven't shown that GiveWell's estimates are sensitive to a reliance on point estimates (have you?)

I think I have---conditionally. The uncertainty analysis shows that, if you think the neutral uncertainty I use as input is an acceptable approximation, substantially different rankings are within the bounds of plausibility. If I put in my own best estimates, the conclusion would still be conditional. It's just that instead of being conditional upon "if you think the neutral uncertainty I use as input is an acceptable approximation" it's conditional upon "if you think my best estimates of the uncertainty are an acceptable approximation".

So the summary point there is that there's really no way to escape conditional conclusions within a subjective Bayesian framework. Conclusions will always be of the form "Conclusion C is true if you accept prior beliefs B". This makes generic, public communication hard (as we're seeing!), but offers lots of benefits too (which I tried to demonstrate in the post---e.g. an explicit quantification of uncertainty, a sense of which inputs are most influential).

here's a new, really complicated methodology we could use

If I've given the impression that it's really complicated, I think I might have misled. One of the things I really like about the approach is that you pay a relatively modest fixed cost and then you get this kind of analysis "for free". By which I mean the complexity doesn't infect all your actual modeling code. For example, the GiveDirectly model here actually reads more clearly to me than the corresponding spreadsheet because I'm not constantly jumping around trying to figure out what the cell reference (e.g. B23) means in formulas.

Admittedly, some of the stuff about delta moment-independent sensitivity analysis and different distance metrics is a bit more complicated. But the distance metric stuff is specific to this particular problem---not the methodology in general---and the sensitivity analysis can largely be treated as a black box. As long as you understand what the properties of the resulting number are (e.g. ranges from 0-1, 0 means independence), the internal workings aren't crucial.

I think it would actually be very useful for you to input your best guess inputs (and its likely to be more useful for you to do it than an average EA, given you've thought about this more)

Given the responses here, I think I will go ahead and try that approach. Though I guess even better would be getting GiveWell's uncertainty on all the inputs (rather than just the inputs highlighted in the "User weights" and "Moral inputs" tab).

Sorry for adding even more text to what's already a lot of text :). Hope that helps.

Comment by cole_haus on Uncertainty and sensitivity analyses of GiveWell's cost-effectiveness analyses · 2019-09-01T15:46:53.092Z · score: 4 (3 votes) · EA · GW

You looked at the overall recap and saw the takeaways there? e.g. Sensitivity analysis indicates that some inputs are substantially more influential than others, and there are some plausible values of inputs which would reorder the ranking of top charities.

These are sort of meta-conclusions though and I'm guessing you're hoping for more direct conclusions. That's sort of hard to do. As I mention in several places, the analysis depends on the uncertainty you feed into it. To maintain "neutrality", I just pretended to be equally uncertain about each input. But, given this, any simple conclusions like "The AMF cost-effectiveness estimates have the most uncertainty." or "The relative cost-effectiveness is most sensitive to the discount rate." would be misleading at best.

The only way to get simple conclusions like that is to feed input parameters you actually believe in to the linked Jupyter notebook. Or I could put in my best guesses as to inputs and draw simple conclusions from that. But then you'd be learning about me as much as you'd be learning about the world as you see it.

Does that all make sense? Is there another kind of takeaway that you're imagining?

Comment by cole_haus on Movement Collapse Scenarios · 2019-08-28T21:45:18.128Z · score: 1 (1 votes) · EA · GW

Not sure what "attenuation" means in this context.

It's probably correction for attenuation: 'Correction for attenuation is a statistical procedure ... to "rid a correlation coefficient from the weakening effect of measurement error".'
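
For reference, the classical (Spearman) correction divides the observed correlation by the square root of the product of the two measures' reliabilities. A minimal sketch follows; the example numbers are made up and the function name is hypothetical.

```python
import math


def correct_for_attenuation(observed_r, reliability_x, reliability_y):
    """Spearman's correction: estimate the correlation between true scores
    from an observed correlation and the reliability of each measure."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return observed_r / math.sqrt(reliability_x * reliability_y)


# Hypothetical inputs: observed r = 0.30, reliabilities 0.70 and 0.80.
corrected = correct_for_attenuation(0.30, 0.70, 0.80)
print(round(corrected, 3))
```

With perfectly reliable measures (both reliabilities equal to 1) the correction leaves the correlation unchanged; noisier measures imply a larger underlying correlation.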

Comment by cole_haus on Key points from The Dead Hand, David E. Hoffman · 2019-08-12T18:23:08.620Z · score: 2 (2 votes) · EA · GW

I’m a bit unclear about this: it seems that this is true for dirty bombs, but it is extremely hard to make a fission bomb work.

I'm far from an expert, but Global Catastrophic Risks makes it sound like that's not the case:

With modern weapons-grade uranium, the background neutron rate is so low that terrorists, if they have such material, would have a good chance of setting off a high-yield explosion simply by dropping one half of the material onto the other half. Most people seem unaware that if separated HEU is at hand it's a trivial job to set off a nuclear explosion ... even a high school kid could make a bomb in short order.

(the book is actually quoting Luis Alvarez there)

A US government sponsored experiment in the 1960s suggests that several physics graduates without prior experience with nuclear weapons and with access to only unclassified information could design a workable implosion type bomb. The participants in the experiment pursued an implosion design because they decided a gun-type device was too simple and not enough of a challenge (Stober, 2003).

Comment by cole_haus on An overview of arguments for concern about automation · 2019-08-08T17:53:54.080Z · score: 2 (2 votes) · EA · GW

I don't really have a coherent thesis in this response, just some thoughts/references that came to mind:

  • I thought this recent paper was a pretty reasonable framework for thinking about the high-level effects of automation on labor.
  • Wage stagnation: I thought Table 1 here was a pretty good overview of different studies, their methodologies and results
  • Wage stagnation: I think the type of issue raised in GDP-B: Accounting for the Value of New and Free Goods in the Digital Economy is plausibly an important addition to the discussion. We know GDP has never been a great proxy for welfare and it's plausibly getting worse over time. Importantly, it sounds plausible that the growth of the digital economy will be correlated with automation.
  • "it seems reasonable to worry about what might happen should the conditions that led to democracy no longer hold." It also seems plausible to me that there's path dependence/hysteresis such that democracy could persist even in the face of other conditions. One story you could tell along those lines is that populations in many countries are now generally older, wealthier, and more educated.
  • On inequality and stability: Max Weber's criteria for fundamental conflict (from Theoretical Sociology) are:

(1) Membership in social class (life chances in markets and economy), party (house of power or polity), and status groups (rights to prestige and honor) are correlated with each other; those high or low in one of these dimensions of stratification are high and low in the other two.

(2) High levels of discontinuity in the degrees of inequality within social hierarchies built around class, party, and status; that is, there are large gaps between those at high positions and those in middle positions, with large differences between the latter and those in lower positions with respect to class location, access to power, and capacity to command respect. And,

(3) low rates of mobility up and, also, down these hierarchies, thereby decreasing chances for those low in the system of stratification from bettering their station in life.

Comment by cole_haus on "Why Nations Fail" and the long-termist view of global poverty · 2019-07-23T07:44:07.105Z · score: 3 (2 votes) · EA · GW

Yes, I agree they're very incomplete--as advertised. I also think the original claims they're responding to are pretty incomplete.

I.

I agree that time horizons are finite. If you're taking that as meaning that the defect/defect equilibrium reigns due to backward induction on a fixed number of games, that seems much too strong to me. Both empirically and theoretically, cooperation becomes much more plausible in iterated games.

Does the single shot game that Acemoglu and Robinson implicitly describe really seem like a better description of the situation to you? It seems very clear to me that it's not a good fit. If I had to choose between a single shot game and an iterated game as a model, I'd choose the iterated game every time (and maybe just set the discount rate more aggressively as needed--as the post points out, we can interpret the discount rate as having to do with the probability of deposition).
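
As a rough illustration of why the iterated framing matters, here is a sketch of the standard grim-trigger condition in an infinitely repeated prisoner's dilemma. The payoffs are hypothetical and the game is a generic stand-in, not the specific autocrat/investor game from the post; the discount factor can be read as folding in the per-period probability that the autocrat stays in power.

```python
def cooperation_value(reward, delta):
    """Discounted payoff from mutual cooperation forever."""
    return reward / (1 - delta)


def defection_value(temptation, punishment, delta):
    """Payoff from defecting once, then mutual defection forever."""
    return temptation + delta * punishment / (1 - delta)


def critical_discount_factor(temptation, reward, punishment):
    """Smallest discount factor at which grim trigger sustains cooperation."""
    return (temptation - reward) / (temptation - punishment)


# Hypothetical payoffs with temptation > reward > punishment.
T, R, P = 5.0, 3.0, 1.0
threshold = critical_discount_factor(T, R, P)


def cooperation_sustainable(delta):
    return cooperation_value(R, delta) >= defection_value(T, P, delta)


for delta in (0.0, 0.2, 0.9):
    print(delta, cooperation_sustainable(delta))
```

A discount factor of 0 recovers the single shot case, where defection wins; a sufficiently patient (or secure) player prefers to keep cooperating.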

Maybe the crux here is the average tenure of autocrats and who we're thinking of when we use the term?

II.

(I don't say "solve" anywhere in the post so I think the quote marks there are a bit misleading.)

I agree that to come up with something closer to a conclusion, you'd have to do something like analyze the weighted value of each of these structural factors. Even in the absence of such an analysis, I think getting a fuller list of the structural advantages and disadvantages gets us closer to the truth than a one-sided list.

Also, if we accept the claim that Acemoglu and Robinson's empirical evidence is weak, then the fact that I haven't presented any evidence on the real-world importance of these theoretical mechanisms becomes a bit less troubling. It means there's something closer to symmetry in the absence of good evidence bearing on the relative importance of structural advantages and disadvantages in each type of society.

My intuition is that majoritarian tyrannies and collective action problems are huge, pervasive problems in the contemporary world, but I won't argue for that here. I can pretty quickly come up with several examples where it might be in an autocrat's self-interest to confront coordination problems and/or majoritarian tyrannies:

  • Reducing local air pollution would improve an autocrat's health
  • Reducing overuse of antibiotics in animal agriculture could reduce their risk of contracting an antibiotic-resistant infection
  • Allowing/encouraging immigration (for some autocratic country appealing to immigrants) could boost the economy in a way that benefits the autocrat and leads them to overrule the preferences of locals

Obviously, each of these examples is only the briefest sketch and way more work would have to be done to make things conclusive.

Comment by cole_haus on Age-Weighted Voting · 2019-07-18T22:44:04.062Z · score: 10 (5 votes) · EA · GW

This paper has increased my general skepticism about the accuracy of any estimates of discount rates: Time Discounting and Time Preference: A Critical Review. It has a table listing studies that find discount rates ranging from -6% to ∞%.

Comment by cole_haus on Age-Weighted Voting · 2019-07-18T22:39:52.265Z · score: 5 (3 votes) · EA · GW

I'm not sure how much you thought about this aspect, but I've recently become extra wary of surveys on this topic (beyond the ordinary skepticism I'd have for questions which are mostly about expressive preferences and not revealed preferences). Time Discounting and Time Preference: A Critical Review has a table listing studies that find discount rates ranging from -6% to ∞%. Even if that doesn't influence you as much as it did me, the paper has some good discussion of different methods of elicitation (which are especially likely to influence results given the difficulty of the domain).

Comment by cole_haus on "Why Nations Fail" and the long-termist view of global poverty · 2019-07-18T19:11:55.771Z · score: 16 (8 votes) · EA · GW

In addition to the empirical problems, I was very underwhelmed by the theoretical mechanisms Acemoglu and Robinson outline. I wrote up my complaints in a couple of blog posts:

  • Autocrats can accelerate growth through cooperation: Institutions as a fundamental cause of long-run growth claims that inclusive societies should, ceteris paribus, have greater economic growth than authoritarian ones, in part because autocrats can’t credibly commit to upholding property rights after productive investment has occurred. If we formalize this argument as a game, we see that the single shot case supports this claim. But once we turn to the (more plausible) repeated game, we see that mutual cooperation is an equilibrium.
  • Inclusive and extractive societies each have structural advantages: Acemoglu and Robinson claim that extractive societies are at an economic disadvantage because elites will block economic improvements in the name of self-interested stability. But majorities in inclusive societies might also block economic improvements in the name of self-interest. Furthermore, we might expect inclusive societies to be more disadvantaged by problems of collective action.

(These posts are still a bit drafty so apologies for typos, errors, etc.)

Comment by cole_haus on The Happy Culture: A Theoretical, Meta-Analytic, and Empirical Review of the Relationship Between Culture and Wealth and Subjective Well-Being · 2019-07-16T22:25:06.224Z · score: 3 (3 votes) · EA · GW

Nothing especially insightful to add. Just wanted to link to The French Unhappiness Puzzle: The Cultural Dimension of Happiness which is on a related topic and reasonably good.

Comment by cole_haus on Do we know how many big asteroids could impact Earth? · 2019-07-11T17:47:47.089Z · score: 3 (2 votes) · EA · GW

No idea really. The chapter reports "The best chance for discovery of such [dark Damocloid] bodies would be through their thermal radiation around perihelion, using infrared instrumentation on the ground (Rivkin et al., 2005) or in satellites." Rivkin et al. 2005 is here.

Comment by cole_haus on Do we know how many big asteroids could impact Earth? · 2019-07-10T22:33:16.778Z · score: 8 (2 votes) · EA · GW

Global Catastrophic Risks (now slightly outdated with a 2008 publication date) has a chapter on comets and asteroids.

It estimates that an impactor with a diameter of 1 or 2 kilometers would be "civilization-disrupting" and 10 kilometers would "have a good chance of causing the extinction of the human species". So that's what the "big" means in this context.

We can estimate the population of possible impactors via impact craters, telescopic searches and dynamical analysis. Using these techniques, "[i]t is generally thought that the total population of near-Earth asteroids over a kilometre across is about 1100." But there are other classes of impactors with greater uncertainty: comets and Damocloids. "Whether small, dark Damocloids, of, for example, 1 km diameter exist in abundance is unknown - they are in essence undiscoverable with current search programmes."

This sounds like a plausible reconciliation of the apparently conflicting claims. OpenPhil is specifically talking about near-Earth asteroids where we do indeed have fairly accurate estimates. The NASA employee referenced by MacAskill may be referring to the larger class of all possible impactors where uncertainty is much greater.

Comment by cole_haus on "Moral Bias and Corrective Practices" and the possibility of an ongoing moral catastrophe · 2019-07-02T00:35:26.624Z · score: 1 (1 votes) · EA · GW

Yup, I agree that she draws strong conclusions from weak evidence. I wish it were more careful, but I posted it anyway since this is really the only analysis I have seen along these lines.

Comment by cole_haus on Announcing the launch of the Happier Lives Institute · 2019-06-24T23:25:30.795Z · score: 3 (3 votes) · EA · GW

Contemporary Metaethics delineates the field as being about:

(a)  Meaning: what is the semantic function of moral discourse? Is the function of moral discourse to state facts, or does it have some other non-fact-stating role?

(b)  Metaphysics: do moral facts (or properties) exist? If so, what are they like? Are they identical or reducible to natural facts (or properties) or are they irreducible and sui generis?

(c)  Epistemology and justification: is there such a thing as moral knowledge? How can we know whether our moral judgements are true or false? How can we ever justify our claims to moral knowledge?

(d)  Phenomenology: how are moral qualities represented in the experience of an agent making a moral judgement? Do they appear to be ‘out there’ in the world?

(e)  Moral psychology: what can we say about the motivational state of someone making a moral judgement? What sort of connection is there between making a moral judgement and being motivated to act as that judgement prescribes?

(f)  Objectivity: can moral judgements really be correct or incorrect? Can we work towards finding out the moral truth?

It doesn't quite seem to me like the original claim fits neatly into any of these categories.

Comment by cole_haus on EA Forum: Footnotes are live, and other updates · 2019-05-21T18:41:08.074Z · score: 6 (4 votes) · EA · GW

Nice!

Perhaps you all have considered this already, but I think there's a lot to like about sidenotes over footnotes, especially on the web (e.g. footnotes aren't always in sight at the bottom of a physical page).

Comment by cole_haus on Structure EA organizations as WSDNs? · 2019-05-10T22:02:59.574Z · score: 4 (3 votes) · EA · GW

How would you expect EA WSDNs to differ from current EA orgs concretely?

When it comes to worker cooperatives, I see the differences as all flowing from reducing conflicting interests. That is, in standard firms, owners are ultimately interested in profits and only instrumentally interested in working conditions while workers are ultimately interested in working conditions (broadly construed) and only instrumentally interested in profits. Worker cooperatives resolve this tension by making agents principals and principals agents.

This is an idealization, but it seems like the interests of all relevant actors in EA orgs (and nonprofits more generally?) are more aligned. The board and the workers are (at least in theory) largely (if not solely) motivated by the same do-gooding goal.

Comment by cole_haus on What is the current best estimate of the cumulative elasticity of chicken? · 2019-05-04T04:10:01.464Z · score: 3 (3 votes) · EA · GW

Consideration 1: Economists often consider small actors in competitive markets to be price-takers meaning that they cannot influence prices on their own. This seems like a pretty plausible characterization of any individual food buyer.

Consideration 2: "He reasoned that economics says a drop in demand for some commodity should cause prices to fall for that commodity, and overall consumption remains the same." This is not correct. For ordinary downward-sloping demand curves and upward-sloping supply curves, an inward shift in the demand curve ("a drop in demand") causes both equilibrium price and quantity to decrease. I'd guess the thing he's trying to get at is that for a good which is unit elastic, a small drop in price is offset by a small increase in quantity which leads to total revenue being unchanged.

So our first option is to regard individual actors as too small to influence the price. If we reject this and think they do have an effect, their effect would be to shift the demand curve in---dropping equilibrium price and quantity.
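
The effect of an inward demand shift can be sketched with linear curves. All coefficients below are invented for illustration and are not estimates for any real chicken market.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Solve demand Q = a - b*P against supply Q = c + d*P.

    Returns (price, quantity) at the intersection.
    """
    price = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    quantity = demand_intercept - demand_slope * price
    return price, quantity


# Hypothetical market: demand Q = 100 - 2P, supply Q = 10 + P.
before = equilibrium(100, 2, 10, 1)

# Inward demand shift: at every price, buyers want 6 fewer units.
after = equilibrium(94, 2, 10, 1)

print("before:", before)
print("after: ", after)
```

Both price and quantity fall, but quantity falls by less than the size of the shift (here by 2 units for a 6-unit shift), because the lower price induces some extra purchases along the demand curve.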

Aside: I'm reasonably well-informed about economics and don't recall having ever heard the term "cumulative elasticity" before.

Comment by cole_haus on Why does EA use QALYs instead of experience sampling? · 2019-04-24T02:00:02.793Z · score: 7 (5 votes) · EA · GW

I don't really see ESM as being in opposition to QALYs. It seems like it's a method that you would use as an input in QALY weight determinations. Wikipedia lists some of the current methods for deriving QALY weights as:

Time-trade-off (TTO): Respondents are asked to choose between remaining in a state of ill health for a period of time, or being restored to perfect health but having a shorter life expectancy.
Standard gamble (SG): Respondents are asked to choose between remaining in a state of ill health for a period of time, or choosing a medical intervention which has a chance of either restoring them to perfect health, or killing them.
Visual analogue scale (VAS): Respondents are asked to rate a state of ill health on a scale from 0 to 100, with 0 representing being dead and 100 representing perfect health. This method has the advantage of being the easiest to ask, but is the most subjective.
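
As a sketch of how each elicitation method turns a response into a weight between 0 and 1, here are the simplified textbook scoring rules. Real valuation studies apply further adjustments, and the function names and example responses are hypothetical.

```python
def tto_weight(years_full_health, years_ill_health):
    """Time-trade-off: indifferent between `years_ill_health` in the ill
    state and a shorter `years_full_health` in perfect health."""
    return years_full_health / years_ill_health


def sg_weight(indifference_probability_of_cure):
    """Standard gamble: the probability of cure at which the respondent is
    indifferent between the gamble and staying in the ill state."""
    return indifference_probability_of_cure


def vas_weight(score_0_to_100):
    """Visual analogue scale: rescale a 0-100 rating to 0-1."""
    return score_0_to_100 / 100


def qalys(weight, years):
    """QALYs accrued from spending `years` in a state with this weight."""
    return weight * years


print(tto_weight(7, 10), sg_weight(0.85), vas_weight(60), qalys(0.7, 10))
```

An ESM- or DRM-based measure would need some analogous mapping from reported momentary experience to a weight for a specific health state.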

There's also the "day reconstruction method" (DRM). The Oxford Handbook of Happiness talks about ESM, DRM and other relevant measurement approaches at various points.

I'd guess the trouble with using ESM, DRM and some other methods like them for QALY weights is it's hard to isolate the causal effect of particular conditions using these methods.

Comment by cole_haus on Thoughts on 80,000 Hours’ research that might help with job-search frustrations · 2019-04-18T23:29:47.218Z · score: 7 (6 votes) · EA · GW

Ah, I see that now. Thanks.

FWIW, I was specifically looking for a disclaimer and it didn't quickly come to my attention. It looks like a few other people in these subthreads may have also missed the disclaimer.

Comment by cole_haus on Thoughts on 80,000 Hours’ research that might help with job-search frustrations · 2019-04-18T20:06:37.620Z · score: 10 (4 votes) · EA · GW

Yeah, I hadn't realized it was more or less deprecated. (The page itself doesn't seem to give any indication of that. Edit: Ah, it does. I missed the second paragraph of the sidenote when I quickly scanned for some disclaimer.)

Also, apparently unfortunately, it's the first sublink under the 80,000 Hours site on Google if you search for 80,000 Hours.

Comment by cole_haus on Thoughts on 80,000 Hours’ research that might help with job-search frustrations · 2019-04-16T20:06:28.454Z · score: 15 (9 votes) · EA · GW

It seems quite possible to me to have a "parameterized list". That is, recommendations can take the shape "If X is true of you, Y and Z are good options." And in fact 80,000 Hours does do this to some degree (via, for example, their career quiz). While this isn't entirely personalized (it's based only on certain attributes that 80,000 Hours highlights), it's also far from a single, definitive list. So it doesn't seem to me that there's any insoluble tension between taking account of individual difference and communicating the same message to a broad audience--you just have to rely on the audience to do some interpreting.

Comment by cole_haus on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-10T05:23:52.556Z · score: 1 (1 votes) · EA · GW

I don't particularly want to try to resolve the disagreement here, but I'd think value per dollar is pretty different for dollars at EA institutions and for dollars with (many) EA-aligned people [1]. It seems like the whole filtering/selection process of granting is predicated on this assumption. Maybe you believe that people at CEA are the type of people that would make very good use of money regardless of their institutional affiliation?

[1] I'd expect it to vary from person to person depending on their alignment, commitment, competence, etc.

Comment by cole_haus on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-10T01:55:37.185Z · score: 23 (10 votes) · EA · GW

I am not OP but as someone who also has (minor) concerns under this heading:

  • Some people judge HPMoR to be of little artistic merit/low aesthetic quality
  • Some people find the subcultural affiliations of HPMoR off-putting (fanfiction in general, copious references to other arguably low-status fandoms)

If the recipients have negative impressions of HPMoR for reasons like the above, that could result in (unnecessarily) negative impressions of rationality/EA.

Clearly, there are also many people that like HPMoR and don't have the above concerns. The key question is probably what fraction of recipients will have positive, neutral and negative reactions.

Comment by cole_haus on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-10T01:44:58.006Z · score: 5 (4 votes) · EA · GW

It's not at all clear to me why the whole $150k of a counterfactual salary would be counted as a cost. The most reasonable (simple) model I can think of is something like: ($150k * .1 + $60k) * 1.5 = $112.5k where the $150k*.1 term is the amount of salary they might be expected to donate from some counterfactual role. This then gives you the total "EA dollars" that the positions cost whereas your model seems to combine "EA dollars" (CEA costs) and "personal dollars" (their total salary).
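
As a quick check on the arithmetic, here is a minimal sketch of that simple model. The figures come from the comment itself; the function and parameter names are made up for illustration, and the 1.5 multiplier is used as it appears in the thread without assuming what it stands for.

```python
# Figures from the comment; all names here are hypothetical.
COUNTERFACTUAL_SALARY = 150_000  # salary in some counterfactual role
DONATION_RATE = 0.10             # share of that salary assumed to be donated
PAID_SALARY = 60_000             # salary actually paid for the position
MULTIPLIER = 1.5                 # multiplier used in the thread


def ea_dollar_cost(counterfactual_salary, donation_rate, paid_salary, multiplier):
    """Cost in "EA dollars": forgone donations plus salary paid, times the multiplier."""
    forgone_donations = counterfactual_salary * donation_rate
    return (forgone_donations + paid_salary) * multiplier


cost = ea_dollar_cost(COUNTERFACTUAL_SALARY, DONATION_RATE, PAID_SALARY, MULTIPLIER)
print(f"${cost:,.0f}")  # prints $112,500
```

Keeping the donation rate as an explicit parameter makes it easy to see how sensitive the total is to that assumption.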

Comment by cole_haus on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-10T01:42:28.034Z · score: 5 (4 votes) · EA · GW

I think you have some math errors:

  • $150k * 1.5 + $60k = $285k rather than $295k
  • Presumably, this should be ($150k + $60k) * 1.5 = $315k ?

Comment by cole_haus on Most important unfulfilled role in the EA ecosystem? · 2019-04-05T20:51:24.196Z · score: 20 (9 votes) · EA · GW

I have a pretty averse reaction to all the people you named, expect I would feel similarly about someone in that mold in EA, and expect many other people in EA would feel similarly. I don't think charismatic leadership fits all that well with the other elements of EA in ways both important and incidental.

Comment by cole_haus on [Link] The Optimizer's Curse & Wrong-Way Reductions · 2019-04-04T19:11:10.759Z · score: 4 (3 votes) · EA · GW

I don't know how promising others think this is, but I quite liked Concepts for Decision Making under Severe Uncertainty with Partial Ordinal and Partial Cardinal Preferences. It tries to outline possible decision procedures once you relax some of the subjective expected utility theory assumptions you object to. For example, it talks about the possibility of having a credal set of beliefs (if one objects to the idea of assigning a single probability) and then doing maximin on this, i.e. selecting the outcome that has the best expected utility according to its least favorable credences.
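
To make the maximin idea concrete, here is a minimal sketch under my own assumptions: the credal set is a small finite list of probability vectors, and each action has one utility per state. The action names and all numbers are invented for illustration and are not taken from the paper.

```python
def expected_utility(credence, utilities):
    """Expected utility of one action under a single probability vector."""
    return sum(p * u for p, u in zip(credence, utilities))


def maximin_choice(actions, credal_set):
    """Pick the action whose worst-case expected utility over the credal set is highest."""
    def worst_case(utilities):
        return min(expected_utility(c, utilities) for c in credal_set)

    return max(actions, key=lambda name: worst_case(actions[name]))


# Hypothetical two-state example: three candidate credences instead of one.
credal_set = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]
actions = {
    "risky": (10.0, -5.0),  # utilities in state 1 and state 2
    "safe": (2.0, 1.0),
}

choice = maximin_choice(actions, credal_set)
print(choice)  # prints safe
```

Here "risky" has the higher expected utility under two of the three credences, but its least favorable credence gives it an expected utility of -2, versus 1.2 for "safe", so maximin picks "safe".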

Comment by cole_haus on [Link] The Optimizer's Curse & Wrong-Way Reductions · 2019-04-04T19:06:53.335Z · score: 10 (8 votes) · EA · GW

There's actually a thing called the Satisficer's Curse (pdf) which is even more general:

The Satisficer’s Curse is a systematic overvaluation that occurs when any uncertain prospect is chosen because its estimate exceeds a positive threshold. It is the most general version of the three curses, all of which can be seen as statistical artefacts.
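
A small simulation can illustrate the effect. This is only a sketch under assumptions of my own (normally distributed true values, unbiased normal noise on each estimate, a fixed threshold); none of these choices come from the linked paper.

```python
import random


def simulate(n_prospects=100_000, threshold=1.0, noise_sd=1.0, seed=0):
    """Return (mean estimate, mean true value) over prospects whose estimate beats the threshold."""
    rng = random.Random(seed)
    chosen_estimates = []
    chosen_truths = []
    for _ in range(n_prospects):
        true_value = rng.gauss(0.0, 1.0)
        # Each estimate is unbiased: true value plus zero-mean noise.
        estimate = true_value + rng.gauss(0.0, noise_sd)
        if estimate > threshold:
            chosen_estimates.append(estimate)
            chosen_truths.append(true_value)
    n = len(chosen_estimates)
    return sum(chosen_estimates) / n, sum(chosen_truths) / n


mean_estimate, mean_truth = simulate()
print(f"mean estimate of chosen prospects:   {mean_estimate:.2f}")
print(f"mean true value of chosen prospects: {mean_truth:.2f}")
```

Even though every individual estimate is unbiased, the prospects that clear the threshold are on average overestimated, because selection favors those whose noise happened to be positive.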
Comment by cole_haus on Is any EA organization using or considering using Buterin et al.'s mechanism for matching funds? · 2019-04-02T05:02:39.442Z · score: 1 (1 votes) · EA · GW

IIRC, the mechanism has problems with collusion/dissembling. For example, one backer with $46 and 4 backers with $1 each will get significantly better results by splitting their money into 5 contributions of $10 each. This seems like a problem that's actually moderately likely to arise in practice.
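
The example can be checked with a few lines, assuming the usual quadratic funding rule in which a project's total funding is the square of the sum of the square roots of the individual contributions. The function name is my own.

```python
import math


def quadratic_funding(contributions):
    """Total funding under the assumed rule: (sum of square roots of contributions) squared."""
    return sum(math.sqrt(c) for c in contributions) ** 2


honest = quadratic_funding([46, 1, 1, 1, 1])  # one $46 backer, four $1 backers
colluding = quadratic_funding([10] * 5)       # the same $50 split evenly

print(f"honest:    ${honest:.2f}")
print(f"colluding: ${colluding:.2f}")
```

Both groups put in $50, but the even split yields about $250 in total funding versus about $116 for the honest contributions, which is the incentive to redistribute money among backers.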

Comment by cole_haus on a black swan energy prize · 2019-03-30T04:25:57.222Z · score: 2 (2 votes) · EA · GW

It looks like the case you're making in the "a prize" section is that prizes are more open to "outsiders" than grants which seems generally plausible to me. On the other hand, grants can actually fund the research itself while contestants for a prize need some source of funding. If it's capital-intensive to mount a serious attempt at the prize, this creates a funding and vetting problem again (contestants will need money to bankroll their attempt).

Comment by cole_haus on a black swan energy prize · 2019-03-30T04:25:32.132Z · score: 2 (2 votes) · EA · GW

I'm not convinced that a prize is particularly helpful in this case. I think of prizes as useful for inducing investment in things like public goods where private returns are limited. That doesn't seem to be the case here; successfully creating "radically better energy generation" seems like it would be wildly remunerative. The promise of vast wealth seems like it ought to be sufficient incentive regardless of a prize.

OTOH, that's all very first-principles and the history of innovation prizes doesn't seem to really pay much attention to this line of criticism. Maybe prizes make particular problems more salient, etc.

Comment by cole_haus on How to Understand and Mitigate Risk (Crosspost from LessWrong) · 2019-03-14T18:06:23.030Z · score: 2 (2 votes) · EA · GW

This is interesting! I think it would also be useful to talk about the standard terminology in the field. Some of those terms are:

Reasons I think it's useful to talk about standard terminology:

  • Allows you to converse with others and understand their work more easily
  • Allows readers to follow up and connect with a larger body of work
  • Communicates to experts that you've seriously engaged with the field and understand it

In this particular case, I'd be interested in hearing how your categories map to the standard ones. Or, if you think they don't, it would be interesting to hear why that is. What are the inadequacies of the standard terms and categories?

Comment by cole_haus on Impact Prizes as an alternative to Certificates of Impact · 2019-02-20T07:12:09.193Z · score: 3 (2 votes) · EA · GW

This seems very related to social impact bonds: "Social Impact Bonds are a type of bond, but not the most common type. While they operate over a fixed period of time, they do not offer a fixed rate of return. Repayment to investors is contingent upon specified social outcomes being achieved."

Comment by cole_haus on What we talk about when we talk about life satisfaction · 2019-02-12T19:13:26.372Z · score: 2 (2 votes) · EA · GW

Yup. It's in Chapter 23, The Nature and Significance of Happiness.

Comment by cole_haus on What we talk about when we talk about life satisfaction · 2019-02-09T19:10:59.909Z · score: 5 (3 votes) · EA · GW

I found a passage from the book that's much more on the nose:

But here we will focus on a deeper threat to the importance of LS, one that stems from the very nature and point of LS attitudes. How satisfied you are with your life does not simply depend on how well you see your life going relative to your priorities. It also depends centrally on how high you set the bar for a “satisfactory” life: how good is “good enough?” Rosa might be satisfied with her life only when getting almost everything she wants, while Juliet is satisfied even when getting very little of what she wants—indeed, even when most of her goals are being frustrated. It can seem odd to think that satisfied Juliet, for whom every day is a new kick in the teeth, is better off than dissatisfied Rosa, who nonetheless succeeds in almost all the things she cares about but is more demanding.

More to the point, it is not clear why LS should be so important insofar as it is a matter of how high or low individuals set the bar. Suppose Rosa has a lengthy, and not inconsequential, “life list,” and will not be satisfied until she has checked off every item on the list. It is not implausible that we should care about how well Rosa achieves her priorities—e.g., whether her goals are mostly met or roundly frustrated. But should anyone regard it as a weighty matter whether she actually gets every last thing on her list, and thus is satisfied with her life? It is doubtful, indeed, that Rosa should put much stock in it.

The point here is not simply that LS can reflect unreasonable demands, but that it depends on people’s standards for a good enough life, and these bear a problematic relationship to people’s well-being, depending on various factors that have no obvious relationship to how well people’s lives are going for them. It may happen that Rosa comes to see her standards as unreasonably high and revises them downwards—not because her priorities change, but because she now finds it unseemly to be so needy. In this case, what drives her LS is, in part, the norms she takes to apply to her attitudes—how it is fitting to respond to her life. Such norms likely influence most people’s attitudes toward their lives—a wish to exhibit virtues like fortitude, toughness, strength, or exactingness, non-complacency, and so forth. How satisfied we are with our lives partly depends, in short, on the norms we accept regarding how it is appropriate to respond to our lives. Note that most of us accept a variety of such norms, pulling in different directions, and it can be somewhat arbitrary which norms we emphasize in thinking about our lives. You may value both fortitude and not being complacent, and it may not be obvious which to give more weight in assessing your life. You may, at different times, vary between them.

Similarly, LS depends on the perspective one adopts: relative to what are you more or less satisfied? Looking at Tiny Tim, you may naturally take up a perspective on your life that makes your good fortune more salient, and so you reasonably find yourself pretty satisfied with things. Then you think about George Clooney, and your life doesn’t look so good by comparison: your satisfaction drops. Worse, it is doubtful that any perspective is uniquely the right one to take: again, it is somewhat arbitrary. Unless you are like Rosa and have bizarrely—not to say childishly—determinate criteria for how good your life has to be to qualify as a satisfactory one, it will be open to you to assess your life from any of a number of vantage points, each quite reasonable and each yielding a different verdict.

Indeed, the very idea of subjecting one’s life to an all-in assessment of satisfactoriness is a bit odd. When you order a steak prepared medium and it turns up rare, its deficiencies are immediately apparent and your dissatisfaction can be given plain meaning: you send it back. Or, you don’t return to that establishment. But when your life has annoying features, what would it mean to deem it unsatisfactory? You can’t very well send it back. (Well . . .) Nor can you resolve to choose a different one next time around. It just isn’t clear what’s at stake in judging one’s life satisfactory or otherwise; lives are vastly harder to judge than steaks; and anyway, what counts as a reasonable expectation for a life is less than obvious since the price of admission is free—you’re just born, and there you are. So it is hard to know where to set the bar, and unsurprising that people can be so easily gotten by trivial influences to move it (Schwarz & Strack, 1999). You might be satisfied with your life simply because it beats being dead. The ideal of life satisfaction arguably imports a consumer’s concept, one most at home in retail environments, into an existential setting where metrics of customer satisfaction may be less than fitting. (It is an interesting question how far people spoke of life satisfaction before the postwar era got us in the habit of calling ourselves “consumers.”)

In short, LS depends heavily on where you set the bar for a “good enough” life, and this in turn depends on factors like perspectives and norms that are substantially arbitrary and have little bearing on your well-being. The worry is not that LS fails to track some objective standard of well-being, but that we should expect that it will fail to track any sane metric of well-being, including the individual’s own. To take one example: Studies suggest that dialysis patients report normal levels of LS, which might lead us to think they don’t really mind it very much. Yet when asked to state a preference, patients said they would be willing to give up half their remaining life-years to regain normal kidney function (Riis et al., 2005; Torrance, 1976; Ubel & Loewenstein, 2008). This is about as strong as a preference gets. A plausible supposition is that people don’t adjust their priorities when they get kidney disease so much as they adjust their standards for what they’ll consider a satisfactory life. LS thus obscures precisely the sort of information one might expect it to provide—not because of errors or noise, but because it is not the sort of thing that is supposed in any straightforward way to yield that information. LS is not that sort of beast.

The claim is not that LS measures never provide useful information about well-being. In fact they frequently do, because the perceived welfare information is in there somewhere, and differences in norms and perspectives may often cancel out over large populations. They may not cancel out, however, where norms and perspectives systematically differ, and this is a serious problem in many contexts, especially cross-cultural comparisons using LS (Haybron, 2007, 2008). But what the points raised in this section chiefly indicate about LS measures is that we cannot support conclusions about absolute levels of well-being with facts about LS. That people are satisfied with their lives does not so much as hint that their lives are going well relative to their priorities. If we wish reliably to assess how people see their lives going for them, we need a better yardstick than life satisfaction.

Comment by cole_haus on What we talk about when we talk about life satisfaction · 2019-02-05T23:42:15.235Z · score: 1 (1 votes) · EA · GW

Ah, yeah. I didn't mean to suggest that the philosophers have it all worked out. What I meant is that I think the philosophers seem to share your goals. In other words, I (as a non-professional in either psychology or philosophy) think if someone came up to a psychologist and said, "I've come up with these edge cases for 'life satisfaction'", they'd more or less reply, "That's regrettable. Moving on...". On the other hand, if someone came up to a philosopher and said, "I've come up with edge cases for 'eudaimonia'", they might reply, "Yes, edge cases like these are among my central concerns. Here's the existing work on the matter and here are my current attempts at a resolution."

Comment by cole_haus on How Can Donors Incentivize Good Predictions on Important but Unpopular Topics? · 2019-02-05T06:51:23.309Z · score: 5 (4 votes) · EA · GW

Subsidizing a prediction market seems like one of the more promising approaches to me. There's a write-up of what that would look like more concretely at: Subsidizing prediction markets. Unfortunately, a quick search also turns up a theoretical limitation of this approach: Subsidized Prediction Markets for Risk Averse Traders.

Comment by cole_haus on What we talk about when we talk about life satisfaction · 2019-02-05T02:30:18.641Z · score: 9 (5 votes) · EA · GW

My impression is that the term "life satisfaction" sees the heaviest use in psychology, where full philosophical analysis of the necessary and sufficient properties of "life satisfaction" isn't especially desired or useful. As long as the term denotes a concept with some internal consistency and we all use the term in roughly compatible ways, we can usefully use it in measurements.

If you're looking for a concept that's a load-bearing part of your ethics, primarily psychological constructs like "life satisfaction" aren't a great fit. I think the discussions you'd want to look at for these more philosophical purposes are discussions around eudaimonia, hedonia, etc.

Comment by cole_haus on What we talk about when we talk about life satisfaction · 2019-02-05T02:23:26.048Z · score: 8 (4 votes) · EA · GW

I don't have a neat, definitive answer for you, but I've been reading the Oxford Handbook of Happiness lately and these are the bits that come to mind:

  • The Satisfaction with Life Scale is the most common instrument used to measure life satisfaction and may give you some sense of how they operationalize the term. Rated on a 7-point Likert scale:
    • In most ways my life is close to my ideal.
    • The conditions of my life are excellent.
    • I am satisfied with my life.
    • So far I have gotten the important things I want in life.
    • If I could live my life over, I would change almost nothing.
  • Another common instrument is the Cantril ladder, which asks people to place themselves on a ladder where the bottom rung is the worst life possible and the top rung is the best life possible. This is probably closest to your "the most satisfying life imaginable".
  • One explicit definition listed in the book is:
Campbell et al. (1976) argue that satisfaction with any aspect of life reflects the gap between one’s current perceived reality and the level to which one aspires.

This sounds closest to your "the most satisfying life, in practice".

  • Another set of authors contend that general life satisfaction is actually (contrary to first impressions) more affective than cognitive:
This generalized positive view may be measured through asking “How satisfied are you with your life as a whole?,” and this question has been used in population surveys for over 35 years (Andrews & Withey, 1976). Not surprisingly, given the extraordinary generality of this question, the response that people give does not represent a cognitive evaluation of their life. Rather it reflects a deep and stable positive mood state that we initially called “core affect” (Davern et al., 2007), but which we now refer to as HPMood (Cummins, 2010).

Comment by cole_haus on Disentangling arguments for the importance of AI safety · 2019-01-23T20:49:24.844Z · score: 2 (2 votes) · EA · GW

Agreed. I think these reasons seem to fit fairly easily into the following schema: Each of A, B, C, and D is necessary for a good outcome. Different people focus on failures of A, failures of B, etc. depending on which necessary criterion seems to them most difficult to satisfy and most salient.

Comment by cole_haus on How can I internalize my most impactful negative externalities? · 2019-01-17T16:33:31.815Z · score: 5 (5 votes) · EA · GW

I actually wrote up a survey a bit ago pulling together negative externalities with estimates in the literature: https://www.col-ex.org/posts/pigouvian-compendium/. From (estimated) largest to smallest, they are:

  • Driving
  • Emitting carbon
  • Obesity
  • Drinking alcohol
  • Agriculture
  • Municipal waste
  • Smoking
  • Antibiotic use
  • Debt
  • Gun ownership

Comment by cole_haus on What is the Most Helpful Categorical Breakdown of Normative Ethics? · 2018-08-15T21:13:59.473Z · score: 4 (6 votes) · EA · GW

I think there's a certain prima facie plausibility to the traditional tripartite division. If you just think about the world in general, each of actors, actions, and states seems salient. It wouldn't take much to convince me that--appropriately defined--actors, actions, and states are mutually exclusive and collectively exhaustive in some metaphysical sense.

Once you accept the actors, actions, states division, it makes sense to have ethical theories revolving around each. These correspond to virtue ethics, deontology, and consequentialism, respectively.

Comment by cole_haus on What is the Most Helpful Categorical Breakdown of Normative Ethics? · 2018-08-15T21:06:45.371Z · score: 3 (3 votes) · EA · GW

I think you could fairly convincingly bucket virtue ethics in 'Ends' if you wanted to adopt this schema. A virtue ethicist could be someone who chooses the action that produces the best outcome in terms of personal virtue. They are (sort of) a utilitarian that optimizes for virtue rather than utility and restricts their attention to only themselves rather than the whole world.

Comment by cole_haus on EA Forum 2.0 Initial Announcement · 2018-07-24T11:47:07.806Z · score: 2 (2 votes) · EA · GW

That's great to hear!