EA essay contest for <18s 2017-01-22T22:07:15.983Z · score: 3 (3 votes)


Comment by capybaralet on On Collapse Risk (C-Risk) · 2020-03-12T19:59:32.047Z · score: 3 (2 votes) · EA · GW

I just skimmed the post.

"Many of the most pressing threats to the humanity are far more likely to cause collapse than be an outright existential threat with no ability for civilisation to recover."

This claim is not supported, and I think most people who study catastrophic risks (they already coined the acronym C-risk, sorry!) and x-risks would disagree with it.

In fact, civilization collapse is considered fairly unlikely by many, although Toby Ord thinks it hasn't been properly explored (see his recent 80k interview).

AI in particular (which many believe is easily the largest x-risk) seems quite unlikely to cause civilization collapse without also being an outright x-risk.

From what I understand, the loss of welfare is probably much less significant than the decreased ability to prevent x-risks. Although, since x-risks are thought to be mostly anthropogenic, civilization collapse could actually significantly reduce immediate x-risk.

In general, I believe the thinking goes that we lose quite a small fraction of the light cone over the course of, e.g., a few centuries... this is why things like "long reflection periods" seem like good ideas. I'm not sure anyone has tried to square that with the simulation hypothesis or other unknown-unknown-type x-risks, which seem like they should make us discount much more aggressively. I guess the idea there is probably that most of the utility lies in universes with long futures, so we should prioritize our effects on them.

I suspect someone who has more expertise on this topic might want to respond more thoroughly.

Comment by capybaralet on On Collapse Risk (C-Risk) · 2020-03-12T19:48:41.948Z · score: 1 (3 votes) · EA · GW

These are not the same thing. GCR is just anything that's bad on a massive scale, civilization doesn't have to collapse.

Comment by capybaralet on Could we solve this email mess if we all moved to paid emails? · 2019-08-15T02:30:17.971Z · score: 6 (2 votes) · EA · GW

Overall, I'm intrigued and like this general line of thought. A few thoughts on the post:

  • If you're using something like this, it's not really email anymore, right? So maybe it's better to think of this as being about "online messaging".
  • Another (complementary) way to improve email is to make it like Facebook, where you have to agree to connect with someone before they can message you.
  • Like many ideas about using $$ as a signal, I think it might be better if we instead used a domain-specific credit system, where credits are allotted to individuals at some fixed rate, or according to some rules, and cannot be purchased. People can find ways of subverting that, but they can also subvert the paid email idea (just open all their emails and take the $$ without reading or responding meaningfully).
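A minimal sketch of what such a credit system might look like (the class, rates, and cap below are all hypothetical illustrations for the idea of fixed-rate, non-purchasable credits, not part of any existing proposal):

```python
# Hypothetical sketch: messaging credits accrue to each user at a fixed
# rate and are spent to send messages; they cannot be purchased, only
# earned by waiting.

class CreditLedger:
    def __init__(self, rate_per_day=5, cap=50):
        self.rate_per_day = rate_per_day  # fixed allotment rate
        self.cap = cap                    # prevents unbounded hoarding
        self.balances = {}

    def accrue(self, user, days):
        """Allot credits at the fixed rate, up to the cap."""
        bal = self.balances.get(user, 0)
        self.balances[user] = min(self.cap, bal + days * self.rate_per_day)

    def send(self, sender, cost=1):
        """Spend credits to message someone; there is no way to buy more."""
        if self.balances.get(sender, 0) < cost:
            return False  # must wait for more credits to accrue
        self.balances[sender] -= cost
        return True

ledger = CreditLedger()
ledger.accrue("alice", days=1)      # alice now has 5 credits
assert ledger.send("alice")         # sending costs 1 credit
assert ledger.balances["alice"] == 4
```

The cap is one simple way to blunt hoarding; refunds for messages that get a meaningful reply could be layered onto the same ledger.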
Comment by capybaralet on How Europe might matter for AI governance · 2019-07-30T00:57:56.820Z · score: 1 (1 votes) · EA · GW

To answer your question: no.

I basically agree with this comment, but I'd add that the "diminishing returns" point is fairly generic, and should be coupled with some arguments about why there are very rapidly diminishing returns in the US/China (seems false) or non-trivial returns in Europe (seems plausible, but non-obvious, and also one of the focuses of the OP).

Comment by capybaralet on How Europe might matter for AI governance · 2019-07-26T00:41:15.421Z · score: 1 (1 votes) · EA · GW

RE "why look at Europe at all?", I'd say Europe's gusto for regulation is a good reason to be interested (you discuss that stuff later, but for me it's the first reason I'd give). The "right to an explanation" is also worth mentioning alongside GDPR.

Comment by capybaralet on Top Tips on how to Choose an Effective Charity · 2018-04-08T14:25:11.183Z · score: 0 (0 votes) · EA · GW

Based on the report [1], it's a bit misleading to say that they are a charity doing $35 cataract surgeries. The report seems pretty explicit that donations to the charity are used for other activities.

Comment by capybaralet on Personal thoughts on careers in AI policy and strategy · 2017-10-16T03:00:35.800Z · score: 1 (1 votes) · EA · GW

I strongly agree that independent thinking seems undervalued (in general and in EA/LW). There is also an analogy with ensembling in machine learning.

By "independent" I mean "thinking about something without considering others' thoughts on it" or something to that effect... it seems easy for people's thoughts to converge too much if they aren't allowed to develop in isolation.

Thinking about it now, though, I wonder if there isn't some even better middle ground; in my experience, group brainstorming can be much more productive than independent thought as I've described it.

There is a very high-level analogy with evolution: I imagine sexual reproduction might create more diversity in a population than horizontal gene transfer, since in the latter case, an idea (= gene) which seems good could rapidly become universal, and thus "local optima" might be more of a problem for the population (I have no idea if that's actually how this works biologically... in fact, it seems like it might not be, since at least some viruses/bacteria seem to do a great job of rapidly mutating to become resistant to defences/treatments.)

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-10-10T16:30:34.849Z · score: 0 (0 votes) · EA · GW

it's cross-posted on LW:

Comment by capybaralet on Personal thoughts on careers in AI policy and strategy · 2017-10-07T16:14:19.001Z · score: 3 (3 votes) · EA · GW

Thanks for writing this. My TL;DR is:

  1. AI policy is important, but we don’t really know where to begin at the object level

  2. You can potentially do 1 of 3 things, ATM:
     A. "disentanglement" research
     B. operational support for (e.g.) FHI
     C. get in position to influence policy, and wait for policy objectives to be cleared up

  3. Get in touch / Apply to FHI!

I think this is broadly correct, but have a lot of questions and quibbles.

  • I found “disentanglement” unclear. [14] gave the clearest idea of what this might look like. A simple toy example would help a lot.
  • Can you give some idea of what an operations role looks like? I find it difficult to visualize, and I think uncertainty makes it less appealing.
  • Do you have any thoughts on why operations roles aren’t being filled?
  • One more policy that seems worth starting on: programs that build international connections between researchers (especially around policy-relevant issues of AI, i.e. ethics/safety).
  • The timelines for effective interventions in some policy areas may be short (e.g. 1-5 years), and it may not be possible to wait for disentanglement to be “finished”.
  • Is it reasonable to expect the “disentanglement bottleneck” to be cleared at all? Would disentanglement actually make policy goals clear enough? Trying to anticipate all the potential pitfalls of policies is a bit like trying to anticipate all the potential pitfalls of a particular AI design or reward specification… fortunately, there is a bit of a disanalogy in that we are more likely to have a chance to correct mistakes with policy (although that still could be very hard/impossible). It seems plausible that “start iterating and create feedback loops” is a better alternative to the “wait until things are clearer” strategy.
Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-11T01:15:51.989Z · score: 1 (1 votes) · EA · GW

My main comments:

  1. As others have mentioned: great post! Very illuminating!

  2. I agree value-learning is the main technical problem, although I’d also note that value-learning related techniques are becoming much more popular in mainstream ML these days, and hence less neglected. Stuart Russell has argued (and I largely agree) that things like IRL will naturally become a more popular research topic (but I’ve also argued this might not be net-positive for safety:

  3. My main comment wrt the value of HRAD (3a) is: I think HRAD-style work is more about problem definitions than solutions. So I find it to be somewhat orthogonal to the other approach of “learning to reason from humans” (L2R). We don’t have the right problem definitions, at the moment; we know that the RL framework is a leaky abstraction. I think MIRI has done the best job of identifying the problems which could result from our current leaky abstractions, and working to address them by improving our understanding of what problems need to be solved.

  4. It’s also not clear that human reasoning can be safely amplified; the relative safety of existing humans may be due to our limited computational / statistical resources, rather than properties of our cognitive algorithms. But this argument is not as strong as it seems; see comment #3 below.

A few more comments:

  1. RE 3b: I don’t really think the AI community’s response to MIRI’s work is very informative, since it’s just not on people’s radar. The problems are not well known or understood, and the techniques are (AFAIK) not very popular or in vogue (although I’ve only been in the field for 4 years, and have only studied machine-learning-based approaches to AI). I think decision theory was already a relatively well-known topic in philosophy, so I think philosophy would naturally be more receptive to these results.

  2. I’m unconvinced about the feasibility of Paul’s approach**, and share Wei Dai’s concerns about it hinging on a high level of competitiveness. But I also think HRAD suffers from the same issues of competitiveness (this does not seem to be MIRI’s view, which I’m confused by). This is why I think solving global coordination is crucial.

  3. A key missing (assumed?) argument here is that L2R can be a stepping stone, e.g. providing narrow or non-superintelligent AI capabilities which can be applied to AIS problems (e.g. making much more progress on HRAD than MIRI). To me this is a key argument for L2R over HRAD, and generally a source of optimism. I’m curious if this argument plays a significant role in your thought; in other words, is it that HRAD problems don’t need to be solved, or just that the most effective solution path goes through L2R? I’m also curious about the counter-argument for pursuing HRAD now: i.e. what role does MIRI anticipate safe advanced (but not general / superhuman) intelligent systems to play in HRAD?

  4. An argument for more funding for MIRI which isn’t addressed is the apparent abundance of wealth at the disposal of Good Ventures. Since funding opportunities are generally scarce in AI Safety, I think every decent opportunity should be aggressively pursued. There are 3 plausible arguments I can see for the low amount of funding to MIRI: 1) concern of steering other researchers in unproductive directions 2) concern about bad PR 3) internal politics.

  5. Am I correct that there is a focus on shorter timelines (e.g. <20 years)?

Briefly, my overall perspective on the future of AI and safety relevance is:

  1. There ARE fundamental insights missing, but they are unlikely to be key to building highly capable OR safe AI.

  2. Fundamental insights might be crucial for achieving high confidence in a putatively safe AI (but perhaps not for developing an AI which is actually safe).

  3. The HRAD line of research is likely to uncover mostly negative results (à la AIXI’s arbitrary dependence on its prior)

  4. Theory is behind empiricism, and the gap is likely to grow; this is the main reason I’m a bit pessimistic about theory being useful. On the other hand, I think most paths to victory involve using capability-control for as long as possible while transitioning to completely motivation-control based approaches, so conditioning on victory, it seems more likely that we solve more fundamental problems (i.e. “we have to solve these problems eventually”).

** the two main reasons are: 1) I don’t think it will be competitive and 2) I suspect it will be difficult to prevent compounding errors in a bootstrapping process that yields superintelligent agents.

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-06T11:55:15.349Z · score: 0 (0 votes) · EA · GW

My point was that HRAD potentially enables the strategy of pushing mainstream AI research away from opaque designs (which are hard to compete with while maintaining alignment, because you don't understand how they work and you can't just blindly copy the computation that they do without risking safety), whereas in your approach you always have to worry about "how do I compete with an AI that doesn't have an overseer, or has an overseer who doesn't care about safety and just lets the AI use whatever opaque and potentially dangerous technique it wants".

I think both approaches potentially enable this, but are VERY unlikely to deliver. MIRI seems more bullish that fundamental insights will yield AI that is just plain better (Nate gave me the analogy of Judea Pearl coming up with causal PGMs as such an insight), whereas Paul just seems optimistic that the performance hit for safe vs. unsafe AI can be made nearly negligible.

But I don't think MIRI has given very good arguments for why we might expect this; it would be great if someone can articulate or reference the best available arguments.

I have a very strong intuition that dauntingly large safety-performance trade-offs are extremely likely to persist in practice, thus the only answer to the "how do I compete" question seems to be "be the front-runner".

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-06T07:32:08.441Z · score: 0 (0 votes) · EA · GW

Will - I think "meta-reasoning" might capture what you mean by "meta-decision theory". Are you familiar with this research (e.g. Nick Hay did a thesis w/Stuart Russell on this topic recently)?

I agree that bounded rationality is likely to loom large, but I don't think this means MIRI is barking up the wrong tree... just that other trees also contain parts of the squirrel.

Comment by capybaralet on What Should the Average EA Do About AI Alignment? · 2017-03-01T23:34:16.735Z · score: 2 (2 votes) · EA · GW

I'm also very interested in hearing you elaborate a bit.

I guess you are arguing that AIS is a social rather than a technical problem. Personally, I think there are aspects of both, but that the social/coordination side is much more significant.

RE: "MIRI has focused in on an extremely specific kind of AI", I disagree. I think MIRI has aimed to study AGI in as much generality as possible and mostly succeeded in that (although I'm less optimistic than them that results which apply to idealized agents will carry over and produce meaningful insights in real-world resource-limited agents). But I'm also curious what you think MIRI's research is focusing on vs. ignoring.

I also would not equate technical AIS with MIRI's research.

Is it necessary to be convinced? I think the argument for AIS as a priority is strong so long as the concerns have some validity to them, and cannot be dismissed out of hand.

Comment by capybaralet on Essay contest: general considerations for evaluating small-scale giving opportunities ($300 for winning submission) · 2017-01-27T02:31:26.654Z · score: 3 (3 votes) · EA · GW

(cross posted on facebook):

I was thinking of applying... it's a question I'm quite interested in. The deadline is the same as ICML tho!

I had an idea I will mention here: funding pools:

  1. You and your friends whose values and judgement you trust and who all have small-scale funding requests join together.
  2. A potential donor evaluates one funding opportunity at random, and funds all or none of them on the basis of that evaluation.
  3. You have now increased the ratio of funding / evaluation available to a potential donor by a factor of #projects
  4. There is an incentive for you to NOT include people in your pool if you think their proposal is quite inferior to yours... however, you might be incentivized to include somewhat inferior proposals in order to reach a threshold where the combined funding opportunity is large enough to attract more potential donors.
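A toy calculation of the claimed ratio in step 3 (the dollar amounts are made up for illustration; only the mechanism comes from the steps above):

```python
# Toy model of the funding-pool idea: a donor spends one evaluation's
# worth of effort on a single randomly chosen proposal from the pool,
# and funds ALL members (or none) based on that one spot-check.

import random

def funding_per_evaluation(requests, pooled):
    """Dollars a donor can unlock with a single evaluation."""
    if pooled:
        _ = random.choice(requests)  # the donor's single random spot-check
        return sum(requests)         # ...unlocks the whole pool
    return requests[0]               # solo: one evaluation, one project

requests = [1000, 1000, 1000, 1000]
solo = funding_per_evaluation(requests, pooled=False)
pooled = funding_per_evaluation(requests, pooled=True)
assert pooled == 4 * solo  # ratio improves by a factor of #projects
```

The random spot-check is what keeps members honest: each project bears the full pool's funding risk if its own proposal is the one evaluated.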
Comment by capybaralet on Building Cooperative Epistemology (Response to "EA has a Lying Problem", among other things) · 2017-01-17T05:38:13.226Z · score: 0 (2 votes) · EA · GW

I was overall a bit negative on Sarah's post, because it demanded a bit too much attention, (e.g. the title), and seemed somewhat polemic. It was definitely interesting, and I learned some things.

I find the most evocative bit to be the idea that EA treats outsiders as "marks".
This strikes me as somewhat true, and sadly short-sighted WRT movement building. I do believe in the ideas of EA, and I think they are compelling enough that they can become mainstream.

Overall, though, I think it's just plain wrong to argue for an unexamined idea of honesty as some unquestionable ideal. I think doing so as a consequentialist, without a very strong justification, itself smacks of disingenuousness and seems motivated by the same phony and manipulative attitude towards PR that Sarah's article attacks.

What would be more interesting to me would be a thoughtful survey of potential EA perspectives on honesty, but an honest treatment of the subject does seem to be risky from a PR standpoint. And it's not clear that it would bring enough benefit to justify the cost. We probably will all just end up agreeing with common moral intuitions.

Comment by capybaralet on Why donate to 80,000 Hours · 2017-01-07T02:05:59.658Z · score: 1 (1 votes) · EA · GW

Do you have any info on how reliable self-reports are wrt counterfactuals about career changes and EWWC pledging?

I can imagine that people would not be very good at predicting that accurately.

Comment by capybaralet on Effective Altruism is Not a Competition · 2017-01-05T19:38:16.253Z · score: 6 (5 votes) · EA · GW

People are motivated both by:

  1. competition and status and
  2. cooperation and identifying with the successes of a group. I think we should aim to harness both of these forms of motivation.
Comment by capybaralet on Thoughts on the "Meta Trap" · 2017-01-05T18:13:17.713Z · score: 0 (0 votes) · EA · GW

"But maybe that's just because I am less satisfied with the current EA "business model"/"product" than most people."

Care to elaborate (or link to something?)

Comment by capybaralet on What the EA community can learn from the rise of the neoliberals · 2017-01-05T08:06:26.056Z · score: 0 (0 votes) · EA · GW

"This is something the EA community has done well at, although we have tended to focus on talent that current EA organization might wish to hire. It may make sense for us to focus on developing intellectual talent as well."

Definitely!! Are there any EA essay contests or similar? More generally, I've been wondering recently if there are many efforts to spread EA among people under the age of majority. The only example I know of is SPARC.

Comment by capybaralet on Resources regarding AI safety · 2017-01-04T22:07:39.354Z · score: 2 (2 votes) · EA · GW

EDIT: I forgot to link to the Google group:!forum/david-kruegers-80k-people

Hi! David Krueger (from Montreal and 80k) here. The advice others have given so far is pretty good.

My #1 piece of advice is: start doing research ASAP!
Start acting like a grad student while you are still an undergrad. This is almost a requirement to get into a top program afterwards. Find a supervisor and ideally try to publish a paper at a good venue before you graduate.

Stats is probably a bit more relevant than CS, but some of both is good. I definitely recommend learning (some) programming. In particular, focus on machine learning (esp. Deep Learning and Reinforcement Learning). Do projects, build a portfolio, and solicit feedback.

If you haven't already, please check out these groups I created for people wanting to get into AI Safety. There are a lot of resources to get you started in the Google Group, and I will be adding more in the near future. You can also contact me directly (see for contact info) and we can chat.

Comment by capybaralet on Two Strange Things About AI Safety Policy · 2017-01-04T19:49:42.835Z · score: 0 (0 votes) · EA · GW

Sure, but the examples you gave are more about tactics than content. What I mean is that there are a lot of people who are downplaying their level of concern about Xrisk in order to not turn off people who don't appreciate the issue. I think that can be a good tactic, but it also risks reducing the sense of urgency people have about AI-Xrisk, and can also lead to incorrect strategic conclusions, which could even be disastrous when they are informing crucial policy decisions.

TBC, I'm not saying we are lacking in radicals ATM, the level is probably about right. I just don't think that everyone should be moderating their stance in order to maximize their credibility with the (currently ignorant, but increasingly less so) ML research community.

Comment by capybaralet on Principia Qualia: blueprint for a new cause area, consciousness research with an eye toward ethics and x-risk · 2017-01-04T19:32:22.694Z · score: 0 (3 votes) · EA · GW

I think I was too terse; let me explain my model a bit more.

I think there's a decent chance (OTTMH, let's say 10%) that without any deliberate effort we make an AI which wipes out humanity, but is anyhow more ethically valuable than us (although not more than something which we deliberately design to be ethically valuable). This would happen, e.g., if this was the default outcome (e.g. if it turns out to be the case that intelligence ~ ethical value). This may actually be the most likely path to victory.*

There's also some chance that all we need to do to ensure that AI has (some) ethical value (e.g. due to having qualia) is X. In that case, we might increase our chance of doing X by understanding qualia a bit better.

Finally, my point was that I can easily imagine a scenario in which our alternatives are:

  1. Build an AI with 50% chance of being aligned, 50% chance of just being an AI (with P(AI has property X) = 90% if we understand qualia better, 10% else)
  2. Allow our competitors to build an AI with ~0% chance of being ethically valuable.

So then we obviously prefer option 1, and if we understand qualia better, option 1 becomes better.
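To spell out the comparison numerically (the probabilities come from the scenario above; the outcome values are made-up stand-ins: 1.0 for an aligned AI, 0.3 for an unaligned AI that merely has ethically relevant property X, 0 otherwise):

```python
# Back-of-envelope expected value for the two options described above.

def expected_value(p_aligned, p_property_x, v_aligned=1.0, v_x=0.3):
    """EV = P(aligned)*v_aligned + P(unaligned)*P(property X)*v_x."""
    return p_aligned * v_aligned + (1 - p_aligned) * p_property_x * v_x

ev_with_qualia_research = expected_value(0.5, 0.9)  # 0.5 + 0.135 = 0.635
ev_without              = expected_value(0.5, 0.1)  # 0.5 + 0.015 = 0.515
ev_competitor           = expected_value(0.0, 0.0)  # 0

assert ev_with_qualia_research > ev_without > ev_competitor
```

Whatever stand-in value one assigns to property X, as long as it is positive, understanding qualia better strictly improves option 1.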

* I notice as I type this that this may have some strange consequences RE high-level strategy; e.g. maybe it's better to just make something intelligent ASAP and hope that it has ethical value, because this reduces x-risk, and we might not be able to do much to change the distribution of the ethical value the AI we create produces anyhow. I tend to think that we should aim to be very confident that the AI we build is going to have lots of ethical value, but this may only make sense if we have a pretty good chance of succeeding.

Comment by capybaralet on Why I'm donating to MIRI this year · 2017-01-04T06:12:52.527Z · score: 0 (0 votes) · EA · GW

MIRI seems like the most value-aligned and unconstrained of the orgs.

OpenAI also seems pretty unconstrained, but I have no idea what their perspective on Xrisk is, and all reports are that there is no master plan there.

Comment by capybaralet on 2016 AI Risk Literature Review and Charity Comparison · 2017-01-04T06:11:19.797Z · score: 0 (0 votes) · EA · GW

Thanks for this!

A few comments:

RE: public policy / outreach:

“However, I now think this is a mistake.” What do you think is a mistake?

“Given this, I actually think policy outreach to the general population is probably negative in expectation.” I think this makes more sense if you see us as currently on or close to a winning path. I am more pessimistic about our current prospects for victory, so I favor a higher risk/reward. I tend to see paths to victory as involving a good chunk of the population having a decent level of understanding of AI and Xrisk.

“I think this is why a number of EA organisations seem to have seen sublinear returns to scale.” Which ones?

“There have been at least two major negative PR events, and a number of near misses.” again, I am very curious what you refer to!

Comment by capybaralet on Principia Qualia: blueprint for a new cause area, consciousness research with an eye toward ethics and x-risk · 2016-12-20T01:00:19.159Z · score: 1 (1 votes) · EA · GW

Hey I (David Krueger) remember we spoke about this a bit with Toby when I was at FHI this summer.

I think we should be aiming for something like CEV, but we might not get it, and we should definitely consider scenarios where we have to settle for less.

For instance, some value-aligned group might find that its best option (due to competitive pressures) is to create an AI which has a 50% probability of being CEV-like or "aligned via corrigibility", but has a 50% probability of (effectively) prematurely settling on a utility function whose goodness depends heavily on the nature of qualia.

If (as I believe) such a scenario is likely, then the problem is time-sensitive.

Comment by capybaralet on Two Strange Things About AI Safety Policy · 2016-09-28T21:29:56.589Z · score: 2 (2 votes) · EA · GW

In general, I think that people are being too conservative about addressing the issue. I think we need some "radicals" who aren't as worried about losing some credibility. Whether or not you want to try and have mainstream appeal, or just be straightforward with people about the issue is a strategic question that should be considered case-by-case.

Of course, it is a big problem that talking about AIS makes a good chunk of people think you're nuts. My impression is that most of those people are researchers, and that the general public is actually quite receptive to the idea (although maybe for the wrong reasons...)

Comment by capybaralet on Two Strange Things About AI Safety Policy · 2016-09-28T21:27:16.492Z · score: 0 (0 votes) · EA · GW

Right, I was going to mention the fact that AIS concerned people are very interested in courting the ML community, and very averse to anything which might alienate them, but it's already come up.

I'm not sure I agree with this strategy. I think we should maybe be more "good cop / bad cop" about it. I think the response so far from ML people is almost indefensible, and the AIS folks have won every debate so far, but there is of course this phenomenon with debate where you think that your side won ;).

If it ends up being necessary to slow down research, or, more generally, carefully control AI technology in some way, then we might have genuine conflicts of interest with AI researchers which can't be resolved solely by good cop tactics. This might be the case if, e.g., using SOTA AIS techniques significantly impairs performance or research, which I think is likely.

It's still a huge instrumental good to get more ML people into AIS and supportive of it, but I don't like to see AIS people bending over backwards to do this.

Comment by capybaralet on Problems and Solutions in Infinite Ethics · 2016-08-29T23:34:58.700Z · score: 1 (1 votes) · EA · GW

This is really interesting stuff, and thanks for the references.

A few comments:

It'd be nice to clarify what "finite intergenerational equity over [0,1]^N" means (specifically, the "over [0,1]^N" bit).

Why isn't the sequence 1, 1, 1, ... a counter-example to Thm 4.8 (dictatorship of the present)? I'm imagining exponential discounting, e.g. with a discount factor of 1/2, so the welfare function of this sequence should return 2 (but a different number if u_t is changed, for any t).
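Spelling out the arithmetic behind this question (assuming the constant stream u_t = 1 and a discount factor of 1/2, which is my reading of the example rather than the paper's notation):

```python
# Exponentially discounted welfare of a stream u with factor beta:
# sum_{t=0..inf} beta^t * u_t. For u_t = 1 and beta = 1/2 this is
# the geometric series 1/(1 - 1/2) = 2.

def discounted_welfare(u, beta=0.5, horizon=200):
    """Truncated geometric sum; converges fast for beta < 1."""
    return sum(beta**t * u(t) for t in range(horizon))

w = discounted_welfare(lambda t: 1.0)
assert abs(w - 2.0) < 1e-9

# Changing u_t at any single period t shifts the total by beta**t != 0,
# so no single period (including the present) fixes the value alone --
# which is the apparent tension with "dictatorship of the present".
w2 = discounted_welfare(lambda t: 0.0 if t == 3 else 1.0)
assert abs(w2 - (2.0 - 0.5**3)) < 1e-9
```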

Comment by capybaralet on I am Nate Soares, AMA! · 2015-08-20T05:15:48.365Z · score: 0 (0 votes) · EA · GW

I guess I am way late to the party, but.....

What part of the MIRI research agenda do you think is the most accessible to people with the least background?

How could AI alignment research be made more accessible?

Comment by capybaralet on I am Nate Soares, AMA! · 2015-08-20T04:05:18.825Z · score: 0 (0 votes) · EA · GW

Nailed it.

(anyone have) any suggestions for how to make progress in this area?