Is "founder effects" EA jargon? 2021-10-15T11:16:47.512Z
Has anyone written something using moral cluelessness to "debunk" anti-consequentialist thought experiments? 2021-10-09T13:09:51.417Z
When is cost a good proxy for environmental impact? How good and why? 2021-04-11T16:44:24.032Z
Any EAs familiar with Partha Dasgupta's work? 2021-03-31T00:45:16.672Z
Trying to help coral reefs survive climate change seems incredibly neglected. 2021-01-17T18:47:55.315Z
What's a good reference for finding (more) ethical animal products? 2020-12-26T21:39:07.897Z
Idea: "SpikeTrain" for lifelogging 2020-12-24T04:21:15.445Z
Idea: the "woketionary" 2020-12-11T00:04:18.045Z
Idea: an AI governance group colocated with every AI research group! 2020-12-07T23:41:05.196Z
Idea: "Change the World University" 2020-12-07T08:00:56.276Z
Idea: Resumes for Politicians 2020-12-03T23:54:00.269Z
EA essay contest for <18s 2017-01-22T22:07:15.983Z


Comment by capybaralet on Why AI alignment could be hard with modern deep learning · 2021-09-28T16:37:40.034Z · EA · GW

Great post!

This framing doesn't seem to capture the concern that even slight misspecification (e.g. a reward function that is a bit off) could lead to x-catastrophe.  

I think this is a big part of many people's concerns, including mine.

This seems somewhat orthogonal to the Saint/Sycophant/Schemer disjunction... or to put it another way, it seems like a Saint that is just not quite right about what your interests actually are (e.g. because they have alien biology and culture) could still be an x-risk.


Comment by capybaralet on Buck's Shortform · 2021-09-02T13:47:06.156Z · EA · GW

Reminds me of the House of Saud (although I'm not saying they have this goal, or any shared goal):
"The family in total is estimated to comprise some 15,000 members; however, the majority of power, influence and wealth is possessed by a group of about 2,000 of them. Some estimates of the royal family's wealth measure their net worth at $1.4 trillion"

Comment by capybaralet on Towards a Weaker Longtermism · 2021-08-16T21:15:26.537Z · EA · GW

IMO, the best argument against strong longtermism ATM is moral cluelessness.  

Comment by capybaralet on Most research/advocacy charities are not scalable · 2021-08-16T21:02:11.816Z · EA · GW

IMO, the main things holding back scaling are EA's (in)ability to identify good "shovel ready" ideas and talent within the community and allocate funds appropriately.  I think this is a very general problem that we should be devoting more resources to.  Related problems are training and credentialing, and solving common good problems within the EA community.

I'm probably not articulating all of this very well, but basically I think EA should focus a lot more on figuring out how to operate effectively, make collective decisions, and distribute resources internally.  

These are very general problems that haven't been solved very well outside of EA either.  But the EA community still probably has a lot to learn from orgs/people outside EA about this.  If we can make progress here, it can scale outside of the EA community as well.

Comment by capybaralet on Any EAs familiar with Partha Dasgupta's work? · 2021-04-11T16:57:40.173Z · EA · GW

I view economists as more like physicists working with spherical cows, and often happy to continue working that way.  Under this model, we should expect lots of specific blind spots, for them to be easy to identify, and for them to be readily acknowledged by many economists.  Economists are also not particularly concerned with the practical implications of the simplifications they make, so they would readily acknowledge many specific limitations of their models.  Another way of putting it: this is more of a blind spot for economics, not economists.

I'll also get back to this point about measurement... there's a huge space between "nature has intrinsic value" and "we can measure the extrinsic value of nature".  I think the most reasonable position is:
- Nature has some intrinsic value, because there are conscious beings in it (with a bonus because we don't understand consciousness well enough to be confident that we aren't under-counting).
- Nature has hard to quantify, long-term extrinsic value (in expectation), and we shouldn't imagine that we'll be able to quantify it appropriately any time soon.
- We should still try to quantify it sometimes, in order to use quantitative decision-making / decision-support tools.  But we should maintain awareness of the limitations of these efforts.

Comment by capybaralet on Any EAs familiar with Partha Dasgupta's work? · 2021-03-31T17:20:18.402Z · EA · GW

It hardly seems "inexplicable"... this stuff is harder to quantify, especially in terms of long-term value.  I think there's an interesting contrast between your comment and jackmalde's below: "It's also hardly news that GDP isn't a perfect measure."

So I don't really see why there should be a high level of skepticism of a claim that "economists haven't done a good job of modelling X[=value of nature]".  I'd guess most economists would emphatically agree with this sort of critique.

Or perhaps there's an underlying disagreement about what to do when we have a hard time modelling something: do we mostly just ignore it, or do we try to reason about it less formally?  I think the latter is clearly correct, but I get the sense a lot of people in EA would disagree (e.g. the "evidence-based charity" perspective seems to go against this).

Comment by capybaralet on SHIC Will Suspend Outreach Operations · 2021-03-08T13:24:55.866Z · EA · GW

I think this illustrates a harmful double standard.  Let me substitute a different cause area in your statement:
"Sounds like any future project meant to reduce x-risk will have to deal with the measurement problem".


Comment by capybaralet on SHIC Will Suspend Outreach Operations · 2021-03-08T13:23:23.675Z · EA · GW

Online meetings could be an alternative/supplement, especially in the post-COVID world.

Comment by capybaralet on SHIC Will Suspend Outreach Operations · 2021-03-08T13:22:23.990Z · EA · GW

Reiterating my other comments: I don't think it's appropriate to say that the evidence showed it made sense to give up.  As others have mentioned, there are measurement issues here.  So this is a case where absence of evidence is not strong evidence of absence.  

Comment by capybaralet on SHIC Will Suspend Outreach Operations · 2021-03-08T13:20:03.996Z · EA · GW

Just because they didn't get the evidence of impact they were aiming for doesn't mean it "didn't work".  

I understand if EAs want to focus on interventions with strong evidence of impact, but I think it's terrible comms (both for PR and for our own epistemics) to go around saying that interventions lacking such evidence don't work.

It's also pretty inconsistent; we don't seem to have that attitude about spending $$ on speculative longtermist interventions! (although I'm sure some EAs do, I'm pretty sure it's a minority view).

Comment by capybaralet on SHIC Will Suspend Outreach Operations · 2021-03-08T13:17:28.805Z · EA · GW

Thanks for this update, and for your valuable work.

I must admit I was frustrated by reading this post.  I want this work to continue, and I don't find the levels of engagement you report surprising or worth massively updating on (i.e. suspending outreach).

I'm also bothered by the top-level comments assuming that this didn't work and should've been abandoned.  What you've shown is that you could not provide the type of strong evidence you hoped for of the program's effectiveness, NOT that it didn't work!

Basically, I think there should be a strong prior that this type of work is effective, and I think the question should be how to do a good job of it.  So I want these results to be taken as a baseline, and for your org to continue iterating and trying to improve your outreach, rather than giving up on it.  And I want funders to see your vision and stick with you as you iterate.  

I'm frustrated by the focus on short-term, measurable results here.  I don't expect you to be able to measure the effects well. 

Overall, I feel like the results you've presented here inspire a lot of ideas and questions, and I think continued work to build a better model of how outreach to high schoolers works seems very valuable.  I think this should be approached with more of a scientific/tinkering/start-up mindset of "we have this idea that we believe in and we're going to try our damndest to make it work before giving up!"

I think part of "making it work" here includes figuring out how to gauge the impact.  How do teachers normally tell if they're having an impact?  Probably they mostly trust their gut.  So is there a way to ask them?  (The obvious risk is that they'll tell you a white lie.)

Maybe you think continuing this work is not your comparative advantage, or that you're not the org to do it, which seems fine.  But in that case I'd rather you try to hire a new "CEO"/team for SHIC (if possible) than suspend the outreach and throw away existing institutional knowledge.

RE evaluating effectiveness:
I'd be very curious to know more about the few students who did engage outside of class.  In my mind, the evidence for effectiveness hinges to a significant extent on the quality and motivation of the students who continue engaging.

I think there are other ways you could gauge effectiveness, mostly by recruiting teachers into this process.  They were more eager for your material than you expected (which makes sense, since it's less work for them!).  So you can ask for things in return: follow-up surveys, assignments, quiz questions, or any other form of evaluation of how well the content stuck and whether they think it had any impact.

A few more specific questions:
- RE footnote 3: why not use "EA" in the program?  This seems mildly dishonest and liable to reduce expected impact.
- RE footnote 7: why did they feel inappropriate?

Comment by capybaralet on PhD student mutual line-manager invitation · 2021-03-08T12:46:49.090Z · EA · GW

I have a recommendation: try to get at least 3 people, so you aren't managing your manager.  I think accountability and social dynamics would be better that way, since:
- I suspect part of why line managers work for most people is because they have some position of authority that makes you feel obligated to satisfy them.  If you are in equal positions, you'd mostly lose that effect. 
- If there are only 2 of you, it's easier to have a cycle of defection where accountability and standards slip.  If you see the other person slacking, you feel more OK with slacking.  Whereas if you don't see the work of your manager, you can imagine that they are always on top of their shit. 

Comment by capybaralet on Trying to help coral reefs survive climate change seems incredibly neglected. · 2021-02-02T00:01:27.069Z · EA · GW

(Sorry, this is a bit stream-of-conscious):

I assume it's because humans rely on natural ecosystems in a variety of ways in order to have the conditions necessary for agriculture, life, etc.  So, as with climate change, the long-term cost of mitigation is simply massive... really these numbers should not be thought of as very meaningful, I think, since the kinds of disruptions and destruction we are talking about are not easily measured in $s.

TBH, I find it not-at-all surprising that saving coral reefs would have a huge impact, since they are basically part of the backbone of the entire global ocean ecosystem, and this stuff is all connected, etc.

I think environmentalism is often portrayed as some sort of hippy-dippy sentimentalism and contrasted with humanist values and economic good sense, and I've been a bit surprised how prevalent that sort of attitude seems to be in EA.  I'm not trying to say that either of you in the thread have this attitude; it's more just that I was reminded of it by these comments... it seems like I have a much stronger prior that protecting the environment is good for people's long-term future (e.g. like most people here have probably heard the idea that all the biodiversity we're destroying could have massive scientific implications, e.g. leading to the development of new materials and drugs).

I think the reality is that we're completely squandering the natural resources of the earth, and all of this only looks good for people in the short term, or if we expect to achieve technological independence from nature.  I think it's very foolhardy to assume that we will achieve technological independence from nature, and doing so is a source of x-risk.  (TBC, I'm not an expert on any of this; just sharing my perspective.)

To be clear, I also think that AI timelines are likely to be short, and AI x-risk mostly dominates my thinking about the future.  If we can build aligned, transformative AI, there is a good chance that we will be able to leverage to develop technological independence from nature.  At the same time, I think our current irresponsible attitude towards managing natural resources doesn't bode well, even if we grant ourselves huge technological advances (it seems to me that many problems facing humanity now require social, not technological solutions; the technology is often already there...).

Comment by capybaralet on Big List of Cause Candidates · 2021-01-19T19:17:16.500Z · EA · GW

Yeah...  it's not at all my main focus, so I'm hoping to inspire someone else to do that! :) 

Comment by capybaralet on Big List of Cause Candidates · 2021-01-17T22:54:05.667Z · EA · GW

I recommend changing the "climate change" header to something a bit broader (e.g. "environmentalism" or "protecting the natural environment", etc.).  It is a shame that (it seems) climate change has come to eclipse/subsume all other environmental concerns in the public imagination.  While most environmental issues are exacerbated by climate change, solving climate change will not necessarily solve them.

A specific cause worth mentioning is preventing the collapse of key ecosystems, e.g. coral reefs:


Comment by capybaralet on Idea: "SpikeTrain" for lifelogging · 2021-01-04T20:25:30.559Z · EA · GW

Thanks for the pointer!  I think many EAs are interested in QS, but I agree it's a bit tangential.

Comment by capybaralet on Improving Institutional Decision-Making: a new working group · 2021-01-02T21:07:19.803Z · EA · GW

IIRC the Ethereum Foundation is using QF somehow.
But it's probably best just to get in touch with someone who knows more of what's going on at RXC.
Not sure who that would be OTTMH, unfortunately.

Comment by capybaralet on Improving Institutional Decision-Making: a new working group · 2020-12-29T18:26:32.304Z · EA · GW

I think you guys are already aware of RadicalXChange.  It's a bit different in focus, but I know they are excited about trying out mechanisms like QV/QF in institutional settings.
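For reference, the QF (quadratic funding) mechanism mentioned here subsidizes each project by the square of the sum of the square roots of its individual contributions, minus the raw total, so many small donors attract a much larger match than one large donor giving the same amount. A minimal sketch (function name is my own):

```python
import math

def quadratic_funding_match(contributions):
    """QF/CLR subsidy: (sum of sqrt(c_i))^2 minus the raw total raised."""
    raw = sum(contributions)
    matched = sum(math.sqrt(c) for c in contributions) ** 2
    return matched - raw

# 100 donors of $1 vs. 1 donor of $100 (same raw total, very different match):
print(quadratic_funding_match([1] * 100))  # -> 9900.0
print(quadratic_funding_match([100]))      # -> 0.0
```

In practice the subsidy is scaled to fit a fixed matching pool, and sybil resistance (one person splitting a donation into many) is the hard part.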

Comment by capybaralet on What's a good reference for finding (more) ethical animal products? · 2020-12-29T17:37:19.335Z · EA · GW

It was a few years back that I looked into it, and I didn't try too hard.  Sad to see the PETA link.
I'm basically looking for a reference that summarizes someone else's research (so I don't have to do my own).

Comment by capybaralet on Idea: the "woketionary" · 2020-12-12T20:12:16.683Z · EA · GW

"This doesn't seem like a great use of time. For one thing, I think it gets the psychology of political disagreements backwards. People don't simply disagree with each other because they don't understand each others' words. Rather they'll often misinterpret words to meet political ends."

It's not one or the other.  Anyways, having shared definitions also prevents deliberate/strategic misinterpretation.

"I also question anyone's ability to create such an 'objective/apolitical' dictionary. As you note, even the term 'woke' can have a negative connotation. (And in some circles it still has a positive connotation.) Some words are essentially political footballs in today's climate. For example, in this dictionary what would be the definition of the word 'woman'?"

Sure, nothing is ever apolitical.  But you can try to make it less so.

"I'm also unconvinced that this is an EA type of activity. For the standard reasons, I think EA should be very cautious when approaching politics. It seems like creating a central hub for people to look up politically loaded terms is the opposite of this."

What do you mean "the standard reasons"?   I don't think it should be EA "branded".  I don't believe EAs should reason from cause areas to interventions; rather I think we should evaluate each intervention independently.

Comment by capybaralet on What are the most common objections to “multiplier” organizations that raise funds for other effective charities? · 2020-12-11T00:07:55.535Z · EA · GW

Do you disagree that the EA community at large seems less excited about multiplier orgs vs. more direct orgs?  

Comment by capybaralet on What are the most common objections to “multiplier” organizations that raise funds for other effective charities? · 2020-12-09T04:32:23.269Z · EA · GW

I'm skeptical of multiplier organizations' relative effectiveness because the EA community doesn't seem that excited about them.

(P.S.: This is actually probably my #1 reason, as someone who hasn't spent much time thinking about where people should donate.  I suspect a lot of people are wary of seeming too enthusiastic because they don't want EA to look like a pyramid scheme.)

Comment by capybaralet on The Intellectual and Moral Decline in Academic Research · 2020-12-09T04:19:03.288Z · EA · GW

Aren't grant lotteries a more obvious solution than the three you mention?

Comment by capybaralet on Idea: "Change the World University" · 2020-12-09T04:08:26.683Z · EA · GW
  • To some extent, you don't need to.  I don't believe there's a very clear distinction between the 2 camps.
  • To begin with, this university would be viewed as weird, and I suspect, would not be particularly attractive to virtue signalers as a result.  This would help establish a culture of genuine idealists.
  • This is part of the mandate of the admissions decision-makers.  I expect if you had good people, you could do a pretty good job of screening applicants.

Comment by capybaralet on Effective charities for improving institutional decision making and improving global coordination · 2020-12-09T03:17:34.944Z · EA · GW

What does "effective charity" mean in this context?

Comment by capybaralet on Idea: "Change the World University" · 2020-12-07T23:22:38.778Z · EA · GW

What you describe is part of what I meant by "jadedness".

"If they were actually trying to change the world -- if they were actually strongly motivated to make the world a better place, etc. -- the stuff they learn in college wouldn't stop them."

^ I disagree.  Or rather, I should say, there are a lot of people who are not-so-strongly motivated to make the world a better place, and so get burned out and settle into a typical lifestyle.  I think this outcome would be much less likely at a place like "Change the World University", both because it would feel worse to give up on that goal (you would constantly be reminded of that), and because your peers would be (self-/)selected for being passionate about changing the world.

Comment by capybaralet on Idea: Resumes for Politicians · 2020-12-07T08:06:08.286Z · EA · GW

Thanks for that! 
I'm interested if you have other examples.

This one looks similar, but not that similar.  The whole framing/vision is different.

When I visit their webpage, the message I get is: "hey, do you maybe want to opt in to this thing to tell us about yourself because you can't get any real publicity?"

The message I want to send is: "Politicians are job candidates; why don't we make them apply/grovel for a job like everyone else?"

Comment by capybaralet on Correlations Between Cause Prioritization and the Big Five Personality Traits · 2020-12-03T23:55:31.268Z · EA · GW

I think I understand what you are doing, and disagree with it being a way of meaningfully addressing my concern.  

Comment by capybaralet on Correlations Between Cause Prioritization and the Big Five Personality Traits · 2020-10-26T17:55:13.463Z · EA · GW

It seems like you are calculating the chance that NONE of these results are significant, not the chance that MOST of them ARE (?)

Comment by capybaralet on Correlations Between Cause Prioritization and the Big Five Personality Traits · 2020-09-25T06:23:02.847Z · EA · GW
"Out of 55 2-sample t-tests, we would expect 2 to come out 'statistically significant' due to random chance, but I found 10, so we can expect most of these to point to actually meaningful differences represented in the survey data."

Is there a more rigorous form of this argument?
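One somewhat more rigorous version: under the global null (all 55 null hypotheses true, tests independent), the count of "significant" results is Binomial(55, 0.05), with mean 2.75 (not 2), and the probability of seeing 10 or more is on the order of 10⁻⁴. A sketch of that calculation — note it ignores that tests run on the same survey data are correlated, which weakens the conclusion:

```python
import math

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more false positives."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_tests, alpha, n_hits = 55, 0.05, 10
expected = n_tests * alpha                    # 2.75 expected under the global null
p_value = binom_tail(n_tests, n_hits, alpha)  # P(>=10 "significant" | all nulls true)
print(f"expected false positives: {expected:.2f}")
print(f"P(>=10 significant by chance): {p_value:.1e}")
```

Even so, this only bounds the chance that *all* 10 are noise; it doesn't tell you *which* ones are real, which is what a proper multiple-comparisons correction (e.g. Benjamini-Hochberg FDR control) is for.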

Comment by capybaralet on On Collapse Risk (C-Risk) · 2020-03-12T19:59:32.047Z · EA · GW

I just skimmed the post.

"Many of the most pressing threats to the humanity are far more likely to cause collapse than be an outright existential threat with no ability for civilisation to recover."

This claim is not supported, and I think most people who study catastrophic risks (they already coined the acronym C-risk, sorry!) and x-risks would disagree with it.

In fact, civilization collapse is considered fairly unlikely by many, although Toby Ord thinks it hasn't been properly explored (see his recent 80k interview).

AI in particular (which many believe is easily the largest x-risk) seems quite unlikely to cause civilization collapse or other c-risks without also causing x-risk.

From what I understand, the loss of welfare is probably much less significant than the decreased ability to prevent X-risks. Although, since X-risks are thought to be mostly anthropogenic, civilization collapse could actually significantly reduce immediate x-risk.

In general, I believe the thinking goes that we lose quite a small fraction of the light cone over the course of, e.g., a few centuries... this is why things like "long reflection periods" seem like good ideas. I'm not sure anyone has tried to square that with simulation hypothesis or other unknown-unknown type x-risks, which seem like they should make us discount much more aggressively. I guess the idea there is probably that most of the utility lies in universes with long futures, so we should prioritize our effects on them.

I suspect someone who has more expertise on this topic might want to respond more thoroughly.

Comment by capybaralet on On Collapse Risk (C-Risk) · 2020-03-12T19:48:41.948Z · EA · GW

These are not the same thing.  GCR is just anything that's bad on a massive scale; civilization doesn't have to collapse.

Comment by capybaralet on Could we solve this email mess if we all moved to paid emails? · 2019-08-15T02:30:17.971Z · EA · GW

Overall, I'm intrigued and like this general line of thought. A few thoughts on the post:

  • If you're using payments like this, it's not really email anymore, right? So maybe it's better to think about this as being about "online messaging".
  • Another (complementary) way to improve email is to make it like facebook where you have to agree to connect with someone before they can message you.
  • Like many ideas about using $$ as a signal, I think it might be better if we instead used a domain-specific credit system, where credits are allotted to individuals at some fixed rate, or according to some rules, and cannot be purchased. People can find ways of subverting that, but they can also subvert the paid email idea (just open all their emails and take the $$ without reading or responding meaningfully).
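A minimal sketch of the credit-system variant from the last bullet, assuming credits accrue at a fixed per-day rate and cannot be purchased (class and method names are hypothetical):

```python
import time

class MessagingCredits:
    """Toy ledger: credits accrue at a fixed rate per day and can't be bought."""

    def __init__(self, rate_per_day=5.0):
        self.rate = rate_per_day
        self.balances = {}  # user -> (credits, last_refresh_timestamp)

    def _refresh(self, user, now):
        credits, last = self.balances.get(user, (0.0, now))
        accrued = self.rate * (now - last) / 86400.0  # 86400 s = 1 day
        self.balances[user] = (credits + accrued, now)

    def send(self, sender, cost=1.0, now=None):
        """Spend `cost` credits to send a message; refuse if balance too low."""
        now = time.time() if now is None else now
        self._refresh(sender, now)
        credits, last = self.balances[sender]
        if credits < cost:
            return False
        self.balances[sender] = (credits - cost, last)
        return True

ledger = MessagingCredits(rate_per_day=5.0)
print(ledger.send("alice", now=0))      # brand-new user, no credits yet -> False
print(ledger.send("alice", now=86400))  # one day later, 5 credits accrued -> True
```

The anti-subversion property is in `_refresh`: balances only grow with elapsed time, so there is no "deposit" path to buy attention, unlike the paid-email scheme.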
Comment by capybaralet on How Europe might matter for AI governance · 2019-07-30T00:57:56.820Z · EA · GW

To answer your question: no.

I basically agree with this comment, but I'd add that the "diminishing returns" point is fairly generic, and should be coupled with some arguments about why there are very rapidly diminishing returns in US/China (seems false) or non-trivial returns in Europe (seems plausible, but non-obvious, and also one of the focuses of the OP).

Comment by capybaralet on How Europe might matter for AI governance · 2019-07-26T00:41:15.421Z · EA · GW

RE "why look at Europe at all?", I'd say Europe's gusto for regulation is a good reason to be interested (you discuss that stuff later, but for me it's the first reason I'd give). It's also worth mentioning the "right to an explanation" as well as GDPR.

Comment by capybaralet on Top Tips on how to Choose an Effective Charity · 2018-04-08T14:25:11.183Z · EA · GW

Based on the report [1], it's a bit misleading to say that they are a charity doing $35 cataracts. The report seems pretty explicit that donations to the charity are used for other activities.

Comment by capybaralet on [deleted post] 2017-10-16T03:00:35.800Z

I strongly agree that independent thinking seems undervalued (in general and in EA/LW). There is also an analogy with ensembling in machine learning.

By "independent" I mean "thinking about something without considering others' thoughts on it" or something to that effect... it seems easy for people's thoughts to converge too much if they aren't allowed to develop in isolation.

Thinking about it now, though, I wonder if there isn't some even better middle ground; in my experience, group brainstorming can be much more productive than independent thought as I've described it.

There is a very high-level analogy with evolution: I imagine sexual reproduction might create more diversity in a population than horizontal gene transfer, since in the latter case, an idea(=gene) which seems good could rapidly become universal, and thus "local optima" might be more of a problem for the population (I have no idea if that's actually how this works biologically... in fact, it seems like it might not be, since at least some viruses/bacteria seem to do a great job of rapidly mutating to become resistant to defences/treatments.)
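The ensembling analogy above can be made concrete: for n estimators each with error variance σ² and pairwise error correlation ρ, the variance of their average is ρσ² + (1−ρ)σ²/n, so the gain from aggregating opinions comes almost entirely from their independence (low ρ). A toy simulation, illustrative only (names are my own):

```python
import random

def var_of_group_average(n_members, rho, sigma=1.0, trials=20000, seed=0):
    """Empirical variance of the mean of n estimators whose errors share a
    common component (pairwise correlation rho) plus independent noise."""
    rng = random.Random(seed)
    shared_sd = (rho ** 0.5) * sigma        # groupthink component
    indep_sd = ((1 - rho) ** 0.5) * sigma   # independent-thought component
    means = []
    for _ in range(trials):
        shared = rng.gauss(0, shared_sd)
        errors = [shared + rng.gauss(0, indep_sd) for _ in range(n_members)]
        means.append(sum(errors) / n_members)
    mu = sum(means) / trials
    return sum((m - mu) ** 2 for m in means) / trials

# 10 fully independent thinkers vs. 10 highly correlated ones:
print(var_of_group_average(10, rho=0.0))  # theory: 1/10  = 0.10
print(var_of_group_average(10, rho=0.8))  # theory: 0.8 + 0.2/10 = 0.82
```

With ρ = 0.8, adding more members barely helps — the ρσ² term is an irreducible floor, which is the quantitative version of "thoughts converging too much".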

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-10-10T16:30:34.849Z · EA · GW

it's cross-posted on LW:

Comment by capybaralet on [deleted post] 2017-10-07T16:14:19.001Z

Thanks for writing this. My TL;DR is:

  1. AI policy is important, but we don’t really know where to begin at the object level

  2. You can potentially do 1 of 3 things, ATM: A. "disentanglement" research; B. operational support for (e.g.) FHI; C. get in position to influence policy, and wait for policy objectives to be cleared up

  3. Get in touch / Apply to FHI!

I think this is broadly correct, but have a lot of questions and quibbles.

  • I found “disentanglement” unclear. [14] gave the clearest idea of what this might look like. A simple toy example would help a lot.
  • Can you give some idea of what an operations role looks like? I find it difficult to visualize, and I think uncertainty makes it less appealing.
  • Do you have any thoughts on why operations roles aren’t being filled?
  • One more policy that seems worth starting on: programs that build international connections between researchers (especially around policy-relevant issues of AI (i.e. ethics/safety)).
  • The timelines for effective interventions in some policy areas may be short (e.g. 1-5 years), and it may not be possible to wait for disentanglement to be “finished”.
  • Is it reasonable to expect the “disentanglement bottleneck” to be cleared at all? Would disentanglement actually make policy goals clear enough? Trying to anticipate all the potential pitfalls of policies is a bit like trying to anticipate all the potential pitfalls of a particular AI design or reward specification… fortunately, there is a bit of a disanalogy in that we are more likely to have a chance to correct mistakes with policy (although that still could be very hard/impossible). It seems plausible that “start iterating and create feedback loops” is a better alternative to the “wait until things are clearer” strategy.
Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-11T01:15:51.989Z · EA · GW

My main comments:

  1. As others have mentioned: great post! Very illuminating!

  2. I agree value-learning is the main technical problem, although I’d also note that value-learning related techniques are becoming much more popular in mainstream ML these days, and hence less neglected. Stuart Russell has argued (and I largely agree) that things like IRL will naturally become a more popular research topic (but I’ve also argued this might not be net-positive for safety).

  3. My main comment wrt the value of HRAD (3a) is: I think HRAD-style work is more about problem definitions than solutions. So I find it to be somewhat orthogonal to the other approach of “learning to reason from humans” (L2R). We don’t have the right problem definitions, at the moment; we know that the RL framework is a leaky abstraction. I think MIRI has done the best job of identifying the problems which could result from our current leaky abstractions, and working to address them by improving our understanding of what problems need to be solved.

  4. It’s also not clear that human reasoning can be safely amplified; the relative safety of existing humans may be due to our limited computational / statistical resources, rather than properties of our cognitive algorithms. But this argument is not as strong as it seems; see comment #3 below.

A few more comments:

  1. RE 3b: I don’t really think the AI community’s response to MIRI’s work is very informative, since it’s just not on people’s radar. The problems are not well known or understood, and the techniques are (AFAIK) not very popular or in vogue (although I’ve only been in the field for 4 years, and only studied machine-learning based approaches to AI). I think decision theory was already a relatively well known topic in philosophy, so I think philosophy would naturally be more receptive to these results.

  2. I’m unconvinced about the feasibility of Paul’s approach**, and share Wei Dai’s concerns about it hinging on a high level of competitiveness. But I also think HRAD suffers from the same issues of competitiveness (this does not seem to be MIRI’s view, which I’m confused by). This is why I think solving global coordination is crucial.

  3. A key missing (assumed?) argument here is that L2R can be a stepping stone, e.g. providing narrow or non-superintelligent AI capabilities which can be applied to AIS problems (e.g. making much more progress on HRAD than MIRI). To me this is a key argument for L2R over HRAD, and generally a source of optimism. I’m curious if this argument plays a significant role in your thought; in other words, is it that HRAD problems don’t need to be solved, or just that the most effective solution path goes through L2R? I’m also curious about the counter-argument for pursuing HRAD now: i.e. what role does MIRI anticipate safe advanced (but not general / superhuman) intelligent systems to play in HRAD?

  4. An argument for more funding for MIRI which isn’t addressed is the apparent abundance of wealth at the disposal of Good Ventures. Since funding opportunities are generally scarce in AI Safety, I think every decent opportunity should be aggressively pursued. There are 3 plausible arguments I can see for the low amount of funding to MIRI: 1) concern of steering other researchers in unproductive directions 2) concern about bad PR 3) internal politics.

  5. Am I correct that there is a focus on shorter timelines (e.g. <20 years)?

Briefly, my overall perspective on the future of AI and safety relevance is:

  1. There ARE fundamental insights missing, but they are unlikely to be key to building highly capable OR safe AI.

  2. Fundamental insights might be crucial for achieving high confidence in a putatively safe AI (but perhaps not for developing an AI which is actually safe).

  3. The HRAD line of research is likely to uncover mostly negative results (à la AIXI’s arbitrary dependence on its prior).

  4. Theory is behind empiricism, and the gap is likely to grow; this is the main reason I’m a bit pessimistic about theory being useful. On the other hand, I think most paths to victory involve using capability-control for as long as possible while transitioning to completely motivation-control based approaches, so conditioning on victory, it seems more likely that we solve more fundamental problems (i.e. “we have to solve these problems eventually”).

** The two main reasons are: 1) I don’t think it will be competitive, and 2) I suspect it will be difficult to prevent compounding errors in a bootstrapping process that yields superintelligent agents.

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-06T11:55:15.349Z · EA · GW

My point was that HRAD potentially enables the strategy of pushing mainstream AI research away from opaque designs (which are hard to compete with while maintaining alignment, because you don't understand how they work and you can't just blindly copy the computation that they do without risking safety), whereas in your approach you always have to worry about "how do I compete with an AI that doesn't have an overseer, or has an overseer who doesn't care about safety and just lets the AI use whatever opaque and potentially dangerous technique it wants?"

I think both approaches potentially enable this, but are VERY unlikely to deliver. MIRI seems more bullish that fundamental insights will yield AI that is just plain better (Nate gave me the analogy of Judea Pearl coming up with Causal PGMs as such an insight), whereas Paul just seems optimistic that we can get a somewhat negligible performance hit for safe vs. unsafe AI.

But I don't think MIRI has given very good arguments for why we might expect this; it would be great if someone can articulate or reference the best available arguments.

I have a very strong intuition that dauntingly large safety-performance trade-offs are extremely likely to persist in practice, thus the only answer to the "how do I compete" question seems to be "be the front-runner".

Comment by capybaralet on My current thoughts on MIRI's "highly reliable agent design" work · 2017-08-06T07:32:08.441Z · EA · GW

Will - I think "meta-reasoning" might capture what you mean by "meta-decision theory". Are you familiar with this research (e.g. Nick Hay recently did a thesis with Stuart Russell on this topic)?

I agree that bounded rationality is likely to loom large, but I don't think this means MIRI is barking up the wrong tree... just that other trees also contain parts of the squirrel.

Comment by capybaralet on What Should the Average EA Do About AI Alignment? · 2017-03-01T23:34:16.735Z · EA · GW

I'm also very interested in hearing you elaborate a bit.

I guess you are arguing that AIS is a social rather than a technical problem. Personally, I think there are aspects of both, but that the social/coordination side is much more significant.

RE: "MIRI has focused in on an extremely specific kind of AI", I disagree. I think MIRI has aimed to study AGI in as much generality as possible and mostly succeeded in that (although I'm less optimistic than them that results which apply to idealized agents will carry over and produce meaningful insights for real-world resource-limited agents). But I'm also curious what you think MIRI's research is focusing on vs. ignoring.

I also would not equate technical AIS with MIRI's research.

Is it necessary to be convinced? I think the argument for AIS as a priority is strong so long as the concerns have some validity to them, and cannot be dismissed out of hand.

Comment by capybaralet on Essay contest: general considerations for evaluating small-scale giving opportunities ($300 for winning submission) · 2017-01-27T02:31:26.654Z · EA · GW

(cross posted on facebook):

I was thinking of applying... it's a question I'm quite interested in. The deadline is the same as ICML's, though!

I had an idea I will mention here: funding pools:

  1. You and your friends whose values and judgement you trust and who all have small-scale funding requests join together.
  2. A potential donor evaluates one funding opportunity at random, and funds all or none of them on the basis of that evaluation.
  3. You have now increased the ratio of funding to evaluation effort available to a potential donor by a factor equal to the number of projects in the pool.
  4. There is an incentive for you to NOT include people in your pool if you think their proposal is quite inferior to yours... however, you might be incentivized to include somewhat inferior proposals in order to reach a threshold where the combined funding opportunity is large enough to attract more potential donors.
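The funding-pool mechanism above can be sketched in a few lines of code. This is a minimal toy model, not anything from the original post: the pool contents, the names, and the all-or-nothing funding rule applied to a single randomly sampled proposal are all illustrative assumptions.

```python
import random

def pooled_funding(pool, evaluate, rng=random):
    """Simulate the pooled-funding idea: a donor evaluates ONE proposal
    drawn at random from the pool and funds the whole pool iff it passes.

    `pool` is a list of (name, amount_requested) tuples;
    `evaluate` maps a proposal to True (fund) or False (reject).
    Returns the total amount funded (0 if the sampled proposal fails).
    """
    sampled = rng.choice(pool)
    if evaluate(sampled):
        # All-or-nothing: one evaluation decides the entire pool.
        return sum(amount for _, amount in pool)
    return 0

# Hypothetical pool: three small requests bundled together.
pool = [("alice", 2000), ("bob", 1500), ("carol", 2500)]

# One evaluation now moves $6000 rather than ~$2000, so the
# funding/evaluation ratio scales with the number of pooled projects.
total = pooled_funding(pool, evaluate=lambda p: True)
print(total)  # 6000
```

Note that the random sampling is what creates the incentive described in point 4: each member's expected funding depends on every other member's proposal passing evaluation.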
Comment by capybaralet on Building Cooperative Epistemology (Response to "EA has a Lying Problem", among other things) · 2017-01-17T05:38:13.226Z · EA · GW

I was overall a bit negative on Sarah's post, because it demanded a bit too much attention (e.g. the title) and seemed somewhat polemic. It was definitely interesting, and I learned some things.

I find the most evocative bit to be the idea that EA treats outsiders as "marks".
This strikes me as somewhat true, and sadly short-sighted with respect to movement building. I do believe in the ideas of EA, and I think they are compelling enough that they can become mainstream.

Overall, though, I think it's just plain wrong to argue for an unexamined idea of honesty as some unquestionable ideal. I think doing so as a consequentialist, without a very strong justification, itself smacks of disingenuousness and seems motivated by the same phony and manipulative attitude towards PR that Sarah's article attacks.

What would be more interesting to me would be a thoughtful survey of potential EA perspectives on honesty, but an honest treatment of the subject does seem to be risky from a PR standpoint. And it's not clear that it would bring enough benefit to justify the cost. We probably will all just end up agreeing with common moral intuitions.

Comment by capybaralet on Why donate to 80,000 Hours · 2017-01-07T02:05:59.658Z · EA · GW

Do you have any info on how reliable self-reports are with respect to counterfactuals about career changes and GWWC pledging?

I can imagine that people would not be very good at predicting that accurately.

Comment by capybaralet on Effective Altruism is Not a Competition · 2017-01-05T19:38:16.253Z · EA · GW

People are motivated both by:

  1. competition and status and
  2. cooperation and identifying with the successes of a group.

I think we should aim to harness both of these forms of motivation.
Comment by capybaralet on Thoughts on the "Meta Trap" · 2017-01-05T18:13:17.713Z · EA · GW

"But maybe that's just because I am less satisfied with the current EA "business model"/"product" than most people."

Care to elaborate (or link to something)?

Comment by capybaralet on What the EA community can learn from the rise of the neoliberals · 2017-01-05T08:06:26.056Z · EA · GW

"This is something the EA community has done well at, although we have tended to focus on talent that current EA organization might wish to hire. It may make sense for us to focus on developing intellectual talent as well."

Definitely!! Are there any EA essay contests or similar? More generally, I've been wondering recently if there are many efforts to spread EA among people under the age of majority. The only example I know of is SPARC.