TAI Safety Bibliographic Database 2020-12-22T16:03:54.484Z
Meetup : New York City 2014-09-16T15:00:05.440Z


Comment by Jess_Riedel on Certificates of impact · 2021-05-04T15:18:54.857Z · EA · GW

Paul Graham writes that Noora Health is doing something like this.

Comment by Jess_Riedel on TAI Safety Bibliographic Database · 2020-12-26T18:32:32.565Z · EA · GW

> Regarding your 4 criteria, I think they don't really delineate how to make the sort of judgment calls we're discussing here, so it really seems like it should be about a 5th criterion that does delineate that.

Sorry I was unclear.  Those were just 4 desiderata that the criteria need to satisfy; the desiderata weren't intended to fully specify the criteria.

> If a small group of researchers at MIRI were trying to do work on verification but not getting much traction in the academic community, my intuition is that their papers would reliably meet your criteria.

Certainly possible, but I think this would partly be because MIRI would explicitly talk in their paper about the (putative) connection to TAI safety, which makes it a lot easier for me to see. (Alternative interpretation: it would be tricking me, a non-expert, into thinking there was a more substantive connection to TAI safety than is actually there.)  I am trying not to penalize researchers for failing to talk explicitly about TAI, but I am limited.

I think it's more likely the database has inconsistencies of the kind you're pointing at from CHAI, OpenAI, and (as you've mentioned) DeepMind, since these organizations have a self-described (partial) safety focus while still doing lots of non-safety and near-term-safety research.  When confronted with such inconsistencies, I will lean heavily toward not including any of them, since this seems like the only feasible choice given my resources. In other words, I select your final option: "The hypothetical MIRI work shouldn't have made the cut".

> I definitely agree that you shouldn't just include every paper on robustness or verification, but perhaps at least early work that led to an important/productive/TAI-relevant line should be included

Here I understand you to be suggesting that we use a notability criterion that can make up for the connection to TAI safety being less direct.  I am very open to this suggestion, and indeed I think an ideal database would use criteria like this.  (It would make the database more useful to both researchers and donors.)  My chief concern is just that I have no way to do this right now because I am not in a position to judge the notability.  Even after looking at the abstracts of the work by Raghunathan et al. and Wong & Kolter, I, as a layman, am unable to tell that they are quite notable.  

Now, I could certainly infer notability by (1) talking to people like you and/or (2) looking at a citation trail.  (Note that a citation count is insufficient because I'd need to know it's well cited by TAI safety papers specifically.)  But this is just not at all feasible for me to do for a bunch of papers, much less every paper that initially looked equally promising to my untrained eyes. This database is a personal side project, not my day job.  So I really need some expert collaborators or, at the least, some experts who are willing to judge batches of papers based on some fixed set of criteria.

Comment by Jess_Riedel on TAI Safety Bibliographic Database · 2020-12-23T15:43:15.200Z · EA · GW

Sure, sure, we tried doing both of these. But they were just taking way too long in terms of new papers surfaced per hour worked. (Hence me asking for things that are more efficient than looking at reference lists from review articles and emailing the orgs.) Following the correct (promising) citation trail also relies more heavily on technical expertise, which neither Angelica nor I have.

I would love to have some collaborators with expertise in the field to assist on the next version. As mentioned, I think it would make a good side project for a grad student, so feel free to nudge yours to contact us!

Comment by Jess_Riedel on TAI Safety Bibliographic Database · 2020-12-23T15:39:56.860Z · EA · GW

> for instance if you think Wong and Cohen should be dropped then about half of the DeepMind papers should be too since they're on almost identical topics and some are even follow-ups to the Wong paper).

Yea, I'm saying I would drop most of those too.

> I think focusing on motivation rather than results can also lead to problems, and perhaps contributes to organization bias (by relying on branding to asses motivation).

I agree this can contribute to organizational bias.

> I do agree that counterfactual impact is a good metric, i.e. you should be less excited about a paper that was likely to soon happen anyways; maybe that's what you're saying? But that doesn't have much to do with motivation.

Just to be clear: I'm using "motivation" here in the technical sense of "What distinguishes this topic for further examination out of the space of all possible topics?", i.e., "Is the topic unusually likely to lead to TAI safety results down the line?" (It's not anything to do with the author's altruism or whatever.)

I think what would best advance this conversation would be for you to propose alternative practical inclusion criteria which could be contrasted with the ones we've given.

Here's how I arrived at ours. The initial desiderata are:

  1. Criteria are not based on the importance/quality of the paper. (Too hard for us to assess.)

  2. Papers that are explicitly about TAI safety are included.

  3. Papers are not automatically included merely for being relevant to TAI safety. (There are way too many.)

  4. Criteria don't exclude papers merely for failure to mention TAI safety explicitly. (We want to find and support researchers working in institutions where that would be considered too weird.)

(The only desiderata that we could potentially drop are #2 or #4. #1 and #3 are absolutely crucial for keeping the workload manageable.)

So besides papers explicitly about TAI safety, what else can we include given the fact that we can't include everything relevant to safety? Papers that TAI safety researchers are unusually likely (relative to other researchers) to want to read, and papers that TAI safety donors will want to fund. To me, that means the papers that are building toward TAI safety results more than most papers are. That's what I'm trying to get across by "motivated".

Perhaps that is still too vague. I'm very interested in your alternative suggestions!

Comment by Jess_Riedel on TAI Safety Bibliographic Database · 2020-12-22T23:52:08.212Z · EA · GW

Thanks Jacob.  That last link is broken for me, but I think you mean this?

> You sort of acknowledge this already, but one bias in this list is that it's very tilted towards large organizations like DeepMind, CHAI, etc.

Well, it's biased toward safety organizations, not large organizations.  (Indeed, it seems to be biased toward small safety organizations over large ones, since they tend to reply to our emails!)  We get good coverage of small orgs like Ought, but you're right that we don't have a way to easily track individual unaffiliated safety researchers, and it's not fair.

I look forward to a glorious future where this database is so well known that all safety authors naturally send us a link to their work when it's released, but for now the best way we have of finding papers is (1) asking safety organizations for what they've produced and (2) taking references from review articles.  If you can suggest another option for getting more comprehensive coverage per hour of work, we'd be very interested to hear it (seriously!).

For what it's worth, the papers by Hendrycks are very borderline based on our inclusion criteria, and in fact if I were classifying them today I think I would not include them.  (Not because they're not high-quality work, but just because I think they still happen in a world where no research is motivated by the safety of transformative AI; maybe that's wrong?) For now I've added the papers you mention by Hendrycks, Wong, and Cohen to the database, but my guess is they get dropped for being too near-term-motivated when they get reviewed next year.

More generally, let me mention that we do want to recognize great work, but our higher priority is to (1) recognize work that is particularly relevant to TAI safety and (2) help donors assess safety organizations.

Thanks again!  I'm adding your 2019 review to the list.


Comment by Jess_Riedel on Quantum computing timelines · 2020-09-16T11:59:56.634Z · EA · GW

Jaime gave a great thorough explanation. My catch-phrase version: This is not a holistic Bayesian prediction. The confidence intervals come from bootstrapping (re-sampling) a fixed dataset, not summing over all possible future trajectories for reality.
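To make the distinction concrete, here is a minimal sketch of what bootstrap (re-sampling) confidence intervals are, using made-up illustrative numbers rather than the actual quantum-computing dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed dataset of measurements (made-up numbers,
# purely to illustrate the method).
data = np.array([1.2, 0.8, 1.5, 1.1, 0.9, 1.4, 1.0, 1.3])

# Bootstrap: re-sample the *fixed* dataset with replacement and
# recompute the statistic each time. The spread of these re-computed
# statistics is what the confidence interval reports -- it quantifies
# sampling variability in the data, not uncertainty over all possible
# future trajectories for reality.
boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"95% bootstrap CI for the mean: [{lo:.2f}, {hi:.2f}]")
```

The key point: the interval is conditional on the dataset and the model, so it is narrower than a holistic Bayesian credence would be.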

Comment by Jess_Riedel on Use resilience, instead of imprecision, to communicate uncertainty · 2020-07-19T18:30:27.139Z · EA · GW

I was curious about the origins of this concept in the EA community since I think it's correct, insightful, and I personally had first noticed it in conversation among people at Open Phil. On Twitter, @alter_ego_42 pointed out the existence of the Credal Resilience page in the "EA concepts" section of this website. That page cites

Skyrms, Brian. 1977. Resiliency, propensities, and causal necessity. The Journal of Philosophy 74(11): 704–713. [PDF]

which is the earliest thorough academic reference to this idea that I know of. With apologies to Greg, this seems like the appropriate place to post a couple comments on that paper so others don't have to trudge through it.

I didn't find Skyrms's critique of frequentism at the beginning, or his pseudo-formalization of resiliency on page 705 (see, for instance, the criticism "Some Remarks on the Concept of Resiliency" by Patrick Suppes in the very next article, pages 713–714), to be very insightful, so I recommend the time-pressed reader concentrate on

  • The bottom of p. 705 ("The concept of probabilistic resiliency is nicely illustrated...") to the top of p. 708 ("... well confirmed to its degree of instantial resiliency, as specified above...").
  • The middle of p. 712 ("The concept of resiliency has connections with...") to p. 713 (the end).

Skyrms quotes Savage (1954) as musing about the possibility of introducing "second-order probabilities". This is grounded in a relative-frequency intuition: when I say that there is a (first-order) probability p of X occurring but that I am uncertain, what I really mean is something like this: there is some objective physical process that generates X with (second-order) probability q, but I am uncertain about the details of that process (i.e., about what q is), so my value of p is obtained by integrating over some pdf f(q).
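Written out (my own gloss of the second-order picture, not Skyrms's or Savage's notation, using the symbols above):

```latex
p \;=\; \int_0^1 q \, f(q) \, dq
```

Resilience then corresponds to how concentrated f(q) is around p: a sharply peaked f(q) yields a credence that evidence will barely move, while a flat f(q) yields a fragile one.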

There is, naturally, a Bayesian version of the same idea: We shouldn't concern ourselves with a hypothetical giant (second-order) ensemble of models, each of which generates a hypothetical (first-order) ensemble of individual trials. Resilience about probabilities is best measured by our bets on how future evidence would change those probabilities, just as probabilities are best measured by our bets on future outcomes.

(Unfortunately, and unlike the case for standard credences, there seem to be multiple possible formulations depending on which sorts of evidence we are supposing: what I expect to learn in the actual future, what I could learn if I thought about it hard, what a superforecaster would say in my shoes, etc.)

Comment by Jess_Riedel on Notes on 'Atomic Obsession' (2009) · 2019-10-26T13:29:59.998Z · EA · GW

Were there a lot of new unknown or underappreciated facts in this book? From the summary, it sounds mostly like a reinterpretation of the standard history, which hinges on questions of historical determinism.

Comment by Jess_Riedel on What's Changing With the New Forum? · 2018-11-13T17:25:04.899Z · EA · GW

Consider changing the visual format a bit to better distinguish this forum from LW. They are almost indistinguishable right now, especially once you scroll down just a bit and the logo disappears.

Comment by Jess_Riedel on Announcing the 2017 donor lottery · 2017-12-18T04:18:24.818Z · EA · GW

Could you explain your first sentence? What risks are you talking about?

Also, how does one lottery up further if all the block sizes are $100k? Dividing it up into multiple blocks doesn't really work.

Comment by Jess_Riedel on Announcing the 2017 donor lottery · 2017-12-17T18:00:05.801Z · EA · GW

I'm curious about why blocks were chosen rather than just a single-lottery scheme, i.e., having all donors contribute to the same lottery, with a $100k backstop but no upper limit. The justification on the webpage is

> Multiple blocks ensure that there is no cap on the number of donors who may enter the lottery, while ensuring that the guarantor's liability is capped at the block size.

But of course we could satisfy this requirement with the single-lottery scheme. The single-lottery scheme also has the benefits that (1) the guarantor has significantly less risk since there's a much higher chance they need to pay nothing, especially once the popularity of donor lotteries is more stable, and (2) the "leverage" can get arbitrarily high rather than being capped at $100k divided by one's contribution. The main feature (benefit?) of the multi-block scheme is, as Carl says elsewhere in this thread, "the odds of payouts for other participants are unaffected by anyone's particular participation in this lottery design". But it's not clear to me why this non-interaction principle is better than allowing large leverage. Is it just that we want to be really careful about unintended incentives?
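For concreteness, here is a toy leverage calculation under the two schemes. All numbers are made up for illustration; only the $100k block size comes from the actual lottery.

```python
import math

contribution = 5_000
block_size = 100_000

# Block scheme: each block pays out exactly the block size, so a donor's
# leverage (payout / contribution) is capped at block_size / contribution.
p_win_block = contribution / block_size        # 0.05
leverage_block = block_size / contribution     # 20x, the cap

# Hypothetical single pooled lottery with, say, $400k of total donations
# (the $100k backstop only matters if the pool is smaller than that).
total_pool = max(400_000, 100_000)
p_win_single = contribution / total_pool       # 0.0125
leverage_single = total_pool / contribution    # 80x, grows with the pool

# Expected payout is identical ($5k) under both schemes; only the
# variance (i.e., the leverage) differs.
assert math.isclose(p_win_block * block_size, contribution)
assert math.isclose(p_win_single * total_pool, contribution)
print(f"block leverage: {leverage_block:.0f}x, "
      f"single-lottery leverage: {leverage_single:.0f}x")
```

Since expected value is the same either way, the choice between the schemes really is about variance structure and incentives, not about anyone's expected donation.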

Comment by Jess_Riedel on An argument for broad and inclusive "mindset-focused EA" · 2017-07-16T16:29:16.523Z · EA · GW

EAs seem pretty open to the idea of being big-tent with respect to key normative differences (animals, future people, etc). But total indifference to cause area seems too lax. What if I just want to improve my local neighborhood or family? Or my country club? At some point, it becomes silly.

It might be worth considering parallels with the Catholic Church and the Jesuits. The broader church is "high level", but the requirements for membership are far from trivial.

Comment by Jess_Riedel on Introducing the EA Funds · 2017-02-10T18:59:25.115Z · EA · GW

The list of donation recipients from Nick's DAF is here:

I don't believe there have been any write-ups or dollar amounts, except that the above list is ordered by donation size.

Comment by Jess_Riedel on Introducing the EA Funds · 2017-02-09T22:43:33.493Z · EA · GW

I am on the whole positive about this idea. Obviously, specialization is good, and creating dedicated fund managers to make donation decisions can be very beneficial. And it makes sense that the boundaries between these funds arise from normative differences between donors, while putting fund managers in charge of sorting out empirical questions about efficiency. This is just the natural extension of the original GiveWell concept to account for normative differences, and also to utilize some of the extra trust that some EAs will have for other people in the community that isn't shared by a lot of GiveWell's audience.

That said, I'm worried about principal-agent problems and transparency, and about CEA becoming an organization receiving monthly direct debits from the bank accounts of ten thousand people. Even if we assume that current CEA employees are incorruptible superhuman angels, giving CEA direct control of a firehose of cash makes it an attractive target for usurpers (in a way that it is not when it's merely making recommendations and doing outreach). These sorts of worries apply much less to GiveWell when it's donating to developing-world health charities than to CEA when it's donating to EA start-ups who are good friends with the staff.

Will EA Fund managers be committed to producing the sorts of detailed explanations and justifications we see from GiveWell and Open Phil, at least after adjusting for donation size? How will the conflicts of interest be managed and documented with such a tightly interlinked community?

What sorts of additional precautions will be taken to manage these risks, especially for the long term?

Comment by Jess_Riedel on Effective Altruism Prediction Registry · 2016-07-12T23:19:46.337Z · EA · GW

Update: the Good Judgment Project has just launched Good Judgment Open.

Comment by Jess_Riedel on How valuable is movement growth? · 2015-05-24T19:40:00.226Z · EA · GW

I mostly agree with this. No need to reinvent the wheel, and armchair theorizing is so tempting, while sorting through the literature can be painful. But I will say your reason #1 (the typical sociological research is of very poor quality) leads to a second effect: scouring the literature for the useful bits (of which I am sure there are plenty) is very difficult and time-consuming.

If we were talking about ending global poverty, we would not be postulating new models of economic development. Why should we demand any less empirical/academic rigor in the context of movement building?

I can tell you that when financial quants want to make money, they spend some time reading the academic literature on the market, but they are often very critical of its quality and usefulness for real-life decisions.

So what we really need are people to say "this particular topic was already addressed by this particular reference". Too often, the criticism of reinventing the wheel is "you should just read this vaguely defined body of work, most of which is inapplicable".

Comment by Jess_Riedel on Can we set up a system for international donation trading? · 2015-03-04T13:55:03.134Z · EA · GW

I am mildly worried that connecting strangers to make honor-system donation trades could lead to a dispute. There are going to be more and more new faces around if the various EA growth strategies this year pan out. The fact that donation trading has been going on smoothly until now means folks might get overly relaxed, and it only takes one publicized dispute to really do damage to the culture. Even if no one is outright dishonest, miscommunication could lead to someone thinking they have been wronged to the tune of thousands of dollars.

I don't think that communication between the donors, as Brian mentions, is fully satisfactory. Even if everyone promises to send receipts afterwards, you still have Byzantine generals' problems. One idea is that we find someone at CEA who, at the least, can be listed as an email contact to which two donors can send their agreement before they execute, just so there's a record and so the CEA person can point out any obvious confusion. I think this could be a very efficient use of CEA time, especially if it increases trust and therefore makes more trades possible.

Comment by Jess_Riedel on Help a Canadian give with a tax-deduction by swapping donations with them! · 2015-01-06T21:29:18.624Z · EA · GW

Howdy, I'm trying to make a donation to CEA of about $4,000 this month from Canada. Would be very glad to swap with you for AMF. If you're still up for this, please shoot me an email.

Comment by Jess_Riedel on Where are you giving and why? · 2014-12-12T13:57:03.995Z · EA · GW

Worth noting that it can still be worth posting to your personal blog if only to increase how many people see it.

Comment by Jess_Riedel on Open Thread · 2014-09-20T04:18:32.281Z · EA · GW

Very reasonable. Thanks Ryan.

Comment by Jess_Riedel on Introduce Yourself · 2014-09-20T04:17:32.282Z · EA · GW

Hey, I'm a postdoc in q. info (although more on the decoherence and q. foundations side of things). I'm interested to know more about where you're at and how you found out about LessWrong. Shoot me an email if you want. My address is my username without the underscore.

Comment by Jess_Riedel on Minor Updates · 2014-09-20T04:06:29.242Z · EA · GW

> I lean against creating multiple fora. Even if it was a good idea in the long run, I think that it's better to start with one forum so that it's easier to achieve a critical mass. It's no exaggeration to say that LW's Main/Discussion distinction was one of the most hated features of the site. I also think that fragmenting an online community and decreasing its usability are two of the most damaging things you can do to a budding community website.

This was interesting to me.

Here's one more idea to throw out there: Divide the posts into "major" and "minor" tags and then include a checkbox for signed-in users that says something like "filter for major posts" that would only show the important/major/fleshed-out posts. If you wanted to make sure the minor posts didn't get neglected by apathy, you could have that box become unchecked the next time the person visits. In order to maintain an impressive appearance, visitors who aren't signed in would see only the major posts.

This should significantly reduce the chance that minor posts are neglected (except by people who shouldn't or don't want to see them) and would be expandable to a more extensive tagging system in the future.

Comment by Jess_Riedel on Open Thread · 2014-09-20T03:26:47.912Z · EA · GW

Thanks for info Ryan. A couple of points:

(1) I don't think minor posts like "Here's an interesting article. Anyone have thoughts?" fit very well in the open thread. The open threads are kind of unruly, and it's hard to find anything in there. In particular, it's not clear when something new has been added.

One possibility is to create a second tier of posts which do not appear on the main page unless you specifically select it. Call it "minor posts" or "status updates" or whatever. (Didn't LessWrong have something like this?) These would have essentially no barrier to entry and could consist of a single link. However, the threaded comment sections would be a lot more useful than FB.

This is similar to Peter_Hurford and MichaelDickens and SoerenMind comments above.

(2) I've talked to at least a couple of other people who think EAs need a place to talk that's more casual in the specific sense that comments aren't saved for all eternity on the internet. (Or, at the very least, aren't indexed by search engines.) Right now there is a significant friction associated with the fact that each time you click submit you have to make sure you're comfortable with your name being attached to your comment forever.

It might make sense to combine (1) and (2) (or not).

Comment by Jess_Riedel on Open Thread · 2014-09-16T14:48:35.511Z · EA · GW

I'm still fuzzy on the relationship between the EA Facebook group and the EA forum. Are we supposed to move most or all the discussion that was going on in the FB group here? Will the FB group be shut down, and if not, what will it be used for?

I think the format of the forum will present a higher barrier to low-key discussion than the FB group, e.g. I'd guess people are much less likely to post a new EA-related article if they don't have much to add to it. This is primarily because the forum looks like a blog. Is FB-style posting encouraged?

If this has all been described somewhere, could someone point me toward it?

Also, what's the relationship between the EA forum and the EA hub?

Comment by Jess_Riedel on Where I'm giving and why: Will MacAskill · 2014-01-02T21:21:00.000Z · EA · GW

My impression is more that FHI is at the startup stage and CSER is simply an idea people have been kicking around. Whether or not you support CSER would depend on whether or not you think it's actually going to be instantiated. Am I confused?

Comment by Jess_Riedel on Where I'm giving and why: Will MacAskill · 2014-01-02T21:18:00.000Z · EA · GW

I think the claim, which I do not necessarily support, would be this: Many people give to multiple orgs as a way of selfishly benefiting themselves (by looking good and affiliating with many good causes), whereas a "good" EAer might spread their donation to multiple orgs as a way to (a) persuade the rest of the world to accomplish more good or (b) coordinate better with other EAs, a la the argument you link with Julia. (Whether or not there's a morally important distinction between the laymen and the EAer as things actually take place in the real world is a bit dubious. EA arguments might just be a way to show off how well you can abstractly justify your actions.)

Comment by Jess_Riedel on Where I'm giving and why: Will MacAskill · 2014-01-02T21:06:00.000Z · EA · GW

> They needn't be strangers. This has already happened in the UK EA community amongst EAs who met through 80,000 Hours and supported each other financially in the early training and internship stages of their earning to give careers.

Agreed, but if the funds are effectively restricted to people you know and can sort of trust, then the public registry loses most of its use. Just let it be known among your trusted circle that you have money that you'd be willing share for EA activities. This has the added benefit of not putting you in the awkward position of having to turn down less-trusted folks who request money.

Comment by Jess_Riedel on Where I'm giving and why: Will MacAskill · 2013-12-31T06:51:00.000Z · EA · GW

Will, are you saying that this fund would basically just be a registry? (As opposed to an actual central collection of money with some sort of manager.)

Do you really think people would just send money to 1st-world strangers (ii) on the promise that the recipient was training to earn to give? I have similar misgivings about (iv).