Posts

Do AI companies make their safety researchers sign a non-disparagement clause? 2022-09-05T13:40:48.981Z
Impact markets may incentivize predictably net-negative projects 2022-06-21T13:00:16.644Z
[Meta] Is it legitimate to ask people to upvote posts on this forum? 2021-06-29T07:42:57.439Z
Book review: Architects of Intelligence by Martin Ford (2018) 2020-08-11T17:24:16.833Z
ofer's Shortform 2020-02-19T06:53:16.647Z

Comments

Comment by ofer on I'm interviewing prolific AI safety researcher Richard Ngo (now at OpenAI and previously DeepMind). What should I ask him? · 2022-09-29T15:38:42.447Z · EA · GW

What are the upsides and downsides of doing AI governance research at an AI company, relative to doing it at a non-profit EA organization?

Comment by ofer on CEA Ops is now EV Ops · 2022-09-17T06:22:11.063Z · EA · GW

Reasonably determining whether an anthropogenic x-risk related intervention is net-positive or net-negative is often much more difficult[1] than identifying the intervention as potentially high-impact. With less than 2 minutes to think, one can usually do the latter but not the former. People in EA can easily be unconsciously optimizing for impact (which tends to be much easier and aligned with maximizing status & power) while believing they're optimizing for EV. Using the term "impact" to mean "EV" can exacerbate this problem.


  1. Due to an abundance of crucial considerations. ↩︎

Comment by ofer on CEA Ops is now EV Ops · 2022-09-14T18:27:45.128Z · EA · GW

I haven’t seen anything that makes me think that someone in EA doesn’t care about the sign of their impact

It's not about people not caring about the sign of their impact (~everyone in EA cares); it's about a tendency to behave in a way that is aligned with maximizing impact (rather than EV).

I’d certainly be interested in any evidence of that

Consider this interview with one of the largest funders in EA (the following is based on the transcript from the linked page):

Rob: "What might be distinctive about your approach that will allow you to find things that all the other groups haven’t already found or are going to find?"

[...]

SBF: But having gotten that out of the way, I think that being really willing to give significant amounts is a real piece of this. Being willing to give 100 million and not needing anything like certainty for that. We’re not in a position where we’re like, “If you want this level of funding, you better effectively have proof that what you’re going to do is great.” We’re happy to give a lot with not that much evidence and not that much conviction — if we think it’s, in expectation, great. Maybe it’s worth doing more research, but maybe it’s just worth going for. I think that is something where it’s a different style, it’s a different brand. And we, I think in general, are pretty comfortable going out on a limb for what seems like the right thing to do.

.

Rob Wiblin: OK, so with that out of the way, what’s a mistake you think at least some nontrivial fraction of people involved in effective altruism are making?

[...]

SBF: Then the last thing is thinking about grantmaking. This is definitely a philosophical difference that we have as a grantmaking organization. And I don’t know that we’re right on it, but I think it’s at least interesting how we think about it. Let’s say we evaluate a grant for 48 seconds. After 48 seconds, we have some probability distribution of how good it’s going to be, and it’s quite good in expected value terms. But we don’t understand it that well; there’s a lot of fundamental questions that we don’t know the answer to that would shift our view on this.

Then we think about it for 33 more seconds, and we’re like, “What might this probability distribution look like after 12 more hours of thinking?” And in 98% of those cases, we would still decide to fund it, but it might look materially different. We might have material concerns if we thought about it more, but we think they probably won’t be big enough that we would decide not to fund it.

Rob Wiblin: Save your time.

SBF: Right. You can spend that time, do that, or you could just say, “Great, you get the grant, because we already know where this is going to end up.” But you say that knowing that there are things you don’t know and could know that might give you reservations, that might turn out to make it a mistake. But from an expected value of impact perspective —

Rob Wiblin: It’s best just to go ahead.

SBF: Yeah, exactly. I think that’s another example of this, where being completely comfortable doing something that in retrospect is a little embarrassing. They’ll go, “Oh geez, you guys funded that. That was obviously dumb.” I’m like, “Yeah, you know, I don’t know.” That’s OK.

[...]

Rob Wiblin: Yeah. It’s so easy to get stuck in that case, where you are just unwilling to do anything that might turn out to be negative.

SBF: Exactly. And a lot of my response in those cases is like, “Look, I hear your concerns. I want you to tell me — in writing, right now — whether you think it is positive or negative expected value to take this action. And if you write down positive, then let’s do it. If you write down negative, then let’s talk about where that calculation’s coming from.” And maybe it will be right, but let’s at least remove the scenario where everyone agrees it’s a positive EV move, but people are concerned about some…

Notably, the FTX Foundation's regranting program "gave over 100 people access to discretionary budget" (and I'm not aware of them using a reasonable mechanism to resolve the obvious unilateralist's curse problem). One of the resulting grants was a $215,000 grant for creating an impact market. They wrote:

This regrant will support the creation of an “impact market.” The hope is to improve charity fundraising by allowing profit-motivated investors to earn returns by investing in charitable projects that are eventually deemed impactful.

A naive impact market is a mechanism that incentivizes people to carry out risky projects—that might turn out to be beneficial—while regarding potential harmful outcomes as if they were neutral. (The certificates of a project that ended up being harmful are worth as much as the certificates of a project that ended up being neutral, namely nothing.)

Comment by ofer on CEA Ops is now EV Ops · 2022-09-14T16:14:56.165Z · EA · GW

I strongly agree with this comment, except that I don't think this issue is minor.

IMO, this issue is related to a very troubling phenomenon that EA has seemingly been undergoing over the past few years: people in EA sometimes tend not to think much about their EV, and instead strive to have as much impact as possible. "Impact" is a sign-neutral term ("COVID-19 had a large impact on international travel"). It's very concerning that many people in EA now use it interchangeably with "EV", as if EA interventions in anthropogenic x-risk domains cannot possibly be harmful. One can call this phenomenon "sign neglect".

Having a major EA organization named "EV" (as an acronym for something that is not "expected value") may exacerbate this problem by further decreasing the usage of the term "EV", and making people use sign-neutral language instead.

Comment by ofer on How to disclose a new x-risk? · 2022-08-24T15:00:04.870Z · EA · GW

It's a very important question.

However, it probably doesn't make sense to keep this information to oneself since other people can begin to work on research and mitigation if they are aware of the risk.

I don't think this is always the case. In anthropogenic x-risk domains, it can be very hard to decrease the chance of an existential catastrophe from a certain technology, and very easy to inadvertently increase it (by drawing attention to an info hazard). Even if the researchers (within EA) are very successful, their work can easily be ignored by the relevant actors in the name of competitiveness ("our for-profit public-benefit company takes the risk much more seriously than the competitors, so it's better if we race full speed ahead", "regulating companies in this field would make China get that technology first", etc.).

(See also: The Vulnerable World Hypothesis.)

Comment by ofer on Announcing Encultured AI: Building a Video Game · 2022-08-18T22:01:12.651Z · EA · GW

Hi there!

Your website says:

Encultured AI is a for-profit video game company with a public benefit mission: to develop technologies promoting the long-term survival and flourishing of humanity and other sentient life.

Can you share any information about the board of directors, the investors, and governance mechanisms (if there are any) that aim to cause the company to make good decisions when facing conflicts between its financial goals and EA-aligned goals?

Comment by ofer on Risks from atomically precise manufacturing - Problem profile · 2022-08-09T20:03:16.370Z · EA · GW

Hi there!

There could be harms to engaging in work around atomically precise manufacturing. For example, if the technology would truly be harmful overall, then speeding up its development through raising interest in the topic could cause harm.

I agree. Was there a meta effort to evaluate whether the potential harms from publishing such an article ("written for an audience broadly unfamiliar with EA") outweigh the potential benefits?

Comment by ofer on Making Effective Altruism Enormous · 2022-07-25T18:50:18.382Z · EA · GW

I'm not sure what exactly we disagree on. I think we agree that it's extremely important to appreciate that [humans tend to behave in a way that is aligned with their local incentives] when considering meta interventions related to anthropogenic x-risks and EA.

Comment by ofer on Making Effective Altruism Enormous · 2022-07-25T15:38:29.744Z · EA · GW

groups of people aren't single agents

I agree. But that doesn't mean that the level of coordination and the influence of conflicts of interest in EA are not extremely important factors to consider/optimize.

deciding that the goal of a movement should be chosen even if it turns out that it is fundamentally incompatible with human, economic, and other motives leads to horrific things.

Can you explain this point further?

Comment by ofer on Making Effective Altruism Enormous · 2022-07-25T15:05:48.194Z · EA · GW

What does Effective Altruism look like if it is successful?

Ideally, a very well-coordinated group that acts like a single, rational, wise, EA-aligned agent. (Rather than a poorly coordinated set of individuals who compete for resources and status by unilaterally doing/publishing impressive, risky things related to anthropogenic x-risks, while being subject to severe conflicts of interest.)

Comment by ofer on Impact Markets: The Annoying Details · 2022-07-24T19:40:19.304Z · EA · GW

There's a reason society converged to loss-limited companies being the right thing to do, even though there is unlimited gain and limited downside, and that's that individuals tend to be far too risk averse.

I think the reason that states tend to allow loss-limited companies is that it causes them to have larger GDP (and thus all the good/adaptive things that are caused by having larger GDP). But loss-limited companies may be a bad thing from an EA perspective, considering that such companies may be financially incentivized to act in net-negative ways (e.g. exacerbating x-risks), especially in situations where lawmakers/regulators are lagging behind.

Comment by ofer on Impact Markets: The Annoying Details · 2022-07-22T15:48:23.670Z · EA · GW

In particular, we can let oraculars pay to mark projects as having been strongly net negative, and have this detract from the ability of those who funded that project to earn on their entire portfolio.

I think this approach has the following problems:

  1. Investors will still be risking only the total amount of money they invest in the market (or place as collateral), while their potential gain is unlimited.
  2. People tend to avoid doing things that directly financially harm other individuals. Therefore, I expect retro funders would usually not use their power to mark a project as "ex-ante net negative", even if it was a free action and the project was clearly ex-ante net negative (let alone if the retro funders need to spend money on doing it, and if it's very hard to judge whether the project was ex-ante net negative, which seems a much more common situation).

Comment by ofer on Impact Markets: The Annoying Details · 2022-07-20T17:48:34.966Z · EA · GW

It's not that hard to pick out highly risky projects retroactively, relative to identifying them prospectively.

Do you mean that, if a project ends up being harmful, we have Bayesian evidence that it was ex-ante highly risky? If so, I agree. But that fact does not alleviate the distribution mismatch problem, which is caused by the prospect of a risky project ending up going well.

Impact markets don't solve the problem of funders being able to fund harmful projects. But they don't make it differentially worse (it empowers funders generally, but I don't expect you would argue that grantmakers are net negative, so this still comes out net-positive).

If the distribution mismatch problem is not mitigated (and it seems hard to mitigate), investors are incentivized to fund high-stakes projects while regarding potential harmful outcomes as if they were neutral. (Including in anthropogenic x-risks and meta-EA domains.) That is not the case with EA funders today.

There are some other effects around cultural effects of making money flows more legible which seem possibly concerning, but I'm not super worried about negative EV projects being run.

I think this is a highly over-optimistic take about cranking up the profit-seeking lever in EA and the ability to mitigate the effects of Goodhart's law. It seems that when humans have an opportunity to make a lot of money (without breaking laws or norms) at the expense of some altruistic values, they usually behave in a way that is aligned with their local incentives (while convincing themselves it's also the altruistic thing to do).

I do think it makes sense to not rush into creating a decentralized unregulatable system on general principles of caution, as we certainly should watch the operation of a more controllable one for some time before moving towards that.

If you run a fully controlled (Web2) impact market for 6-12 months, and the market funds great projects/posts and there's no sign of trouble, will you then launch a decentralized impact market that no one can control (in which people can sell the impact of recruiting additional retro funders, and the impact of establishing that very market)?

Comment by ofer on Slowing down AI progress is an underexplored alignment strategy · 2022-07-18T08:02:07.201Z · EA · GW

It seems plausible for some useful regulation to take the form of industry self-regulation (which safety-concerned people at these companies could help advance).

Generally, I think self-regulation is usually promoted by industry actors in order to prevent actual regulation. Based on your username and a bit of internet research, you seem to be an AI Governance Research Contractor at a major AGI company. Is this correct? If so, I suggest that you disclose that affiliation on your profile bio (considering that you engage in the topic of AI regulation on this forum).

(To be clear, your comments here seem consistent with you acting in good faith and having the best intentions.)

Comment by ofer on Will Maskill Gave Up on Utilitarianism (and the rest of us are soon to follow) · 2022-07-17T18:41:17.945Z · EA · GW

I strong-downvoted because the post gave me the false impression that "Of course, I'm miserable, I'm a utilitarian" was a quote from Will (I haven't read the OP beyond that).

Comment by ofer on Slowing down AI progress is an underexplored alignment strategy · 2022-07-17T16:09:56.669Z · EA · GW

If AI progress slows down enough in countries where safety-concerned people are especially influential, then these countries (and their companies) will fall behind internationally in AI development. This would eliminate much/most of safety-concerned people's opportunities for impacting AI's trajectory.

There's a country-agnostic version of that argument about self-regulation: "If AGI companies in which safety-concerned people are especially influential allow safety concerns to slow down their progress towards AGI, then these companies will fall behind. This would eliminate much/most of safety-concerned people's opportunities for impacting AI's trajectory".

Therefore, without any regulation, it's not clear to what extent the presence of safety-concerned people in AGI companies will matter.

Comment by ofer on Impact Markets: The Annoying Details · 2022-07-16T17:25:11.705Z · EA · GW

It's not that hard to see when a project was at risk of having large downsides

I strongly disagree. It's often extremely hard to judge whether a project related to anthropogenic x-risks was ex-ante net-negative. For example, was the creation of OpenAI/Anthropic/CSET net-positive or net-negative (ex-ante)? How about any particular gain-of-function research effort, or the creation of any particular BSL-4 virology lab?

Given a past project that is related to anthropogenic x-risks or meta-EA, it can be extremely hard to evaluate the ex-ante potential harm that the project could have had by, for example:

  1. Potentially drawing attention to info hazards (e.g. certain exciting approaches for developing AGI).
    • If a researcher believes they came up with an impressive insight, they will probably be biased towards publishing it, even if it may draw attention to potentially dangerous information. Their career capital, future compensation and status may be on the line.
    • Here's Alexander Berger (co-CEO of OpenPhil):

      I think if you have the opposite perspective and think we live in a really vulnerable world — maybe an offense-biased world where it’s much easier to do great harm than to protect against it — I think that increasing attention to anthropogenic risks could be really dangerous in that world. Because I think not very many people, as we discussed, go around thinking about the vast future.

      If one in every 1,000 people who go around thinking about the vast future decide, “Wow, I would really hate for there to be a vast future; I would like to end it,” and if it’s just 1,000 times easier to end it than to stop it from being ended, that could be a really, really dangerous recipe where again, everybody’s well intentioned, we’re raising attention to these risks that we should reduce, but the increasing salience of it could have been net negative.

  2. Potentially "patching" a problem and preventing a non-catastrophic, highly-visible outcome that would have caused an astronomically beneficial "immune response". Here's Nick Bostrom ("lightly edited for readability"):

    Small and medium scale catastrophe prevention? Also looks good. So global catastrophic risks falling short of existential risk. Again, very difficult to know the sign of that. Here we are bracketing leverage at all, even just knowing whether we would want more or less, if we could get it for free, it’s non-obvious. On the one hand, small-scale catastrophes might create an immune response that makes us better, puts in place better safeguards, and stuff like that, that could protect us from the big stuff. If we’re thinking about medium-scale catastrophes that could cause civilizational collapse, large by ordinary standards but only medium-scale in comparison to existential catastrophes, which are large in this context, again, it is not totally obvious what the sign of that is: there’s a lot more work to be done to try to figure that out. If recovery looks very likely, you might then have guesses as to whether the recovered civilization would be more likely to avoid existential catastrophe having gone through this experience or not.

  3. Potentially causing decision makers to have a false sense of security.
    • For example, perhaps it's not feasible to solve AI alignment in a competitive way without strong coordination, etcetera. But researchers are biased towards saying good things about their field, their colleagues and their (potential) employers.
  4. Potentially accelerating progress in AI capabilities in a certain way.
  5. Potentially intensifying the competition dynamics among AI labs / states.
  6. Potentially decreasing the EV of the EA community by exacerbating bad incentives and conflicts of interest, and by reducing coordination.
    • For example, by creating impact markets.
  7. Potentially causing accidental harm via outreach campaigns or regulation advocacy (e.g. by causing people to get a bad first impression of something important).
  8. Potentially causing a catastrophic leak from a virology lab.

You wrote:

unless the early funders irrationally expect oraculars to buy up bad EV bets which paid off.

Depending on the implementation of the impact market, it may be rational to expect that many retro funders will buy the impact of ex-ante net-negative projects that ended up being beneficial. Especially if the impact market is decentralized and cannot be controlled by anyone, and if it allows people to profit from recruiting new retro funders who are not very careful. For more important arguments about this point, see the section "Mitigating the risk is hard" in our post.

As for "weak consensus", we have Scott Alexander, Paul Christiano, and Eliezer Yudkowsky coming down on the side of "Yes, retrofunding is great". I'm not sure how that could be seen as anything other than strong consensus of key thought leaders,

The statement "retrofunding is great" is very vague. AFAIK, none of the people you mentioned gave a blanket endorsement for all possible efforts to create an impact market (including a decentralized market that no one can control). There should be a consensus in EA about a specific potential intervention to create an impact market, before it is decided to carry out that intervention. Also, the EA community is large, so it's wrong to claim that there is a "strong consensus of key thought leaders" for doing something risky because a few brilliant, high-status people wrote positive things about it (and especially if they wrote those things before there was substantial discourse about the downside risks).

not to mention the many other people who've thought carefully about this and decided it is one of the most important interventions to improve the future.

Who are you referring to here?

Comment by ofer on Impact Markets: The Annoying Details · 2022-07-15T23:32:57.729Z · EA · GW

Ofer and Owen Cotton-Barratt have discussed this here. Their conclusion is that final oracular funders should take this into account when deciding who to grant impact certificates to.

Speaking only for myself, I don't see that as my conclusion. Here are some excerpts from our post:

even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial.

.

It seems especially important to prevent the risk from materializing in the domains of anthropogenic x-risks and meta-EA. Many projects in those domains can cause a lot of accidental harm because, for example, they can draw attention to info hazards, produce harmful outreach campaigns, produce dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, intensify competition dynamics among AI labs, etcetera.

.

Currently, launching impact markets seems to us (non-robustly) net-negative. The following types of impact markets seem especially concerning:

  • Decentralized impact markets (in which there are no accountable decision makers that can control or shut down the market).
  • [...]

.

In any case, launching an impact market should not be done without (weak) consensus among the EA community, in order to avoid the unilateralist's curse.

.

To avoid tricky conflicts of interest, work to establish impact markets should only ever be funded in forward-looking ways. Retro funders should commit to not buying impact of work that led to impact markets (at least work before the time when the incentivization of net-negative projects has been robustly cleared up, if it ever is). EA should socially disapprove of anyone who did work on impact markets trying to sell impact of that work.

You wrote:

but at least figuring out whether there were potential bad effects sounds like a somewhat easier problem than figuring out whether something will work.

I think the opposite may often be true. Quoting from our post:

Suppose that a risky project that is ex-ante net-negative ends up being beneficial. If retro funders attempt to evaluate it after it already ended up being beneficial, hindsight bias can easily cause them to overestimate its ex-ante EV. This phenomenon can make the certificates of net-negative projects more appealing to investors, already at an early stage of the project (before it is known whether the project will end up being beneficial or harmful).

You wrote:

Advantage: this is how we do everything else under capitalism, and it usually sort of works.

I'm not sure what "usually sort of works" means, from an EA perspective. Capitalism seems to be a contributing factor to x-risks from AI, for example.

This could also be handled at the market level, by refusing to list certificates/tokens in projects that were especially likely to be high negative-impact - although, again, this requires market officials to be smart, which is a strong requirement.

This solution also requires that the impact market not be a decentralized market that no one can control. It also requires sufficiently preventing the "market officials" from being influenced by conflicts of interest (e.g. in case they'll be able to sell the impact of their work to establish/control an impact market on that very market).

Conclusion

Impact markets may incentivize people to carry out net-negative projects in anthropogenic x-risk domains, using EA funding. Conditional on impact markets being astronomically influential, the risk of them incentivizing net-negative projects can cause the entire EA movement to be net-negative. It's not a minor issue that we should just keep in mind while moving ahead with impact markets. As an analogy, suppose the NIH prepared a long policy analysis document about how to fund more effective BSL-4 virology labs, with a section at the end titled "Appendix II: safety issues".

Comment by ofer on High-risk high-reward philanthropy: applying venture capital concepts to doing good · 2022-07-05T13:12:21.352Z · EA · GW

Hi there!

I think the OP uses the term "risk" to denote only potential outcomes in which an intervention ends up being neutral (and thus the money that was used to fund it ends up being functionally "wasted"). But in the domains of anthropogenic x-risks and meta-EA, many impactful interventions can easily end up being harmful because, for example, they can draw attention to info hazards, produce harmful outreach campaigns, produce dangerous experiments (e.g. in machine learning or virology), shorten AI timelines, intensify competition dynamics among AI labs, etcetera.

In the for-profit world, a limited liability company will generally not be worth less than nothing to its shareholders, even if it ends up causing a lot of harm. Relatedly, the "prospecting for gold" metaphor for EA-motivated hits-based giving is problematic, because it's impossible to find a negative amount of gold, while it is possible to accidentally increase the chance of an existential catastrophe.

Comment by ofer on Limits to Legibility · 2022-06-30T04:41:33.724Z · EA · GW

First, you increase the pressure on the "justification generator" to mask various black boxes by generating arguments supporting their conclusions.

.

Third, there's a risk that people get convinced based on bad arguments - because their "justification generator" generated a weak legible explanation, you managed to refute it, and they updated. The problem comes if this involves discarding the output of the neural network, which was much smarter than the reasoning they accepted.

On the other hand, if someone in EA is making decisions about high-stakes interventions while their judgement is being influenced by a subconscious optimization for things like status and power, I think it's probably beneficial to subject their "justification generator" to a lot of pressure (in the hope that that will cause them, and onlookers, to end up making the best decisions from an EA perspective).

Comment by ofer on Announcing Epoch: A research organization investigating the road to Transformative AI · 2022-06-29T05:51:36.263Z · EA · GW

Hey there!            

Can you describe your meta process for deciding what analyses to work on and how to communicate them? Analyses about the future development of transformative AI can be extremely beneficial (including via publishing them and getting many people more informed). But getting many people more hyped about scaling up ML models, for example, can also be counterproductive. Notably, The Economist article that you linked to shows your work under the title "The blessings of scale". (I'm not making here a claim that that particular article is net-negative; just that the meta process above is very important.)

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-27T19:00:14.946Z · EA · GW

First of all, what we’ve summarized as “curation” so far could really be distinguished as follows:

  1. Making access for issuers invite-only, maybe keeping the whole marketplace secret (in combination with #2) until we find someone who produces cool papers/articles and who we trust and then invite them.
  2. Making access for investors/retro funders invite-only, maybe keeping the whole marketplace secret (in combination with #1) until we find an impact investor or a retro funder who we trust and then invite them.
  3. Read every certificate either before or shortly after it is published. (In combination with exposé certificates in case we make a mistake.)

Let’s say #3 is a given. Do you think the marketplace would fulfill your safety requirements if only #1, only #2, or both were added to it?

An impact market with invite-only access for issuers and investors seems safer than otherwise. But will that be a temporary phase after which our civilization ends up with a decentralized impact market that nobody can control or shut down, and people are incentivized to recruit as many new retro funders as they can? In the Toward Impact Markets post (March 2022) you wrote:

We are fairly convinced that the blockchain-based solution is going to be the culmination of our efforts one day, but we’re ambivalent over which MVP will allow us to test the market more quickly and productively.

That came after the sentence "A web2 solution like that would have a few advantages too:", after which you listed three advantages that have nothing to do with safety.

But if you enact some security measures to keep them out, you quickly reach the point where the bazaar is less attractive than the alternatives. At that point you already have no effect anymore on how much theft there is going on in the world in aggregate.

I don't think the analogy works. Right now, there seem to be no large-scale retroactive funding mechanisms for anthropogenic x-risk interventions. Launching an impact market can change that. An issuer/investor/funder who uses your impact market would probably not have used Twitter or anything else to deal with retroactive funding if you had not launched your impact market. The distribution mismatch problem applies to those people. (In your analogy there's a dichotomy of good people vs. thieves, which has no clear counterpart in the domain of retroactive funding.) Also, if your success inspires others to launch/join competing impact markets, you can end up increasing the number of people who use the other markets.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-27T12:23:17.641Z · EA · GW

The thing I'm looking for is the comparison between the benefits and the costs; are the costs larger?

Efficient impact markets would allow anyone to create certificates for a project and then sell them for a price that corresponds to a very good prediction of their expected future value. Therefore, sufficiently efficient impact markets will probably fund some high EV projects that wouldn't otherwise be funded (because it's not easy for classical EA funders to evaluate them or even find them in the space of possible projects). If we look at that set of projects in isolation, we can regard it as the main upside of creating the impact market. The problem is that the market does not reliably distinguish between those high EV projects and net-negative projects, because a potential outcome that is extremely harmful affects the expected future value of the certificate as if the outcome were neutral.

Suppose X is a "random" project that has a substantial chance to prevent an existential catastrophe. If you believe that the EV of X is much smaller than the EV of X conditional on X not causing a harmful outcome, then you should be very skeptical about impact markets. Finally, we should consider that if a project is funded if and only if impact markets exist, then no classical EA funder would fund it in a world without impact markets, and thus it seems more likely than otherwise to be net-negative.

Sure, I buy that adverse selection can make things worse; my guess was that the hope was that classical EA funders would also operate thru the market.

(Even if all EA funders switched to operate solely as retro funders in impact markets, I think it would still be true that an intervention that gets funded by an impact market—and wouldn't get funded in a world without impact markets—seems more likely than otherwise to be net-negative.)

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-27T04:34:20.385Z · EA · GW

We would never submit our own certificates to a prize contest that we are judging, but we’d also be open to not submitting any of our impact market–related work to any other prize contests if that’s what consensus comes to.

Does this mean that you (the Impact Markets team) may sell certificates of your work to establish an impact market on that very impact market?

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-26T13:19:45.896Z · EA · GW

I do not endorse the text written by "Imagined Ofer" here. Rather than describing all the differences between that text and what I would really say, I've now published this reply to your first comment.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-26T13:11:40.336Z · EA · GW

Web3: Seems about as bad as any web2 solution that allows people to easily back up their data.

I think that a decentralized impact market that can't be controlled or shut down seems worse. Also, a Web3 platform will make it less effortful for someone to launch a competing platform (either with or without the certificates from the original platform).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-26T12:56:31.247Z · EA · GW

But abandoning the project of impact markets because of the downsides seems about as misguided to us as abandoning self-driving cars because of adversarial-example attacks on street signs.

I think the analogy would work better if self-driving cars did risky things that could cause a terrible accident, in order to prevent the battery from running out or to reach the destination sooner.

Attributed Impact may look complicated but we’ve just operationalized something that is intuitively obvious to most EAs – expectational consequentialism. (And moral trade and something broadly akin to UDT.)

I think the following concern (quoted from the OP) is still relevant here:

  • For that approach to succeed, retro funders must be familiar with it and be sufficiently willing and able to adhere to it. However, some potential retro funders are more likely to use a much simpler approach, such as "you should buy impact that you like".
    • Other things being equal, simpler approaches are easier to communicate, more appealing to potential retro funders, more prone to become a meme and a norm, and more likely to be advocated for by teams who work on impact markets and want to get more traction.

You later wrote:

We may sometimes have to explain why it sets bad incentives to fund projects that were net-negative in ex ante expectation to start, but the more sophisticated the funder is, the less likely it is that we need to expound on this.

Does your current plan not involve explaining to all the retro funders that they should consider the ex-ante EV as an upper bound?

We already can’t prevent anyone from becoming a retro funder. Anyone with money and a sizable Twitter following can reward people for any contributions that they so happen to want to reward them for – be it AI safety papers or how-tos for growing viruses.

I don't see how this argument works. Given that a naive impact market incentivizes people to treat extremely harmful outcomes as if they were neutral (when deciding what projects to do/fund), why should your above argument cause an update towards the view that launching a certain impact market is net-positive? How does the potential harm that other people can cause via Twitter etc. make launching a certain impact market a better idea than it would otherwise be?

We think that very few investors will put significant money into a project that is not clearly in line with what major retro funders already explicitly profess to want to retro-fund only because there may later be someone who does.

Why? Conditional on impact markets gaining a lot of traction and retro funders spending billions of dollars in impact markets 5 years from now, why wouldn't it make sense to buy many certificates of risky projects that might end up being extremely beneficial (according to at least one relevant future retro funder)?

An important safety mechanism that we have already started implementing is to reward solutions to problems with impact markets. A general ban on using such rewards would remove this promising mechanism.

Do you intend to allow people to profit from outreach interventions that attract new retro funders? (i.e. by allowing people to sell certificates of such outreach interventions.)

“A naive implementation of this idea would incentivize people to launch a safe project and later expand it to include high-risk high-reward interventions” – That would have to be a very naive implementation because if the actual project is different from the project certified in the certificate, then the certificate does not describe it. It’s a certificate for a different project that failed to happen.

I disagree. I think this risk can easily materialize if the description of the certificate is not very specific (and in particular if it's about starting an organization, without listing specific interventions).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-26T07:46:51.788Z · EA · GW

(3) declaring the impact certificates not burned and allowing people some time to export their data.

That could make it easier for another team to create a new impact market that will seamlessly replace the impact market that is being shut down.

My original idea from summer 2021 was to use blockchain technology simply for technical ease of implementation (I wouldn’t have had to write any code). That would’ve made the certs random tokens among millions of others on the blockchain. But then to set up a centralized, curated marketplace for them with a smart and EA curation team.

[…]

But what do you think about the original idea? I don’t think it's so different from a fully centralized solution where you allow people to export their data or at least not prevent them from copy-pasting their certs and ledgers to back them up.

If a decentralized impact market gains a lot of traction, I don't see how the certificates being "tokens among millions of others" helps. A particular curated gallery can end up being ignored by some/most market participants (and perhaps be outcompeted by another, less scrupulous curated gallery).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-26T06:42:51.516Z · EA · GW

[Limited liability] is a historically unusual policy (full liability came first), and seems to me to have basically the same downsides (people do risky things, profiting if they win and walking away if they lose), and basically the same upsides (according to the theory supporting LLCs, there's too little investment and support of novel projects).

Can you explain the "same upsides" part?

Can you say more about why you think this consideration is sufficient to be net negative? (I notice your post seems very 'do-no-harm' to me instead of 'here are the positive and negative effects, and we think the negative effects are larger', [...]

I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it's very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations). The EA community causes activities in anthropogenic x-risk domains, and it's extremely important that it differentially causes net-positive activities. This is something we should optimize for rather than regard as an axiom. Therefore, we should be very wary of funding mechanisms that incentivize people to treat extremely harmful outcomes as if they were neutral (when making decisions about doing/funding projects that are related to anthropogenic x-risks).

[EDIT: Also, interventions that are carried out if and only if impact markets fund them seem selected for being more likely than otherwise to be net-negative, because they are ones that no classical EA funder would fund.]

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-25T12:26:13.681Z · EA · GW

seems like the AF disagrees about this being a problem.. no?

(Not an important point [EDIT: meaning the text you are reading in these parentheses], but I don't think that a karma of 18 points is proof of that; maybe the people who took the time to go over that post and vote are mostly amateurs who found the topic interesting. Also, as an aside, if someone one day publishes a brilliant insight about how to develop AGI much faster, taking the post down can be net-negative due to the Streisand effect.)

I'm confident that almost all the alignment researchers on Earth will agree with the following statement: conditional on such a post having a transformative impact, it is plausible [EDIT: >10% credence] that the post will end up having an extremely harmful impact. [EDIT: "transformative impact" here means impact that is either extremely negative or extremely positive.] I argue that we should be very skeptical about potential funding mechanisms that incentivize people to treat "extremely harmful impact" here as if it were "neutral impact". A naive impact market is such a funding mechanism.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-25T08:21:14.305Z · EA · GW

I expect this will reduce the price at which OpenAI is traded

But an impact market can still make OpenAI's certificates worth $100M if, for example, investors have at least 10% credence in some future retro funder being willing to buy them for $1B (+interest). And that could be true even if everyone today believed that creating OpenAI is net-negative. See the "Mitigating the risk is hard" section in the OP for some additional reasons to be skeptical about such an approach.

I missed what you're replying to though. Is it the "The problem of funding net-negative projects exists also now." ?

Yes. You respond to examples of potential harm that impact markets can cause by pointing out that these things can happen even without impact markets. I don't see why these arguments should be more convincing than the flipped argument: "everything that impact markets can fund can already be funded in other ways, so we don't need impact markets". (Again, I'm not saying that the flipped argument makes sense.)

Your overall view seems to be something like: we should just create an impact market and if it causes harm then the retro funders will notice and stop buying certificates (or they will stop buying some particular certificates that are net-negative to buy). I disagree with this view because:

  1. There is a dire lack of feedback signal in the realm of x-risk mitigation. It's usually very hard to judge whether a given intervention was net-positive or net-negative. It's not just a matter of asking CEA / LW / anyone else what they think about a particular intervention, because usually no one on Earth can do a reliable, robust evaluation. (e.g. is the creation of OpenAI/Anthropic net positive or net negative?) So, if you buy the core argument in the OP (about how naive impact markets incentivize people to carry out interventions without considering potential outcomes that are extremely harmful), I think that you shouldn't create an impact market and rely on some unspecified future feedback signals to make retro funders stop buying certificates in a net-negative way at some unspecified point in the future.
  2. As I argued in the grandparent comment, we should expect the things that people in EA say about the impact of others in EA to be positively biased.

All the above assumes that by "retro funders" here you mean a set of carefully appointed Final Buyers. If instead we're talking about an impact market where anyone can become a retro funder, and retro funders can resell their impact to arbitrary future retro funders, I think things would go worse in expectation (see the first three points in the section "Mitigating the risk is hard" in the OP).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-25T08:14:39.912Z · EA · GW

It's just an example of how a post on the Alignment Forum can be net-negative and how it can be very hard to judge whether it's net-negative. For any net-negative intervention that impact markets would incentivize, if people can do it without funding, then the incentive to do impressive things can also cause them to carry out the intervention. In those cases, impact markets can make those interventions more likely to be carried out.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-23T22:43:12.076Z · EA · GW

I think that most interventions that have a substantial chance to prevent an existential catastrophe also have a substantial chance to cause an existential catastrophe, such that it's very hard to judge whether they are net-positive or net-negative (due to complex cluelessness dynamics that are caused by many known and unknown crucial considerations).

My model of you would say either that:

  1. funding those particular posts is net bad, or
  2. funding those two posts in particular may be net good, but it sets a precedent that will cause there to be further counterfactual AI safety posts on EA Forum due to retroactive funding, which is net bad, or
  3. posts on the EA Forum/LW/Alignment Forum being further incentivized would be net good (minus stuff such as infohazards, etc), but a more mature impact market at scale risks funding the next OpenAI or other such capabilities project, therefore it's not worth retroactively funding forum posts if it risks causing that.

My best guess is that those particular two posts are net-positive (I haven't read them entirely / at all). Of course, this does not imply that it's net-positive to use these posts in a way that leads to the creation of an impact market.

In (3) you wrote "posts on the EA Forum/LW/Alignment Forum […] (minus stuff such as infohazards, etc)". I think this description essentially assumes the problem away. Posts are merely information in a written form, so if you exclude all the posts that contain harmful information (i.e. info hazards), the remaining posts are by definition not net-negative. The hard part is to tell which posts are net-negative. (Or more generally, which interventions/projects are net-negative.)

My model of you says this certificate is net-negative. I would agree that it may be an example of the sort of situation where some people believe a project is a positive externality and some believe it's a negative externality, but the mismatch distribution means it's valuated positively by a marketplace that can observe the presence of information but not its absence. Or maybe the market thinks riskier stuff may win the confidence game. 'Variance is sexy'. This is a very provisional thought and not anything I would clearly endorse;

The distribution mismatch problem is not caused by different people judging the EV differently. It would be relevant even if everyone in the world were in the same epistemic state. The problem is that if a project ends up being extremely harmful, its certificates end up being worth $0, same as if it ended up being neutral. Therefore, when market participants who follow their local financial incentives evaluate a project, they treat potential outcomes that are extremely harmful as if they were neutral. I'm happy to discuss this point further if you don't agree with it. It's the core argument in the OP, so I want to first reach an agreement about it before discussing possible courses of action.
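
To make this concrete, here is a minimal sketch in Python with purely illustrative probabilities and dollar amounts (not taken from any real project): a project with negative true EV can still have certificates with positive expected value, because the harmful branch is floored at $0.

# Illustrative numbers only: (probability, social value, certificate payout).
# A harmful outcome leaves the certificate worth $0, same as a neutral outcome.
outcomes = [
    (0.10, 10_000_000, 10_000_000),   # ends up very beneficial
    (0.60, 0, 0),                     # ends up neutral
    (0.30, -20_000_000, 0),           # ends up very harmful; certificate still worth $0
]

true_ev = sum(p * value for p, value, _ in outcomes)        # -5,000,000
cert_value = sum(p * payout for p, _, payout in outcomes)   # +1,000,000
print(f"True EV: {true_ev:+,.0f}  Expected certificate value: {cert_value:+,.0f}")

A market participant who prices the certificate at its expected payout would therefore treat this net-negative project as a +$1M opportunity.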

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-23T14:14:21.723Z · EA · GW

If someone wants to advance AI capabilities, they can already get prospective funding by opening a regular for-profit startup.

No?

Right. But without an impact market it can be impossible to profit from, say, publishing a post with a potentially transformative insight about AGI development. (See this post as a probably-harmless version of the type of posts I'm talking about here.)

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-23T13:37:57.141Z · EA · GW

If someone thinks a net-negative project is being traded on (or run at all), how about posting about it on the forum?

As we wrote in the post, even if everyone believes that a certain project is net-negative, its certificates may be traded for a high price due to the chance that the project will end up being beneficial. For example, consider OpenAI (I'm not making here a claim that OpenAI is net-negative, but it seems that many people in EA think it is, and for the sake of this example let's imagine that everyone in EA thinks that). It's plausible that OpenAI will end up being extremely beneficial. Therefore, if a naive impact market had existed when OpenAI was created, it's likely that the market would have helped in funding its creation (i.e. OpenAI's certificates would have been traded for a high price).

Also, it seems that people in EA (and in academia/industry in general) usually avoid saying bad things publicly about others' work (due to reasons that are hard to nullify). Another point to consider is that saying that a project is net-negative publicly can sometimes in itself be net-negative due to drawing attention to info hazards. (e.g. "The experiment that Alice is working on is dangerous!")

The problem of funding net-negative projects exists also now.

As I already wrote in a reply to Austin, impact markets can incentivize/fund net-negative projects that are not currently of interest to for-profit investors. For example, today it can be impossible for someone to make a huge amount of money by launching an aggressive outreach campaign to make people join EA, or by publishing a list of "the most dangerous ongoing experiments in virology that we should advocate to stop"; these are interventions that may be net-negative. (Also, in cases where both impact markets and existing mechanisms incentivize a project, one can flip your argument and say that the solution to funding net-positive projects already exists, so we don't need impact markets. To be clear, I'm not making that argument; I'm just trying to show that the original one is wrong.)

This is a kind of project that we can stop or change if we want to. There is a lot of human discretion. This is not like adding a government regulation that will be very hard to change, or launching a blockchain that you can pretty much never take back no matter what you do.

Shutting down an impact market, if successful, functionally means burning all the certificates that are owned by the market participants, who may have already spent a lot of resources and time in the hope of profiting from selling their certificates in the future. Obviously, that may not be an easy action for the decision makers to take. Also, if the decision makers have conflicts of interest with respect to shutting down the market, things are even more problematic (which is an important topic that is discussed in the post). [EDIT: Also, my understanding is that there was (and perhaps still is) an intention to launch a decentralized impact market (i.e. Web3 based), which can be impossible to shut down.]

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-23T12:11:16.085Z · EA · GW

I think that it's more likely to be the result of an effort to mitigate potential harm from future pandemics. One piece of evidence that supports this is the grant proposal, which was rejected by DARPA, that is described in this New Yorker article. The grant proposal was co-submitted by the president of the EcoHealth Alliance, a non-profit which is "dedicated to mitigating the emergence of infectious diseases", according to the article.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-23T09:47:48.870Z · EA · GW

I find it hard to believe that any version of the lab leak theory involved all the main actors scrupulously doing what they thought was best for the world.

I don't find it hard to believe at all. Conditional on a lab leak, I'm pretty confident no one involved was consciously thinking: "if we do this experiment it can end up causing a horrible pandemic, but on the other hand we can get a lot of citations."

Dangerous experiments in virology are probably usually done in a way that involves a substantial amount of effort to prevent accidental harm. It's not obvious that virologists who are working on dangerous experiments tend to behave much less scrupulously than people in EA who are working for Anthropic, for example. (I'm not making here a claim that such virologists or such people in EA are doing net-negative things.)

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-22T18:59:20.270Z · EA · GW

Unless ~several people in EA had an opportunity to talk to that billionaire, I don't think this is an example of the unilateralist's curse (regardless of whether it was net negative for you to talk to them).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-22T18:55:14.802Z · EA · GW

Is there any real-world evidence of the unilateralist's curse being realised?

If COVID-19 is a result of a lab leak that occurred while conducting a certain type of experiment (for the purpose of preventing future pandemics), perhaps many people considered conducting/funding such experiments and almost all of them decided not to.

My sense historically is that this sort of reasoning to date has been almost entirely hypothetical

I think we should be careful with arguments that such and such existential risk factor is entirely hypothetical. Causal chains that end in an existential catastrophe are entirely hypothetical and our goal is to keep them that way.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-22T12:26:09.922Z · EA · GW

I messed up when writing that comment (see the EDIT block).

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-22T00:09:29.351Z · EA · GW

I think people following local financial incentives is always going to happen, and the point of an impact market is to structure financial incentives to be aligned with what the EA community broadly thinks is good.

It may be useful to think about it this way: Suppose an impact market is launched (without any safety mechanisms) and $10M of EA funding is pledged to be used for buying certificates as final buyers 5 years from now. No other final buyers join the market. The creation of the market causes some set of projects X to be funded and some other set of projects Y to not get funded (due to the opportunity cost of those $10M). We should ask: is [the EV of X minus the EV of Y] positive or negative? I tentatively think it's negative. The projects in Y would have been judged by the funder to have positive ex-ante EV, while the projects in X got funded because they had a chance of ending up with a high ex-post EV.

Also, I think complex cluelessness is a common phenomenon in the realms of anthropogenic x-risks and meta-EA. Interventions that have a substantial chance of preventing existential catastrophes usually seem to have an EV that is much closer to 0 than we would otherwise think, because they also have some chance of causing an existential catastrophe. Therefore, the EV of Y seems much closer to 0 than the EV of X (assuming that the EV of X is not 0).

[EDIT: adding the text below.]

Sorry, I messed up when writing this comment (I wrote it at 03:00 am...). Firstly, I confused X and Y in the sentence that I now crossed out. But more fundamentally: I tentatively think that the EV of X is negative (rather than positive but smaller than the EV of Y), because the projects in X are ones that no funder in EA decides to fund (in a world without impact markets). Therefore, letting an impact market fund a project in X seems even worse than falling into the regular unilateralist's curse, because here there need not be even a single person who thinks that the project is (ex-ante) a good idea.
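As a rough illustration of the kind of project that can end up in X (all numbers below are made up): a certificate's price reflects only the project's potential upside, because a certificate can never be worth less than zero, so a project with a large downside risk can attract investors even though its ex-ante EV is negative.

```python
# Toy illustration with hypothetical numbers: a project that no ex-ante funder
# would fund, but that a naive impact market may still end up funding.
p_good = 0.1                 # chance the project ends up looking very beneficial
value_if_good = 1_000_000    # ex-post value in the good case
value_if_bad = -2_000_000    # ex-post value in the bad case (real harm done)

# Ex-ante EV, as a funder would evaluate it (downside included):
ex_ante_ev = p_good * value_if_good + (1 - p_good) * value_if_bad
print(ex_ante_ev)  # -1700000.0 -> a funder judging ex-ante EV would not fund this

# Expected certificate value in a naive impact market (downside floored at $0):
certificate_value = p_good * value_if_good + (1 - p_good) * 0
print(certificate_value)  # 100000.0 -> profit-seeking investors may still fund it
```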

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-21T20:25:33.098Z · EA · GW

or setting up a system to short sell different projects.

I don't think that short selling would work. Suppose a net-negative project has a 10% chance of ending up being beneficial, in which case its certificates will be worth $1M (and otherwise the certificates will end up being worth $0). Therefore, the certificates are worth $100K in expectation today. If someone shorts the certificates as if they were worth less than that, they will lose money in expectation.
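A minimal sketch of that arithmetic, using the same hypothetical numbers:

```python
# Expected certificate value of a net-negative project with a 10% chance of
# ending up looking beneficial (hypothetical numbers from the example above).
p_beneficial = 0.1
value_if_beneficial = 1_000_000  # certificate value if the project looks good ex post
value_otherwise = 0              # a certificate never trades below zero

expected_value = p_beneficial * value_if_beneficial + (1 - p_beneficial) * value_otherwise
print(expected_value)  # 100000.0

# A short seller who sells at any price below the expected value loses money in expectation.
short_price = 80_000
expected_profit = short_price - expected_value
print(expected_profit)  # -20000.0
```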

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-21T17:14:17.756Z · EA · GW

Furthermore, people looking to make money are already funding net negative companies due to essentially the same problems (companies have non-negative evaluations), so shifting them towards impact markets could be good, if impact markets have better projects than existing markets on average.

See my reply to Austin.

Comment by ofer on Impact markets may incentivize predictably net-negative projects · 2022-06-21T17:07:10.927Z · EA · GW

Hm, naively - is this any different than the risks of net-negative projects in the for-profit startup funding markets? If not, I don't think this a unique reason to avoid impact markets.

Impact markets can incentivize/fund net-negative projects that are not currently of interest to for-profit investors. For example, today it can be impossible for someone to make a huge amount of money by launching an aggressive outreach campaign to make people join EA, or by publishing a list of "the most dangerous ongoing experiments in virology that we should advocate to stop"; these are interventions that may be net-negative. (Also, in cases where both impact markets and classical for-profit investors incentivize a project, one can flip your statement and say that there's no unique reason to launch impact markets; I'm not sure that "uniqueness" is the right thing to look at.)

Finally: on a meta level, the amount of risk you're willing to spend on trying new funding mechanisms with potential downsides should basically be proportional to the amount of risk you see in our society at the moment. Basically, if you think existing funding mechanisms are doing a good job, and we're likely to get through the hinge of history safely, then new mechanisms are to be avoided and we want to stay the course. (That's not my current read of our xrisk situation, but would love to be convinced otherwise!)

[EDIT: removed unnecessary text.] I tentatively think that launching impact markets seems worse than a "random" change to the world's trajectory. Conditional on an existential catastrophe occurring, I think there's a substantial chance that the catastrophe will be caused by individuals who followed their local financial incentives. We should be cautious about pushing the world (and EA especially) further towards "big things happen due to individuals following their local financial incentives" dynamics.

Comment by ofer on Expected ethical value of a career in AI safety · 2022-06-15T07:11:49.891Z · EA · GW

I added an EDIT block to the first paragraph after quoting you (I had misinterpreted your sentence).

Comment by ofer on Expected ethical value of a career in AI safety · 2022-06-14T18:01:26.070Z · EA · GW

Hey there!

the AI safety research seems unlikely to have strong enough negative unexpected consequences to outweigh the positive ones in expectation.

The word "unexpected" sort of makes that sentence trivially true. If we remove it, I'm not sure the sentence is true. [EDIT: while writing this I misinterpreted the sentence as: "AI safety research seems unlikely to end up causing more harm than good"] Some of the things to consider (written quickly, plausibly contains errors, not a complete list):

  • The AIS field (and the competition between AIS researchers) can give decision makers a false sense of safety. It may not be feasible to solve AI alignment in a competitive setting without strong coordination, etc. But researchers are biased towards saying good things about the field, their colleagues, and their (potential) employers. AIS researchers can also make people more inclined to pursue capabilities research (which can contribute to race dynamics). Here's Alexander Berger:

[Michael Nielsen] has tweeted about how he thinks one of the biggest impacts of EA concerns with AI x-risk was to cause the creation of DeepMind and OpenAI, and to accelerate overall AI progress. I’m not saying that he’s necessarily right, and I’m not saying that that is clearly bad from an existential risk perspective, I’m just saying that strikes me as a way in which well-meaning increasing salience and awareness of risks could have turned out to be harmful in a way that has not been… I haven’t seen that get a lot of grappling or attention from the EA community. I think you could tell obvious parallels around how talking a lot about biorisk could turn out to be a really bad idea.

And here's the CEO of Conjecture (59:50) [EDIT: this is from 2020, probably before Conjecture was created]:

If you're a really good machine learning engineer, consider working for OpenAI, consider working for DeepMind, or someone else with good safety teams.

  • AIS work can "patch" small-scale problems that might otherwise make our civilization better at avoiding some existential catastrophes. Here's Nick Bostrom:

On the one hand, small-scale catastrophes might create an immune response that makes us better, puts in place better safeguards, and stuff like that, that could protect us from the big stuff. If we’re thinking about medium-scale catastrophes that could cause civilizational collapse, large by ordinary standards but only medium-scale in comparison to existential catastrophes, which are large in this context, again, it is not totally obvious what the sign of that is: there’s a lot more work to be done to try to figure that out. If recovery looks very likely, you might then have guesses as to whether the recovered civilization would be more likely to avoid existential catastrophe having gone through this experience or not.

  • The AIS field (and the competition between AIS researchers) can cause the dissemination of info hazards. If a researcher thinks they have come up with an impressive insight, they will probably be biased towards publishing it, even if it may draw attention to potentially dangerous information. Their career capital, future compensation, and status may be on the line. Here's Alexander Berger again:

I think if you have the opposite perspective and think we live in a really vulnerable world — maybe an offense-biased world where it’s much easier to do great harm than to protect against it — I think that increasing attention to anthropogenic risks could be really dangerous in that world. Because I think not very many people, as we discussed, go around thinking about the vast future. If one in every 1,000 people who go around thinking about the vast future decide, “Wow, I would really hate for there to be a vast future; I would like to end it,” and if it’s just 1,000 times easier to end it than to stop it from being ended, that could be a really, really dangerous recipe where again, everybody’s well intentioned, we’re raising attention to these risks that we should reduce, but the increasing salience of it could have been net negative.

Comment by ofer on Unflattering reasons why I'm attracted to EA · 2022-06-03T18:40:24.441Z · EA · GW

If we shame each other for using our EA activities to make friends, find mates, raise status, make a living, or feel good about ourselves, we undermine EA.

This seems plausible. On the other hand, it may be important to be nuanced here. In the realms of anthropogenic x-risks and meta-EA, it is often very hard to judge whether a given intervention is net-positive or net-negative. Conflicts of interest can cause people to be less likely to make good decisions from an EA perspective.

Comment by ofer on Experiment in Retroactive Funding: An EA Forum Prize Contest · 2022-06-02T17:29:38.350Z · EA · GW

In the original EA Forum Prize, the ex-post EV at the time of evaluation is usually similar to the ex-ante EV, assuming that the evaluation happens shortly after the post was written. (In a naive impact market, the price of a certificate can be high due to the chance that, 3 years from now, its ex-post EV will be extremely high.)

Comment by ofer on Experiment in Retroactive Funding: An EA Forum Prize Contest · 2022-06-02T16:36:11.905Z · EA · GW

The original EA Forum Prize does not seem to have had the distribution mismatch problem; the posts were presumably evaluated based on their ex-ante EV (or something like that?).

Comment by ofer on Experiment in Retroactive Funding: An EA Forum Prize Contest · 2022-06-02T16:31:21.594Z · EA · GW

Thanks for the info!

If the shareholders of the public benefit corporation will be able to receive dividends, I think there's a conflict of interest problem with this setup. The Impact Markets team will probably need to make high-stakes decisions under great uncertainty. (E.g. should an impact market be launched? Should the impact market be decentralized? Should a certain person be invited to serve as a retro funder? How to navigate the tradeoff between explaining the safety rules thoroughly and writing more engaging posts that are more conducive to gaining traction?) It's a big conflict of interest problem if the decision makers can end up making a lot of money via a (future) impact market due to making certain decisions.

Therefore, I think it's better to commit to "consume/open" (i.e. never sell) the certificates that you purchase with the grant.