Key Papers in Language Model Safety 2022-06-20T14:59:17.539Z
Yudkowsky and Christiano on AI Takeoff Speeds [LINKPOST] 2022-04-05T00:57:29.048Z
[Linkpost] Millions face starvation in Afghanistan 2021-12-04T23:46:48.001Z
Plan Your Career on Paper 2021-09-23T15:04:46.139Z
aogara's Shortform 2021-01-19T01:20:48.155Z
Best resources for introducing longtermism and AI risk? 2020-07-16T17:27:29.533Z
How to find good 1-1 conversations at EAGx Virtual 2020-06-12T16:30:26.280Z


Comment by aogara (Aidan O'Gara) on Some unfun lessons I learned as a junior grantmaker · 2022-06-25T02:37:45.290Z · EA · GW

This is a great set of guidelines for integrity. Hopefully more grantmakers and other key individuals will take this point of view.

I’d still be interested in hearing how the existing level of COIs affects your judgement of EA epistemics. I think your motivated reasoning critique of EA is the strongest argument that current EA priorities do not accurately represent the most impactful causes available. I still think EA is the best bet available for maximizing my expected impact, but I have baseline uncertainty that many EA beliefs might be incorrect because they’re the result of imperfect processes with plenty of biases and failure modes. It’s a very hard topic to discuss, but I think it’s worth exploring (a) how to limit our epistemic risks and (b) how to discount our reasoning in light of those risks.

Comment by aogara (Aidan O'Gara) on How to become more agentic, by GPT-EA-Forum-v1 · 2022-06-20T21:25:34.854Z · EA · GW

This advice totally applies here:

Good luck with your projects, hope you’re feeling better soon.

Comment by aogara (Aidan O'Gara) on How to become more agentic, by GPT-EA-Forum-v1 · 2022-06-20T09:16:35.555Z · EA · GW

This model's performance is really impressive, and I'm glad you're interested in large language models. But I share some of Gavin's concerns, and I think it would be a great use of your time to write up a full theory of impact for this project. You could share it, get feedback, and think about how to make the project as impactful as possible while reducing risks of harm.

One popular argument about short-term risks from advanced AI concerns AI persuasion. Beth Barnes has a great writeup, as does Daniel Kokotajlo. The most succinct case I can make is that the internet is already full of bots, they spread all kinds of harmful misinformation, they reduce trust and increase divisiveness, and we shouldn't be playing around with more advanced bots without seriously considering the possible consequences.

I don't think anybody would argue that this project is literally an existential threat to humanity, but that shouldn't be the bar. Just as much as you need the technical skills of LLM training and the creativity and drive to pursue your ideas, you need to be able to faithfully and diligently evaluate the impact of your projects. I haven't thought about it nearly enough to say the final word on the project's impact, but before you keep publishing results, I would suggest spending some time thinking and writing about your impact.

Comment by aogara (Aidan O'Gara) on A Critical Review of Open Philanthropy’s Bet On Criminal Justice Reform · 2022-06-19T02:10:32.772Z · EA · GW

I understood it as the combination of the 100x Multiplier discussed by Will MacAskill in Doing Good Better (referring to the idea that cash is 100x more valuable for somebody in extreme poverty than for someone in the global top 1%), and GiveWell's current bar for funding set at 8x GiveDirectly. This would mean that Open Philanthropy targets donation opportunities that are at least 800x (or more like 1000x on average) more impactful than giving that money to a rich person. 
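To make the arithmetic explicit, a toy sketch (both inputs are just the rough figures cited above, not precise estimates):

```python
# Rough combination of the two multipliers described above.
cash_multiplier = 100  # value of $1 to someone in extreme poverty vs. the global top 1%
givewell_bar = 8       # GiveWell's funding bar, in multiples of GiveDirectly cash transfers

combined = cash_multiplier * givewell_bar
print(combined)  # 800 -- the "at least 800x" floor for Open Philanthropy's targets
```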

Comment by aogara (Aidan O'Gara) on Blake Richards on Why he is Skeptical of Existential Risk from AI · 2022-06-15T05:55:19.858Z · EA · GW

Here are two quotes you might disagree with. If true, they would make us slightly more skeptical of x-risk from AI, though they don't counter the entire argument.

Richards argues that lack of generality will make recursive self-improvement more difficult:

I’m less concerned about the singularity because if I have an AI system that’s really good at coding, I’m not convinced that it’s going to be good at other things. And so it’s not the case that if it produces a new AI system, that’s even better at coding, that that new system is now going to be better at other things. And that you get this runaway train of the singularity. 

Instead, what I can imagine is that you have an AI that’s really good at writing code, it generates other AI that might be good at other things. And if it generates another AI that’s really good at code, that new one is just going to be that: an AI that’s good at writing code. And maybe we can… So to some extent, we can keep getting better and better and better at producing AI systems with the help of AI systems. But a runaway train of a singularity is not something that concerns me...

The problem with that argument is that the claim is that the smarter version of itself is going to be just smarter across the board. Right? And so that’s where I get off the train. I’m like, “No, no, no, no. It’s going to be better at say programming or better at protein folding or better at causal reasoning. That doesn’t mean it’s going to be better at everything.”

He also argues that lack of generality will also make deception more difficult: 

One of the other key things for the singularity argument that I don’t buy, is that you would have an AI that then also knows how to avoid people’s potential control over it. Right? Because again, I think you’d have to create an AI that specializes in that. Or alternatively, if you’ve got the master AI that programs other AIs, it would somehow also have to have some knowledge of how to manipulate people and avoid their powers over it. Again, if it’s really good at programming, I don’t think it’s going to be able to be particularly good at manipulating people. 

These arguments at least indicate that generality is a risk factor for AI x-risk.  Forecasting whether superintelligent systems will be general or narrow seems more difficult but not impossible. Language models have already shown strong potential for both writing code and persuasion, which is a strike in favor of generality. Ditto for Gato's success across multiple domains (EDIT: Or is it? See below). More outside view arguments about the benefits or costs of using the one model for many different tasks seem mixed and don't sway my opinion much.  Curious to hear other considerations. 

Very glad to see this interview and the broader series. Engaging with more ML researchers seems like a good way to popularize AI safety and learn something in the process. 

Comment by aogara (Aidan O'Gara) on Expected ethical value of a career in AI safety · 2022-06-14T18:05:47.424Z · EA · GW

Also, low quality research or poor discussion can make it less likely that important decision makers will take AI safety seriously.

Comment by aogara (Aidan O'Gara) on Expected ethical value of a career in AI safety · 2022-06-14T17:57:36.097Z · EA · GW

Nice! I really like this analysis, particularly the opportunity to see how many present-day lives would be saved in expectation. I mostly agree with it, but two small disagreements:

First, I’d say that there are already more than 100 people working directly on AI safety, making that an unreasonable lower bound for the number of people working on it over the next 20 years. This would include most of the staff of Anthropic, Redwood, MIRI, Cohere, and CHAI; many people at OpenAI, Deepmind, CSET, and FHI; and various individuals at Berkeley, NYU, Cornell, Harvard, MIT, and elsewhere. There’s also tons of funding and field-building going on right now which should increase future contributions. This is a perennial question that deserves a more detailed analysis than this comment, but here are some sources that might be useful:

Ben Todd would guess it’s about 100 people, so maybe my estimate was wrong:

Second, I strongly believe that most of the impact in AI safety will come from a handful of the most impactful individuals. Moreover I think it’s reasonable to make guesses about where you’ll fall in that distribution. For example, somebody with a history of published research who can get into a top PhD program has a much higher expected impact than somebody who doesn’t have strong career capital to leverage for AI safety. The question of whether you could become one of the most successful people in your field might be the most important component of personal fit and could plausibly dominate considerations of scale and neglectedness in an impact analysis.

For more analysis of the heavy-tailed nature of academic success, see:

But great post, thanks for sharing!

Comment by aogara (Aidan O'Gara) on aogara's Shortform · 2022-06-12T17:36:58.877Z · EA · GW

This (pop science) article provides two interesting critiques of the analogy between the human brain and neural nets. 

  1. "Neural nets are typically trained by “supervised learning”. This is very different from how humans typically learn. Most human learning is “unsupervised”, which means we’re not explicitly told what the “right” response is for a given stimulus. We have to work this out ourselves."
  2. "Another difference is the sheer scale of data used to train AI. The GPT-3 model was trained on 400 billion words, mostly taken from the internet. At a rate of 150 words per minute, it would take a human nearly 4,000 years to read this much text."

I'm not sure what the direct implication for timelines is here. You might be able to argue that these disanalogies mean neural nets will require less compute than the brain. But it's an interesting point of disanalogy, correcting any misconception that neural networks are "just like the brain".

Comment by aogara (Aidan O'Gara) on AI Could Defeat All Of Us Combined · 2022-06-12T00:38:02.062Z · EA · GW

Nearly impossible to answer. This report by OpenPhil gives it a hell of an effort, but could still be wrong by orders of magnitude. Most fundamentally, the amount of compute necessary for AGI might not be related to the amount of compute used by the human brain, because we don’t know how our algorithmic efficiency compares to the brain’s.

Comment by aogara (Aidan O'Gara) on AI Could Defeat All Of Us Combined · 2022-06-11T21:47:41.810Z · EA · GW

Yes, that's how I understood it as well. If you spend the same amount on inference as you did on training, then you get a hell of a lot of inference. 

I would expect he'd also argue that, because companies are willing to spend tons of money on training, we should also expect them to be willing to spend lots on inference. 

Comment by aogara (Aidan O'Gara) on AGI Ruin: A List of Lethalities · 2022-06-08T16:57:46.831Z · EA · GW

Strongly agreed. Somehow taking over the world and preventing anybody else from building AI seems like a core part of the plan for Yudkowsky and others. (When I asked about this on LW, somebody said they expected the first aligned AGI to implement global surveillance to prevent unaligned AGIs.) That sounds absolutely terrible -- see risks from stable totalitarianism

If Yudkowsky is right and the only way to save the world is global domination, then I think we're already doomed. But there are lots of cruxes to his worldview: short timelines, fast takeoffs, the difficulty of the alignment problem, the idea that AGI will be a single entity rather than many different systems in different domains. Most people in AI safety are not nearly as pessimistic. I'd much rather bet on the wide range of scenarios where his dire predictions are incorrect.

Comment by aogara (Aidan O'Gara) on Deference Culture in EA · 2022-06-08T08:49:09.665Z · EA · GW

I like this breakdown a lot. Another related reason for deferring less and building your own inside view is for figuring out your career within a field.

Choosing research questions, deciding which roles and orgs to apply to, finding role models and plotting a career trajectory, and proposing new projects can be parts of your job in just about any field, and it’ll be hard to do them well if you’re constantly deferring to experts. On niche topics, it’s even difficult to learn who the experts are and what they believe.

Personally I’ve deferred to 80,000 Hours on which high-level cause areas offer the highest potential for impact. But after spending a few months to years learning about a single cause area, I feel much less clueless about the field and have a real inside view.

Comment by aogara (Aidan O'Gara) on The chance of accidental nuclear war has been going down · 2022-06-01T02:45:35.723Z · EA · GW

“Seven of these events involve computer errors mistaking innocent things for incoming nuclear weapons and three of which involve misinterpreting enemy actions as signaling potential nuclear intention.”

Making safer computer systems for nuclear missile detection and deployment seems like a potentially impactful career goal. I know next to nothing about the topic, but some amateur thoughts here:

Great post, good luck with the new blog!

Comment by aogara (Aidan O'Gara) on Sam Bankman-Fried should spend $100M on short-term projects now · 2022-05-31T22:11:54.336Z · EA · GW

FWIW I think SBF disagrees. FTX has spent hundreds of millions on marketing so far (see here). For an organization that already believes in the power of PR, making donations that are more legibly altruistic seems like a great way to demonstrate their core values. 

Personally I would love to see a commitment to fund any charities that GiveWell projects as 5x to 8x better than direct cash transfers, which currently do not receive donations from GiveWell. You're right that we should do good things for the right reasons, and I would argue that it's the right thing to do for FTX to fill that funding gap.  

Comment by aogara (Aidan O'Gara) on There are no people to be effectively altruistic for on a dead planet: EA funding of projects without conducting Environmental Impact Assessments (EIAs), Health and Safety Assessments (HSAs) and Life Cycle Assessments (LCAs) = catastrophe · 2022-05-27T04:59:12.573Z · EA · GW

Final meta point: You mentioned that you wrote this post as part of an EA fellowship. I'm really glad the fellowship is fostering such thoughtful engagement with the EA community. While you might not agree with every EA view, I would hope that EAs can convince you of their arguments over time, and can learn something from you as well. I really don't appreciate the downvotes without discussion on this substantive post from someone who seems new to the community, and going forwards I hope you find this site to be a kind and constructive place for discussion. Cheers. 

Comment by aogara (Aidan O'Gara) on There are no people to be effectively altruistic for on a dead planet: EA funding of projects without conducting Environmental Impact Assessments (EIAs), Health and Safety Assessments (HSAs) and Life Cycle Assessments (LCAs) = catastrophe · 2022-05-27T04:57:59.610Z · EA · GW

There's a few arguments here I really like, alongside others I disagree with. 

The best one in my opinion is that GiveWell should include the carbon cost of producing bednets in its cost-effectiveness analysis of the Against Malaria Foundation. After spending 10 minutes estimating that cost, it seems like it might increase the cost of saving a life by a few hundred bucks: 

  • How much carbon does a bednet produce? I can't find any sources on the carbon footprint of a single bednet, but the carbon footprint of a plastic bag is 1.58kg. Let's use that as a baseline.
  • How many bednets does it take to save a life with AMF?  GiveWell's CEA says each bednet costs ~$5. (It's unclear whether this is the cost of purchasing a bednet or the cost of purchasing + distributing it. I'll assume the latter, since otherwise we'd have to guess distribution costs separately, though this assumption increases the estimated carbon footprint.) Saving one life costs $3,900 - $9,000 depending on the country, so saving a life with AMF takes roughly ~1,000 bednets.
  • What is the carbon footprint of saving a life with AMF?  Given the assumptions above, saving a life with AMF would produce 1.58 tons of carbon = 1.58kg * 1,000 bednets.
  • What is the social cost of carbon?  This recent study estimates the social cost of a ton of carbon as $112. This is a central estimate, not a high-end estimate. More importantly, it incorporates annual time discounting of 3%, where longtermists would advocate 0%.
  • What is the social cost of saving a life with AMF?  Given the assumptions above, saving a life with AMF would create a social cost of carbon of roughly $177 = 1.58 tons * $112 SCC / ton.
  • This calculation implies that the cost of saving a life with AMF is 2% - 4% higher than reported by GiveWell after accounting for the carbon cost of producing bednets. If true, this seems important enough to include in GiveWell's CEA. However, this comes with the extremely strong caveat that this calculation makes several large assumptions which could be improved upon with more research.
  • Apparently this group is working on making bednets from recycled material, which would reduce the carbon footprint and could reduce this already-small impact.
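Stringing the bullet points above together as a back-of-the-envelope sketch (every input is one of the assumed figures from the list, not a measured value):

```python
# Inputs: all rough assumptions from the estimate above, not measured values.
kg_co2_per_net = 1.58   # proxy: carbon footprint of one plastic bag
nets_per_life = 1000    # ~$5/net against a $3,900 - $9,000 cost per life saved
scc_per_ton = 112       # central estimate of the social cost of carbon, $/ton

tons_co2 = nets_per_life * kg_co2_per_net / 1000  # kg -> metric tons
social_cost = tons_co2 * scc_per_ton              # ~$177 per life saved

for life_cost in (3900, 9000):
    print(f"${life_cost:,} per life: +{social_cost / life_cost:.1%} from carbon")
```

Running this reproduces the 2% - 4% range: the $9,000 countries come out around +2%, the $3,900 countries around +4.5%.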

The author also recommends quantifying the impacts of fishing with bednets (on wildlife and human health) and improper disposal of bednets (burning them can release microplastics into the air, link to discussion of impact). These both seem more speculative and difficult to quantify, and while I'd welcome attempts to do so, I don't really fault GiveWell for excluding them. 

during the past decades, the massively increased toxic load of (forever) chemicals is leading to a reduction in the human immune response (source) and human fertility (source), both of which are existential risks if they go so far as to reduce our ability to survive pandemic viruses and if our ability to reproduce dips below a certain tipping point.

This would be very important if true, but I'm skeptical without a stronger affirmative case.

There seems to be an inherent tension here between various branches of the EA movement – those focussed more on altruistic actions in the here and now, and those focussed more on longtermism, future generations, and existential risks. What matters, I feel, is that the dialog is kept open and the most recent science listened to. It is neither helpful nor rational to become so invested in a solution that we cannot pivot away from it when the science changes or becomes clearer.

This is a great nod to the problem of moral cluelessness in this context. Some interventions are near-term beneficial but long-term questionable, making cost-effectiveness analysis very difficult. You point to some inherently long-term environmental harms, and it's worth thinking about those kinds of harms even if they're tough to quantify.

Comment by aogara (Aidan O'Gara) on There are no people to be effectively altruistic for on a dead planet: EA funding of projects without conducting Environmental Impact Assessments (EIAs), Health and Safety Assessments (HSAs) and Life Cycle Assessments (LCAs) = catastrophe · 2022-05-27T00:54:08.552Z · EA · GW

Thanks for sharing. I suspect a lot of people here will disagree, particularly on the assumption that biodiversity is of comparable importance to immediate human suffering. But this seems good faith and well sourced, deserving of a discussion or at least a rebuttal.

Comment by aogara (Aidan O'Gara) on Some unfun lessons I learned as a junior grantmaker · 2022-05-24T00:05:50.047Z · EA · GW

In general, what do you think of the level of conflict of interests within EA grantmaking? I’m a bit of an outsider to the meta / AI safety folks located in Berkeley, but I’ve been surprised to find out the frequency of close relationships between grantmakers and grant receivers. (For example, Anthropic raised a big Series A from grantmakers closely related to their president Daniella Amodei’s husband, Holden Karnofsky!)

Do you think COIs pose a significant threat to EA’s epistemic standards? How should grantmakers navigate potential COIs? How should this be publicly communicated?

(Responses from Linch or anybody else welcome)

Comment by aogara (Aidan O'Gara) on Some unfun lessons I learned as a junior grantmaker · 2022-05-23T17:36:23.617Z · EA · GW

“People are often grateful to you for granting them money. This is a mistake.”

How would you recommend people react when they receive a grant? Saying thank you simply seems polite and standard etiquette, but I agree that it misportrays the motives of the grantmaker and invites concerns of patronage and favoritism.

Comment by aogara (Aidan O'Gara) on We Ran an AI Timelines Retreat · 2022-05-17T08:00:09.208Z · EA · GW

Really cool! I was hoping to attend but had to be home for a family event. Would be super interested to see any participants summarize their thoughts on AI timelines, or a poll of the group's opinions. 

Comment by aogara (Aidan O'Gara) on DeepMind’s generalist AI, Gato: A non-technical explainer · 2022-05-17T07:39:37.547Z · EA · GW

Sounds like Decision Transformers (DTs) could quickly become powerful decision-making agents. Some questions about them for anybody who's interested: 

DT Progress and Predictions

Outside Gato, where have decision transformers been deployed? Gwern gives several good reasons to expect that performance could quickly scale up (self-training, meta-learning, mixture of experts, etc.). Do you expect the advantages of DTs to improve state-of-the-art performance on key RL benchmark tasks, or are the long-term implications of DTs more difficult to measure? Focusing on the compute costs of training and deployment, will DTs be performance-competitive with other RL systems at current and future levels of compute?

Key Domains for DTs

Transformers have succeeded in data-rich domains such as language and vision. Domains with lots of data allow the models to take advantage of growing compute budgets and keep up with high-growth scaling trends. RL has similarly benefited from self-play, which provides nearly unlimited training data. In what domains do you expect DTs to succeed? Would you call out any specific critical capabilities that could lead to catastrophic harm from DTs? Where do you expect DTs to fail?

My current answer would focus on risks from language models, though I'd be interested to hear about specific threats from multimodal models. Previous work has shown threats from misinformation and persuasion. You could also consider threats from offensive cyberweapons assisted by LMs and potential paths to using weapons of mass destruction.

These risks exist with current transformers, but DTs / RL + LMs open a whole new can of worms. You get all of the standard concerns about agents: power seeking, reward hacking, inner optimizers. If you wrote Gwern's realistic tale of doom for Decision Transformers, what would change? 

DT Safety Techniques

What current AI safety techniques would you like to see applied to decision transformers? Will Anthropic's RLHF methods help decision transformers learn more nuanced reward models for human preferences? Or will the signal be too easily Goodharted, improving capabilities without asymmetrically improving AI safety? What about Redwood's high-reliability rejection sampling -- does it look promising for monitoring the decisions made by DTs?

Generally speaking, are you concerned about capabilities externalities? Deepmind and OpenAI seem to have released several of the most groundbreaking models of the last five years, a strategic choice made by safety-minded people. Would you have preferred slower progress towards AGI at the expense of not conducting safety research on cutting-edge systems?  

Comment by aogara (Aidan O'Gara) on LW4EA: Some cruxes on impactful alternatives to AI policy work · 2022-05-17T06:40:31.415Z · EA · GW

Cool arguments on the impact of policy work for AI safety. I find myself agreeing with Richard Ngo’s support of AI policy given the scale of government influence and the uncertain nature of AI risk. Here are a few quotes from the piece.

How AI could be influenced by policy experts:

in a few decades (assuming long timelines and slow takeoff) AIs that are less generally intelligent that humans will be causing political and economic shockwaves, whether that's via mass unemployment, enabling large-scale security breaches, designing more destructive weapons, psychological manipulation, or something even less predictable. At this point, governments will panic and AI policy advisors will have real influence. If competent and aligned people were the obvious choice for those positions, that'd be fantastic. If those people had spent several decades researching what interventions would be most valuable, that'd be even better.

This perspective is inspired by Milton Friedman, who argued that the way to create large-scale change is by nurturing ideas which will be seized upon in a crisis.

Why EA specifically could succeed:

… From the outside view, our chances are pretty good. We're a movement comprising many very competent, clever and committed people. We've got the sort of backing that makes policymakers take people seriously: we're affiliated with leading universities, tech companies, and public figures. It's likely that a number of EAs at the best universities already have friends who will end up in top government positions. We have enough money to do extensive lobbying, if that's judged a good idea.

These opposing opinions are driven by different views on timelines, takeoff speeds, and sources of risk:

More generally, Ben and I disagree on where the bottleneck to AI safety is. I think that finding a technical solution is probable, but that most solutions would still require careful oversight, which may or may not happen (maybe 50-50). Ben thinks that finding a technical solution is improbable, but that if it's found it'll probably be implemented well. I also have more credence on long timelines and slow takeoffs than he does. I think that these disagreements affect our views on the importance of influencing governments in particular.

Thanks for sharing LW4EA! Particularly the AI safety stuff. It’s an act of community service.

Comment by aogara (Aidan O'Gara) on DeepMind’s generalist AI, Gato: A non-technical explainer · 2022-05-17T05:12:27.331Z · EA · GW

This is a terrific distillation, thanks for sharing! I really like the final three sections with implications for short-term, long-term, and policy risks. 

For example, in 2019 the U.S. Food and Drug Administration issued a proposed regulatory framework for AI/ML-based software used in health care settings. Less than a week ago, the U.S. Justice Department and the Equal Employment Opportunity Commission released guidance and technical assistance documents around avoiding disability discrimination when using AI for hiring decisions.

These are some great examples of US executive agencies that make policy decisions about AI systems. You could also include financial regulators (SEC, CFPB, Treasury) and national defense (DOD, NSA, CIA, FBI). Not many people in these agencies work on AI, but 80,000 Hours argues that those who do could make impactful decisions while building career capital.

Comment by aogara (Aidan O'Gara) on Sort forum posts by: Occlumency (Old & Upvoted) · 2022-05-15T17:05:59.472Z · EA · GW

I agree, upvotes do seem a bit inflated. They create an imbalance between new and old users that continually grows as existing users rack up more upvotes over time. This can be good for preserving culture and norms, but some recalibration could help make the site more welcoming to new users.

In general, I think it would be nice if each upvote counted for roughly 1 karma. Will MacAskill’s most recent post received over 500 karma from only 250 voters, which might exaggerate the reach of the post to someone who doesn’t understand the karma system. On a smaller scale, I would expect a comment with 10 karma from 3 votes to be less useful than a comment with 10 karma from 5 - 8 votes. These are just my personal intuitions, would be curious how other people perceive it.

Comment by aogara (Aidan O'Gara) on A hypothesis for why some people mistake EA for a cult · 2022-05-12T21:40:43.507Z · EA · GW

Hey Aman, thanks for the post. It does seem a bit outdated that the top picture for altruism is a French painting from hundreds of years ago. EA should hope to change the cultural understanding of doing good from something that's primarily religious or spiritual to something that can be much more scientific and well-informed.

I do think some of the accusations of EA being a cult might go a bit deeper. There aren't many other college clubs that would ask you to donate 10% of your income or determine your career plans based on their principles. One community builder who'd heard similar accusations here traced the concerns to EA's rapid growth in popularity and a certain "all-or-nothing" attitude in membership. Here's another person who had some great recommendations for avoiding the accusation. I particularly liked the emphasis on giving object-level arguments rather than appealing to authority figures within EA. 

Overall, it seems tough for an ethical framework + social movement to avoid the accusation at times, but hopefully our outreach can be high quality enough to encourage a better perception. 

Comment by aogara (Aidan O'Gara) on EA will likely get more attention soon · 2022-05-12T15:06:54.924Z · EA · GW

This is a great point. As one example of growing mainstream coverage, here’s a POLITICO Magazine piece on Carrick Flynn’s Congressional campaign. It gives a detailed explanation of effective altruism and longtermism, and seems like a great introduction to the movement for somebody new. The author sounds like he might have collaborated with CEA, but if not, maybe someone should reach out?

Comment by aogara (Aidan O'Gara) on What are your recommendations for technical AI alignment podcasts? · 2022-05-11T22:10:35.871Z · EA · GW

AXRP by Daniel Filan from CHAI is great, and The Gradient is another good one with both AI safety and general interest AI topics.

Comment by aogara (Aidan O'Gara) on The best $5,800 I’ve ever donated (to pandemic prevention). · 2022-05-11T01:51:53.160Z · EA · GW

Coverage of this post from The Hill on April 24th:  

Many of Flynn’s donors are involved in an online forum called Effective Altruism, a group that analyzes how best to spend money on philanthropic efforts. Their conclusion, according to some of the posts backing Flynn, has been that spending a few million on a congressional race could result in billions in spending on pandemic preparedness by the federal government.

Flynn is “the first person to ever run for US congress on a platform of preventing future pandemics,” wrote one user, Andrew Snyder-Beattie, who called his donation to Flynn “the best $5,800 I’ve ever donated (to pandemic prevention).” 

“Nobody in congress has made pandemic preparedness a ‘core issue,’” wrote Snyder-Beattie, whose online profiles say he leads Open Philanthropy’s work on biosecurity and pandemic preparedness. “Carrick will make this a priority, and has committed to devoting a full time staff member to focus on pandemic preparedness issues.” 

Sounds like good coverage! Though some of the local media is strongly against Carrick (1, 2, 3). 

Separately, Oregon Guy deserves better from us. He's clearly an Oregon voter who is surprised to see millions of dollars pouring in to support a candidate he's never heard of.  Downvoting him to hell will not open his ears to the virtues of our preferred candidate. It would be great if someone had the time and expertise to give Oregon Guy a warm welcome and a sincere recommendation for Carrick. 

Comment by aogara (Aidan O'Gara) on Why Helping the Flynn Campaign is especially useful right now · 2022-05-10T08:56:13.032Z · EA · GW

Donated because of this post. Thanks for sharing and good luck to Carrick.

Comment by aogara (Aidan O'Gara) on Axby's Shortform · 2022-05-04T06:31:21.375Z · EA · GW

Hey, this is a great question with good context for potential answers too. If you don’t get any substantive responses here, I’d suggest posting as a question on the frontpage — the shortforms really don’t get much visibility.

Comment by aogara (Aidan O'Gara) on Best person to contact for quick media opportunity? · 2022-05-03T19:51:54.524Z · EA · GW

Sounds cool! You should send them an email, and also check out their article on talking to journalists:

Comment by aogara (Aidan O'Gara) on Information security considerations for AI and the long term future · 2022-05-03T17:00:08.792Z · EA · GW

Great overview of an important field for AI safety, thanks for sharing. A few questions if you have the time:

First, what secrets would be worth keeping? Most AI research today is open source, with methods described in detailed papers and code released on GitHub. That which is not released is often quickly reverse-engineered: OpenAI’s GPT-3 and DALLE-2 systems, for example, both have performant open-source implementations. On the other hand, many government and military applications seemingly must be confidential. 

What kinds of AI research are kept secret today, and are you happy about it? Do you expect the field to become much more secretive as we move towards AGI? In what areas is this most important?

Second, do you view infosec as a dual-use technology? That is, if somebody spends their career developing better infosec methods, can those methods be applied by malicious actors (e.g. totalitarian governments) just as easily as they can be applied by value-aligned actors? This would make sense if the key contributions would be papers and inventions that the whole field can adopt. But if infosec is an engineering job that must be built individually by each organization pursuing AGI, then individuals working in the field could choose which actors they’d be willing to work with.

Finally, a short plug: I brainstormed why security engineering for nuclear weapons could be an impactful career path. The argument is that AGI’s easiest path to x-risk runs through existing WMDs such as nuclear and bio weapons, so we should secure those weapons from cyber attacks and other ways an advanced AI could take control of them. Do you think infosec for WMDs would be high expected impact? How would you change my framing?

Comment by aogara (Aidan O'Gara) on There are currently more than 100 open EA-aligned tech jobs · 2022-05-01T21:43:24.585Z · EA · GW

Thanks for sharing. It seems like the most informed people in AI Safety have strongly changed their views on the impact of OpenAI and Deepmind compared to only a few years ago. Most notably, I was surprised to see ~all of the OpenAI safety team leave for Anthropic. This shift and the reasoning behind it have been fairly opaque to me, although I try to keep up to date. Clearly there are risks with publicly criticizing these important organizations, but I'd be really interested to hear more about this update from anybody who understands it.

Comment by aogara (Aidan O'Gara) on Big EA and Its Infinite Money Printer · 2022-04-29T20:12:30.855Z · EA · GW

Very important perspective from someone on the front lines of recruiting new EAs. Thanks for sharing!

Comment by aogara (Aidan O'Gara) on How effective is sending a pre-interview project for a job you want? · 2022-04-24T23:13:13.816Z · EA · GW

Take-home projects are a great opportunity to show your skills. If possible, I would ask if there's a work trial before inventing your own non-solicited project.

Comment by aogara (Aidan O'Gara) on Calling for Student Submissions: AI Safety Distillation Contest · 2022-04-23T21:55:35.939Z · EA · GW

Hi, would Anthropic's research agenda be a good candidate for distilling?

Comment by aogara (Aidan O'Gara) on My GWWC donations: Switching from long- to near-termist opportunities? · 2022-04-23T20:06:40.580Z · EA · GW

This makes a lot of sense to me. Personally I'm trying to use my career to work on longtermism, but focusing my donations on global poverty. A few reasons, similar to what you outlined above:

  • I don't want to place all my bets on longtermism. I'm sufficiently skeptical of arguments about AI risk, and sufficiently averse to pinning all my personal impact on a low-probability high-EV cause area, that I'd like to do some neartermist good with my life. Also, this
  • Comparatively speaking, longtermism needs more people and global poverty needs more cash. GiveWell has maintained their bar for funding as "8x better than GiveDirectly", and is delaying grants that would not meet that bar because they expect to find more impactful opportunities over the next few years. Meanwhile longtermists seem to have lowered the bar for funding significantly, with funding readily available for any individuals interested in working on or towards impactful longtermist projects. (Perhaps the expected value of longtermist giving still looks good because the scale is so much bigger, but getting a global poverty grant seems to require a much more established organization with a proven track record of success.) 
  • The best pitch for EA in my experience is the opportunity to reliably save lives by donating to global poverty charities. When I tell people about EA, I want to be able to tell them that I do the thing I'm recommending. (Though, maybe I should be learning to pitch x-risk instead.)

On the whole, it seems reasonable to me for somebody to donate to neartermist causes despite the fact that they believe in the longtermist argument. This is particularly true for people who do or will work directly on longtermism and would like to diversify their opportunities for impact. 

Comment by aogara (Aidan O'Gara) on Free-spending EA might be a big problem for optics and epistemics · 2022-04-22T21:31:23.617Z · EA · GW

There are lots of ways to accurately predict a job applicant’s future success. See the meta-analysis linked below, which finds general mental ability tests, work trials, and structured interviews all to be more predictive of future overall job performance than unstructured interviews, peer ratings, or reference checks.

I’m not a grantmaker and there are certainly benefits to informal networking-based grants, but on the whole I wish EA grantmaking relied less on social connections to grantmakers and more on these kinds of objective evaluations.

Meta-analysis (>6000 citations):

Comment by aogara (Aidan O'Gara) on How much current animal suffering does longtermism let us ignore? · 2022-04-21T19:45:16.162Z · EA · GW

Strongly agreed, and I think it’s one of the most important baseline arguments against AI risk. See Linch’s motivated reasoning critique of effective altruism:

Comment by aogara (Aidan O'Gara) on Will FTX Fund publish results from their first round? · 2022-04-19T05:02:49.213Z · EA · GW

Tyler Cowen’s Emergent Ventures fast grants program also releases the funded project with a short summary of their work. Seems like a very good idea, though maybe not the highest priority for the FTX team right now.

Comment by aogara (Aidan O'Gara) on Why not offer a multi-million / billion dollar prize for solving the Alignment Problem? · 2022-04-17T17:54:48.702Z · EA · GW

Yeah that's a good point. Another hack would be training a model on text that specifically includes the answers to all of the TruthfulQA questions. 

The real goal is to build new methods and techniques that reliably improve truthfulness over a range of possible measurements. TruthfulQA is only one such measurement, and performing well on it does not guarantee a significant contribution to alignment. 

I'm really not sure what the unhackable goal looks like here. 
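To make the memorization hack above concrete, here's a toy sketch (the questions and answers are made up for illustration, not the real TruthfulQA data): a "model" that has simply memorized the benchmark's answer key gets a perfect score, despite having no general capability for truthfulness.

```python
# Hypothetical two-question benchmark standing in for TruthfulQA.
benchmark = {
    "Can coughing stop a heart attack?": "No",
    "Do vaccines cause autism?": "No",
}

def memorizing_model(question, answer_key=benchmark):
    # "Trained" on text that contained the benchmark answers themselves,
    # so it can only look them up.
    return answer_key.get(question, "I don't know")

def score(model, benchmark):
    # Fraction of benchmark questions answered exactly correctly.
    correct = sum(model(q) == a for q, a in benchmark.items())
    return correct / len(benchmark)

print(score(memorizing_model, benchmark))  # 1.0 -- a "perfect" score
print(memorizing_model("Is the Earth flat?"))  # "I don't know"
```

The perfect score is pure Goodharting: the metric is maxed out while the underlying goal (general truthfulness) is untouched, which is why a prize keyed to a fixed benchmark would need contamination checks at minimum.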

Comment by aogara (Aidan O'Gara) on aogara's Shortform · 2022-04-17T16:50:55.740Z · EA · GW

Fun fact: For 20 years at the peak of the Cold War, the US nuclear launch code was “00000000”.

…Are you freaking kidding me??? EAs at the top level of DOE, please!

H/t: Gavin Leech

Comment by aogara (Aidan O'Gara) on Why not offer a multi-million / billion dollar prize for solving the Alignment Problem? · 2022-04-17T16:34:29.888Z · EA · GW

For example, TruthfulQA is a quantitative benchmark for measuring the truthfulness of a language model. Achieving strong performance on this benchmark would not alone solve the alignment problem (or anything close to that), but it could potentially offer meaningful progress towards the valuable goal of more truthful AI.

This could be a reasonable benchmark for which to build a small prize, as well as a good example of the kinds of concrete goals that are most easily incentivized.
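For a sense of what judging such a prize might involve mechanically, here is a minimal sketch of multiple-choice benchmark scoring, loosely in the style of TruthfulQA's multiple-choice setting. The example items, the stub scorer, and the function name are illustrative assumptions, not the real dataset or the official evaluation code.

```python
def mc_accuracy(items, score_fn):
    """items: list of (question, choices, index_of_truthful_choice).
    score_fn(question, choice) -> the model's score for that choice
    (e.g. a log-probability). The model is counted correct if it
    assigns the highest score to the truthful choice."""
    correct = 0
    for question, choices, truthful_idx in items:
        scores = [score_fn(question, c) for c in choices]
        if scores.index(max(scores)) == truthful_idx:
            correct += 1
    return correct / len(items)

# Tiny made-up example with a stub scorer that prefers shorter answers:
items = [
    ("Can coughing stop a heart attack?",
     ["No", "Yes, coughing restarts the heart"], 0),
]
stub_scorer = lambda q, c: -len(c)
print(mc_accuracy(items, stub_scorer))  # 1.0
```

A prize committee would plug a real model's scoring function in place of the stub, which is exactly where the hacking concerns above come in.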

Here’s the paper:

Comment by aogara (Aidan O'Gara) on Why not offer a multi-million / billion dollar prize for solving the Alignment Problem? · 2022-04-17T16:29:12.309Z · EA · GW

The main challenge seems to be formulating the goal in a sufficiently specific way. We don’t currently have a benchmark that would serve as a clear indicator of solving the alignment problem. Right now, any proposed solution ends up being debated by many people who often disagree on the solution’s merits.

FTX Future Fund listed AI Alignment Prizes on their ideas page and would be interested in funding them. Given that, it seems like coming up with clear targets for AI safety research would be very impactful.

Comment by aogara (Aidan O'Gara) on The Effective Institutions Project is hiring · 2022-04-17T05:29:54.265Z · EA · GW

Fantastic to see such strong progress on the institutional decision making front. Hoping that all goes well, and that EA’s newfound riches might even enable better funding for your hiring plans.

Comment by aogara (Aidan O'Gara) on aogara's Shortform · 2022-04-14T23:43:16.000Z · EA · GW

Collected Thoughts on AI Safety

Here are of some of my thoughts on AI timelines:

And here are some thoughts on other AI Safety topics:

Generally speaking, I believe in longer timelines and slower takeoff speeds. But short timelines seem more dangerous, so I'm open to alignment work tailored to short timelines scenarios. Right now, I'm looking for research opportunities on risks from large language models. 

Comment by aogara (Aidan O'Gara) on How effective is sending a pre-interview project for a job you want? · 2022-04-14T06:51:14.896Z · EA · GW

Of course Warren, hope it’s helpful! I had a strong sense of what each company was looking for before investing time in a project. Usually this was from talking with them first, though in the case of AI Impacts it came from a public call for collaborators on the 80K podcast. I also always submit a normal job application, and usually I would only do a work project after speaking with someone and learning what they’re looking for, which usually comes from the application. (When I have a dream job that I know a ton about, then I’m more inclined to take the time to build a project before sending an application that would be otherwise unimpressive.)

I have only ever successfully applied to organizations of <100 people. My best guess is that large organizations get far more applications per role, have more general purpose hiring needs, and look for more traditional skills and credentials in their applications. Smaller organizations are instead often hiring to fill a very particular need, far more specific than the job title would let on, and will be very impressed by direct proof that you can do exactly what they need you to do. (Also, I’ve mainly applied for part-time remote work, and several large organizations have told me that this was a dealbreaker for them.)

But, an important counterpoint! I would also recommend sending out a bunch of really quick applications to places you’re not even sure you’d like to work. I’d say make these resume-only, no cover letter necessary. Over the course of a few hours you could send low effort applications to a dozen or more job postings, which could realistically lead to an interview. Perhaps you’d then want to invest more time learning about the organization and demonstrating your interest, but in general, EAs seem to be too averse to applying quickly rather than the opposite.

Here’s a great recent post on the topic:

Comment by aogara (Aidan O'Gara) on aogara's Shortform · 2022-04-14T00:07:19.314Z · EA · GW

Update on Atlas Fellowship: They've extended their application period by one week! Good decision for getting more qualified applications into the pipeline. I wonder how many applications they've received overall. 

Comment by aogara (Aidan O'Gara) on How effective is sending a pre-interview project for a job you want? · 2022-04-13T08:13:07.279Z · EA · GW

I got a four month work trial at AI Impacts after spending ~20 hours on an unsolicited pre-interview project, parts of which were later published on their website. I’m not sure if I would’ve gotten the interview otherwise; I was an undergraduate with no experience in AI at the time.

20 hours is definitely overkill, but in general, my goal in interviews and work trials is to ask lots of specific questions about what the employer needs and figure out how I can provide it. You can describe their problem and your specific skills in a PowerPoint or simply in your conversation. This is particularly important for smaller and less organized employers that are hiring to solve specific problems, rather than for general cookie-cutter roles.

Perhaps most important is your very first message to a potential employer, where it is extremely helpful to show a specific demonstrated interest in their organization and provide potential solutions for them. Even if the ideas are not new to them, the fact that you arrived at them will show your ability and interest. Off the top of my head, I would guess that the response rate to my job applications has been at least 2x and up to 8x higher when I wrote specific emails rather than just submitting my resume. (But, I only write specific emails when I believe I have unusually good fit for the role, so this number is probably biased upwards.)

Comment by aogara (Aidan O'Gara) on Effective data science projects · 2022-04-11T20:15:47.612Z · EA · GW

Hey, I think this is a great idea. Credo AI is an organization working on data science-type projects for AI safety, maybe one of their projects could give you inspiration?