Transformative AI and Compute [Summary]

post by lennart · 2021-09-23T13:53:17.477Z · EA · GW · 5 comments


  0. Executive Summary
    Epistemic Status
    Highlights per Section
      1. Compute
      2. Compute in AI Systems
      3. Compute and AI Alignment
      4. Forecasting Compute
      5. Better Compute Forecasts
      6. Compute Governance
      7. Conclusions

This is the summary of the series Transformative AI and Compute - A holistic approach. You can find the sequence here [? · GW] and the links to the posts below:

  1. What is Compute? [1/4] [EA · GW]
  2. Forecasting Compute [2/4] [EA · GW]
  3. Compute Governance and Conclusions [3/4] [EA · GW]
  4. Compute Research Questions and Metrics [4/4] [EA · GW]

0. Executive Summary

This series attempts to:

  1. Introduce a simplified model of computing which serves as a foundational concept (Part 1 - Section 1 [EA · GW]).
  2. Discuss the role of compute for AI systems (Part 1 - Section 2 [EA · GW]).
  3. Explore the connection of compute trends and more capable AI systems over time (Part 1 - Section 3 [EA · GW]).
  4. Discuss the compute component in forecasting efforts on transformative AI timelines (Part 2 - Section 4 [EA · GW]).
  5. Propose ideas for better compute forecasts (Part 2 - Section 5 [EA · GW]).
  6. Briefly outline the relevance of compute for AI Governance (Part 3 - Section 6 [EA · GW]).
  7. Conclude this report and discuss next steps (Section 7 [EA · GW]).
  8. Provide a list of connected research questions (Appendix A [EA · GW]).
  9. Present common compute metrics and discuss their caveats (Appendix B [EA · GW]).
  10. Provide a list of startups in the AI hardware domain (Appendix C [EA · GW]).


Modern progress in AI systems has been driven and enabled mainly by acquiring more computational resources. AI systems rely on computation-intensive training runs — they require massive amounts of compute.

Learning about the compute requirements for training existing AI systems, and about those systems' capabilities, gives us a more nuanced understanding and lets us take appropriate action within the technical and governance domains to enable the safe development of potentially transformative AI systems.

To understand the role of compute, I decided to (a) do a literature review, (b) update existing work with new data, (c) investigate the role of compute for timelines, and lastly, (d) explore concepts to enhance our analysis and forecasting efforts.

In this piece, I present a brief analysis of AI systems’ compute requirements and capabilities, explore compute’s role for transformative AI timelines, and lastly, discuss the compute governance domain.

I find that compute, next to data and algorithmic innovation, is a crucial contributor to the recent performance of AI systems. My updated analysis shows a doubling time of 6.2 months for the compute requirements of the final training run of state-of-the-art AI systems between 2012 and mid-2021.
Next to more powerful hardware components, the spending on AI systems and the algorithmic innovation are other factors that inform the amount of effective compute available — which itself is a component for forecasting models on transformative AI.
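
To make the doubling time more tangible, here is a minimal sketch that converts a doubling time into an implied growth factor per year. It assumes growth stays exponential at a constant rate, which is a simplification, and the function names are hypothetical helpers chosen for illustration rather than anything from the series' analysis.

```python
import math


def annual_growth_factor(doubling_time_months: float) -> float:
    """Growth factor over 12 months implied by a constant doubling time."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (12.0 / doubling_time_months)


def months_to_grow(factor: float, doubling_time_months: float) -> float:
    """Months needed to grow by `factor` at a constant doubling time."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return doubling_time_months * math.log2(factor)


if __name__ == "__main__":
    # 6.2 months is the doubling time reported in this summary;
    # 3.4 months is the OpenAI (2018) figure quoted in the comments.
    for months in (6.2, 3.4):
        growth = annual_growth_factor(months)
        print(f"{months}-month doubling: about {growth:.1f}x per year")
```

Under this assumption, a 6.2-month doubling time corresponds to roughly 3.8x growth per year, compared with roughly 11.5x per year for the 3.4-month doubling time OpenAI reported in 2018.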

Therefore, as compute is a significant component and driver of AI systems’ capabilities, understanding the developments of the past and forecasting future results is essential. Compared to the other components, the quantifiable nature of compute makes it an exciting aspect for forecasting efforts and the safe development of AI systems.

I consequently recommend additional investigations into the highlighted components of compute, especially AI hardware. As compute forecasting and regulation require an in-depth understanding of hardware, hardware spending, the semiconductor industry, and much more, I recommend an interdisciplinary effort to inform the interpretation of compute trends and forecasts. Those insights can then be used to inform policymaking and potentially to regulate access to compute.

Epistemic Status

This article is Exploratory to My Best Guess. I've spent roughly 300 hours researching this piece and writing it up. I am not claiming completeness for any enumerations. Most lists are the result of things I learned on the way and then tried to categorize.

I have a background in Electrical Engineering with an emphasis on Computer Engineering and have done research in the field of ML optimizations for resource-constrained devices — working on the intersection of ML deployments and hardware optimization. I am more confident in my view on hardware engineering than in the macro interpretation of those trends for AI progress and timelines.

This piece was a research trial to test my prioritization, interest, and fit for this topic. Instead of focusing on a single narrow question, this paper and research trial turned out to be broader, hence the holistic approach. In the future, I plan to focus on a narrower research question within this domain. Please reach out.

Views and mistakes are solely my own.

Highlights per Section

1. Compute [EA · GW]

2. Compute in AI Systems [EA · GW]

3. Compute and AI Alignment [EA · GW]

4. Forecasting Compute [EA · GW]

5. Better Compute Forecasts [EA · GW]

6. Compute Governance [EA · GW]

7. Conclusions [EA · GW]


This work was supported and conducted as a summer fellowship at the Stanford Existential Risks Initiative (SERI). Their support is gratefully acknowledged. I am thankful for joining this program and would like to thank the organizers for enabling this, and the other fellows for the insightful discussions.

I am incredibly grateful to Ashwin Acharya and Michael Andregg for their mentoring throughout the project. Michael's thoughts on AI hardware nudged me to reconsider my current research interest and learn more about AI and compute. I thank Ashwin for bouncing ideas around, sharing a wealth of expertise in the domain, and helping me put things into the proper context. Thanks for the input! I was looking forward to every meeting and the thought-provoking discussions.

Thanks to the Swiss Existential Risk Initiative (CHERI) for providing the social infrastructure during my project. Having the opportunity to organize such an initiative in a fantastic team and being accompanied by motivated young researchers is astonishing.

I would like to express my thanks to Jaime Sevilla, Charlie Giattino, Will Hunt, Markus Anderljung, and Christopher Phenicie for their input and for discussing ideas.

Thanks to Jaime Sevilla, Jeffrey Ohl, Christopher Phenicie, Aaron Gertler, and Kwan Yee Ng for providing feedback on this piece.


  1. Transformative AI, as defined by Open Philanthropy in this blogpost: “Roughly and conceptually, transformative AI is AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution.”


Comments sorted by top scores.

comment by MichaelA · 2021-09-29T10:59:46.318Z · EA(p) · GW(p)

Very minor comment, but I think there's a mistaken pair of commas in the first sentence here, which confuses the sentence a bit:

Applications, with enough demand, will move to the fast lane by designing and benefitting from specialized processors. In contrast, others will be stuck in the slow lane running on general-purpose processors.


Replies from: lennart
comment by lennart · 2021-09-30T07:58:29.777Z · EA(p) · GW(p)

Thanks, I've edited it.

comment by MichaelA · 2021-09-29T10:58:49.709Z · EA(p) · GW(p)

Thanks for this - I really appreciated how clear and focused-on-actual-takeaways (rather than "I'll discuss X, Y, Z topic") this summary is.

OpenAI observed in 2018 that since 2012 the amount of compute used in the largest AI training runs has been doubling every 3.4 months.

In our updated analysis (n=57, 1957 to 2021), we observe a doubling time of 6.2 months between 2012 and mid-2021.

  1. What is n counting here? Like studies included in the lit review?
  2. Why is n referring to things from 1957-2021 if the conclusion given is just about 2012-2021? It seems like the relevant n would be just the items about 2012-2021?
  3. Is the difference between your estimate and OpenAI's because 2018-2021 involved much slower doubling times (perhaps three times as long?), because OpenAI missed or misinterpreted some things that happened in 2012-2018, because you're including pre-2012 things and those involved slower doubling times, or some mix of those factors?

(Probably I can find the answers in your next post, but maybe this is worth clarifying here too?)

Replies from: lennart
comment by lennart · 2021-09-30T07:40:03.836Z · EA(p) · GW(p)

Thanks, Michael.

  1. n is counting the number of ML systems in the analysis at the point of writing. (We have added more systems in the meantime.) Examples of such systems are GPT-3 and AlphaFold; each is basically a row in our dataset.
  2. Right, good point. I'll add the number of systems for the given time period.
  3. That's hard to answer. I don't think OpenAI misinterpreted anything. For the moment, I think it's probably a mixture of:
    • the inclusion criteria for the systems on which we base this trend
    • actual slower doubling times for reasons which we should figure out

    Nonetheless, as outlined in Part 1 - Section 2.3 [EA · GW], I have not interpreted those trends yet, but I'm interested in a discussion and will try to write up my thoughts on this in the future.

comment by MarkusAnderljung · 2021-09-24T22:00:11.874Z · EA(p) · GW(p)

Thanks for this! I really look forward to seeing the rest of the sequence, especially on the governance bits.