
Transformative AI and Compute - A holistic approach - Part 3 out of 4

This is part three of the series Transformative AI and Compute - A holistic approach. You can find the sequence here and the summary here.

This work was conducted as part of Stanford’s Existential Risks Initiative (SERI) at the Center for International Security and Cooperation, Stanford University. Mentored by Ashwin Acharya (Center for Security and Emerging Technology (CSET)) and Michael Andregg (Fathom Radiant).

This post attempts to:

  1. Briefly outline the relevance of compute for AI Governance (Section 6).
  2. Conclude this report and discuss next steps (Section 7).

Epistemic Status

This article is Exploratory to My Best Guess. I've spent roughly 300 hours researching this piece and writing it up. I am not claiming completeness for any enumerations. Most lists are the result of things I learned on the way and then tried to categorize.

I have a background in Electrical Engineering with an emphasis on Computer Engineering and have done research in the field of ML optimizations for resource-constrained devices — working at the intersection of ML deployments and hardware optimization. I am more confident in my views on hardware engineering than in the macro interpretation of those trends for AI progress and timelines.

This piece was a research trial to test my prioritization, interest, and fit for this topic. Instead of focusing on a single narrow question, this paper and research trial turned out to be broader — hence a holistic approach. In the future, I plan to focus on narrower, relevant research questions within this domain. Please reach out.

Views and mistakes are solely my own.

Previous Post: Forecasting Compute

You can find the previous post "Forecasting Compute [2/4]" here.

6. Compute Governance

Highlights

  • Compute is a unique AI governance node due to the required physical space, its energy demand, and the concentrated supply chain. These features make it a promising candidate for governance.
  • Controlling and governing access to compute can be harnessed to achieve better AI safety outcomes, for instance by restricting compute access for non-safety-aligned actors.
  • As compute becomes a dominant cost factor at the frontier of AI research, it may start to resemble high-energy physics research, where a significant amount of the budget is spent on infrastructure (unlike previous trends in CS research, where equipment costs have been fairly low).

Lastly, I want to motivate the topic of compute governance as a subfield of AI governance and briefly highlight the unique aspects of compute governance.

Compute has three unique features which might make it more governable than other domains of AI governance (such as talent, ideas, and data) (Anderljung and Carlier 2021):

  1. Compute requires physical space for the computing hardware — football-field-sized supercomputer centers are the norm (Los Alamos National Laboratory 2013). Compared to software, this makes compute easier to track.
    • Additionally, compute is often highly centralized due to the dominance of cloud providers, such as Amazon Web Services (AWS), Google Cloud, and others. Moreover, current leading hardware, such as Google TPUs, is only available as a service. Consequently, this feature makes it more governable.
  2. Energy (and water) demands: running these supercomputers requires massive amounts of energy and water for cooling (Los Alamos National Laboratory 2013).
  3. The semiconductor supply chain is highly concentrated, which could enable monitoring and governance (Khan 2021) — see “The Semiconductor Supply Chain” by CSET for more.

Second, based on my initial research and conversations with people in the field of AI governance, there seems to be more of a consensus on what to do with compute regarding governance: restricting and regulating access to compute resources for less cautious actors.[1] This consensus concerns the goal rather than concrete policies. For other aspects of AI governance, by contrast, there seems to be no clear consensus on which intermediate goals to pursue (see a discussion in this post).

6.1 Funding Allocation

Within this decade, we will and should see a shift in funding distribution at publicly funded AI research groups. Whereas AI and computer science (CS) research groups have typically had relatively low overhead costs for equipment, maintaining state-of-the-art research will increasingly require spending more funding on compute. These groups will become more like high-energy physics or biology research groups, where considerable funding is spent on infrastructure (e.g., equipment and hardware). If this does not happen, publicly funded groups will not be able to compete. We can already observe this compute divide (Ahmed and Wahed 2020).

6.2 Research Questions

For a list of research questions, see “Some AI Governance Research Ideas” (Anderljung and Carlier 2021). My research questions are listed in Appendix A, including some notes on compute governance-related points.

7. Conclusions

Highlights

  • In terms of published papers, research on compute trends, compute spending, and algorithmic efficiency (the field of macro ML research) is sparse, and more work at this intersection could quickly improve our understanding.
  • The field is currently bottlenecked by available data on macro ML trends: the total compute used to train a model is rarely published, nor is the spending. With these data, it would be easier to estimate algorithmic efficiency and build better forecasting models.
  • The importance of compute also highlights the need for ML engineers working on AI safety to be able to deploy gigantic models.
    • Therefore, more people should consider becoming AI hardware experts or working as ML engineers at safety-aligned organizations, enabling their deployment success.
  • Working at the intersection of technology and economics is also relevant for informing spending decisions and understanding macro trends.
  • Research results in all of the mentioned fields could then be used to inform compute governance.

Compute is a substantial component of AI systems and has been a driver of their capabilities. Compared to data and algorithmic innovation, it provides a unique quantifiability that enables more efficient analysis and governance.

The effectively available compute is mainly determined by compute prices, spending, and algorithmic improvements. Nonetheless, we should also explore the downsides of focusing purely on computational power and consider using metrics that account for interconnect and memory capacity.
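
To make this decomposition concrete, here is a minimal sketch in Python of how effective compute could be modeled from these three inputs. All starting values and growth rates are placeholder assumptions for illustration, not estimates from this report.

```python
# Illustrative sketch: effective compute available for a fixed research budget,
# decomposed into spending, price-performance, and algorithmic efficiency.
# All starting values and growth rates below are placeholder assumptions.

def effective_compute(year,
                      base_year=2021,
                      budget_usd=1e6,             # assumed annual compute budget in the base year
                      budget_growth=1.2,          # assumed yearly growth in spending
                      flop_per_usd=1e17,          # assumed price-performance in the base year
                      price_perf_growth=1.3,      # assumed yearly improvement in FLOP per dollar
                      algo_efficiency_growth=1.7  # assumed yearly algorithmic efficiency gain
                      ):
    """Return 'effective' training compute (in FLOP-equivalents) for a given year."""
    t = year - base_year
    physical_flop = (budget_usd * budget_growth ** t) * (flop_per_usd * price_perf_growth ** t)
    return physical_flop * algo_efficiency_growth ** t

for year in (2021, 2025, 2030):
    print(year, f"{effective_compute(year):.2e}")
```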

We have discussed the components of hardware progress and recent trends such as Moore’s law, chip architectures, and hardware paradigms. Focusing on only one trend comes with significant shortcomings; instead, I suggest we inform our forecasts by combining such models. I would be especially excited to break down existing compute trends into hardware improvements and increased spending.

Limited research in the field of macro AI

My research is based on a small set of papers, most of which focus on particular sub-aspects. Overall, the research field of macro ML trends in used compute is, to my understanding, fairly small. Seeing more research efforts on compute trends and algorithmic innovation could be highly beneficial. This could lead to a better understanding of past trends and better forecasts of future trends — for example, breaking down the trend into increased spending and hardware progress can give us some insights into potential upper limits.
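
As a toy illustration of such a breakdown (all rates and the starting cost are assumed, not measured): if total training compute grows faster than hardware price-performance improves, the remainder must come from increased spending, which quickly runs into budget limits.

```python
# Toy decomposition: if total training compute grows at rate g_total per year and
# hardware price-performance at rate g_hw per year, the implied spending growth
# is g_total / g_hw. The rates and starting cost below are placeholder assumptions.

g_total = 10.0   # assumed yearly growth factor of compute in the largest training runs
g_hw = 1.3       # assumed yearly improvement in FLOP per dollar
cost_2021 = 5e6  # assumed cost (USD) of a frontier training run in the base year

g_spend = g_total / g_hw
for t in range(0, 9, 2):  # base year plus 2, 4, 6, 8 years
    print(2021 + t, f"implied cost of a frontier run: ${cost_2021 * g_spend ** t:,.0f}")
```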

Limited data for analyzing AI trends

Another limitation, and perhaps a cause of the limited research, is the limited data available. Consequently, researchers first need to build the required datasets. I would be excited to see bigger datasets of compute requirements or experiments to measure algorithmic efficiency.

In this work, we share our public ML progress dataset and a dataset based on MLCommons training benchmarks (MLCommons 2021) for measuring the performance progress of modern AI hardware, and we ask others to share their insights and data.
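
As an example of the kind of analysis such datasets enable, here is a minimal sketch of estimating an algorithmic-efficiency doubling time by fitting an exponential to the compute required to reach a fixed benchmark score over time. The data points are made up for illustration and are not taken from the shared datasets.

```python
# Minimal sketch: estimate an algorithmic-efficiency doubling time from records of
# (year, training compute needed to reach a fixed benchmark score).
# The data points below are made up for illustration.
import math

records = [  # (year, FLOP required to reach the fixed target accuracy)
    (2013, 1.0e19),
    (2015, 2.5e18),
    (2017, 6.0e17),
    (2019, 1.5e17),
]

# Least-squares fit of log2(compute) against year.
n = len(records)
xs = [year for year, _ in records]
ys = [math.log2(flop) for _, flop in records]
x_mean, y_mean = sum(xs) / n, sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum((x - x_mean) ** 2 for x in xs)

doubling_time_years = -1.0 / slope  # years until the required compute halves
print(f"Estimated efficiency doubling time: {doubling_time_years:.1f} years")
```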

ML deployment engineers

As the role of compute is significant for AI progress, there is a strong need for ML engineers who can efficiently deploy AI systems. This was also discussed by Olah in 80,000 Hours podcast episode #107. Consequently, ML engineers should consider working at safety-aligned organizations and enabling the deployment of gigantic models which are, ideally, reliable, interpretable, and steerable.

Interdisciplinary research

Economic models, whether based on spending or on the computing industry (such as the semiconductor industry), are an essential component for understanding compute prices and spending. Interdisciplinary research on these questions could be of great benefit. Examples of such work are Thompson et al. (2020) and Thompson and Spanuth (2021).

I plan to work on aspects of this research in the future and would be especially interested in exploring collaboration or other synergies. Please reach out. The exact research questions are still to be determined.

Appendix A lists various research questions that I would be interested in exploring and also want others to explore.

Next Post: Compute Research Questions and Metrics

The appendix "Compute Research Questions and Metrics [4/4]" will attempt to:

  1. Provide a list of connected research questions (Appendix A).
  2. Present common compute metrics and discuss their caveats (Appendix B).
  3. Provide a list of Startups in the AI Hardware domain (Appendix C).

Acknowledgments

You can find the acknowledgments in the summary.

References

The references are listed in the summary.


  1. It seems reasonable and somewhat likely to me that we will regulate and restrict the export of AI hardware even more strictly, and might legally classify it as a weapon, within the next decades. ↩︎

Comments

Really nice work, just got to reading it.

Those groups will become more like high-energy physics or biology research groups where considerable funding is being spent on infrastructure (e.g., equipment and hardware). If this does not happen, publicly funded groups will not be able to compete.

How certain are you about this? Your analogies for extremely costly research are both publicly funded groups, so it wouldn't seem too surprising to me if governments will start opening their pockets for research into what seems to have similar or greater scientific and public "excitement levels" than physics and biology.

I'm still holding the same view that (a) we will probably see a switch in funding distribution and (b) if this does not happen those groups won't be able to compete with SOTA models.

we will and should see a switch in funding distribution at publicly funded AI research groups

I would change my mind if we find more evidence towards algorithmic innovation being a stronger or the significant driver.

Some recent updates in regards to providing more funding for infrastructure include the National AI Research Cloud, which is currently being investigated by the US government, and Compute Canada.

Just realized that I misunderstood the original quote, yes, thanks, this makes total sense. 
