Aman Patel and I have been thinking a lot about the AI Safety Pipeline, and in the course of doing so we haven't come across any visualizations of it. Here's a second draft of a visualization showing which organizations/programs are relevant to moving through the pipeline (mostly in terms of theoretical approaches). It is incomplete: some orgs are missing because we don't know enough about them (e.g., ERIs), the orgs that are included are probably not placed exactly where they should be, and there's tons of variation among participants of each program, so the distributions should really have long tails. The pipeline moves left to right, and we have placed organizations roughly where they seem to fit. The height of each program's bar represents its capacity, i.e., its number of participants.

This is a super rough draft, but it might be useful for some people to see the visualization and recognize where their beliefs/knowledge differs from ours.

We'd be glad to see somebody else put more than an hour into this project and make a nicer/more accurate visual.
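If someone does pick this up, here is a minimal sketch of how the encoding described above (horizontal position = stage of the pipeline, bar height = capacity) could be reproduced with matplotlib. The program placements and capacities below are made-up placeholders for illustration, not our actual estimates.

```python
import matplotlib.pyplot as plt

# (program, rough pipeline position from 0 = learning the basics
#  to 1 = professional research, rough capacity in participants)
# These numbers are illustrative placeholders, not our estimates.
programs = [
    ("AGISF", 0.15, 300),
    ("MLAB", 0.40, 40),
    ("SERI MATS", 0.55, 60),
]

fig, ax = plt.subplots(figsize=(8, 3))
for name, position, capacity in programs:
    # x-position encodes where the program sits on the pipeline;
    # bar height encodes its capacity (number of participants).
    ax.bar(position, capacity, width=0.08, label=name)

ax.set_xlim(0, 1)
ax.set_xticks([0.1, 0.5, 0.9])
ax.set_xticklabels(["Learning the Basics", "Junior-level research",
                    "Professional research"])
ax.set_ylabel("Capacity (participants)")
ax.legend()
plt.tight_layout()
plt.show()
```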

Current version as of 4/11:

[Pipeline visualization image]

Comments

It's misleading to say you're mapping the AIS pipeline but then to illustrate only AIS orgs. Most AIS researchers will do their junior-level research as non-AIS research with a professor at their school, and their professional-level research will usually be a research master's or PhD (or sometimes industry). So you should draw these in as well.

Similar messaging must be widespread, because many of the undergraduates I meet at EAGs have the impression that they are highly likely to have to drop out of school to do AIS. Of course, that can be a solid option if you have an AIS job lined up, or in some number of other cases (the exact boundaries of which are subject to ongoing debate).

Bottom line: omitting academic pathways here and elsewhere will perpetuate a message that is descriptively inaccurate and pushes people in an unorthodox direction whose usefulness is debatable.

Thank you for your comment. Personally, I'm not too bullish on academia, but you make good points as to why it should be included. I've updated the graphic and it now says: "*I don't know very much about academic programs in this space. They seem to vary in their relevance, but it is definitely possible to gain the skills in academia to contribute to professional alignment research. This looks like a good place for further interest: https://futureoflife.org/team/ai-existential-safety-community/"

If you have other ideas you would like expressed in the graphic I am happy to include them!

This is potentially useful, but I think it would be more helpful if you explained what the acronyms refer to (thinking mainly of AGISF, which I haven't heard of, and SERI MATS, which I know about but others may not) and linked to the organizations' webpages so people know where to get more information.

Thanks for the reminder! Will update. Some don't have websites, but I'll link what I can find.

Thanks for making this!

A detail: I wonder if it'd make sense to nudge the SERI summer program bar partway toward "junior level research" (since most participants' time is spent on original research)? My (less informed) impression is that CERI and CHERI also belong around there.

Thanks! Nudged. I'm not going to include CERI and CHERI at the moment because I don't know much about them, but I'll make a note of them.

Thanks for this!

Does "Learning the Basics" specifically mean learning AI Safety basics, or does this also include foundational AI/ML (in general, not just safety) learning? I'm wondering because I'm curious if you mean that the things under "Learning the Basics" could be done with little/no background in ML.

Good question. I think "Learning the Basics" is specific to AI Safety basics and does not require a strong background in AI/ML. My sense is that AI Safety basics and ML are somewhat independent; the ML side of things simply isn't pictured here. For example, MLAB (Machine Learning for Alignment Bootcamp), which ran a few months ago, focused on taking people with good software engineering skills and bringing them up to speed on ML. As far as I can tell, the focus was not on alignment specifically, but the program was intended for people likely to work in alignment. I think the real story is much more complicated than a one-dimensional (plus org size) chart, and the skills needed are probably an intersection of software engineering, ML, and AI Safety basics.
