
The Moonshot Alignment Program is a 5-week, high-intensity research sprint for people ready to tackle the alignment problem head-on. Each week pushes you through a new phase, from designing bold alignment strategies to building and testing them with real, empirical evidence.

What to Expect 

This program is built around four core research tracks, each tackling a crucial angle of the alignment challenge.

  • Agent Foundations Theory focuses on building formal models of agents and understanding how values are formed at a fundamental level.
  • Applied Agent Foundations takes these ideas further by implementing and rigorously testing agent models.
  • Neuroscience-based AI Alignment explores architectures inspired by how the brain encodes and processes values, bringing insights from cognitive science into AI safety.
  • Improved Preference Optimization works on designing oversight methods that embed values deeply into systems while remaining robust and scalable.

For participants with bold and original ideas that do not fit neatly into any of these tracks, we have created an Open Track designed to give you the freedom to explore uncharted territory in alignment research.

Program Details

The program builds on the momentum and lessons from our previous hackathons, which have already produced remarkable results. Past teams have gone on to publish their work, including a paper accepted at ICML, while others have continued collaborating long after the events ended, with one team now developing a new evaluation for alignment faking. This program draws from that same spirit of ambitious, fast-paced research, shaped by insights from the many researchers we have interviewed and learned from. Their guidance has been invaluable in designing an experience that gives participants the best chance to create work that truly moves the field forward.

This program is guided by an exceptional group of mentors who bring both depth of expertise and a track record of advancing alignment research. They include:

  • Abram Demski, widely regarded as the world’s leading researcher on Tiling Agents. 
  • Cole Wyeth, a PhD researcher at the University of Waterloo working on AIXI, supervised by Professor Ming Li and mentored by Marcus Hutter.
  • Peter Trocsanyi, who holds a Master's in Theoretical & Mathematical Physics and is an experienced AI alignment researcher.
  • Leonard Piff, the Intro Fellowship Lead at Cornell AI Alignment and a Member of Technical Staff at AI Plans.

The Application Process

Stage 1: Expression of Interest

Here you will submit your CV, indicate your estimated likelihood of being able to commit at least 10 hours per week starting August 2, and select the tracks you are most interested in. You are also welcome to share anything else relevant that is not captured in your CV. We guarantee personalized feedback to the first 186 applicants.

Stage 2: Knowledge Check

You will complete 15 timed multiple-choice questions tailored to the tracks you selected. For example, if you choose Agent Foundations, your questions may cover topics like theoretical computer science, probability, and decision theory. We will also accept expressions of interest from applicants who would like to serve as team leaders during the program.

Stage 3: Team Formation and Idea Submission

Qualified applicants will gain access to a private Discord server where teams will take shape and ideas will start to form. Each team will choose a research direction and submit a proposal, laying the foundation for their project. To guide this process, we provide clear, track-specific resources that highlight the current state of the field, proven methods, and key bottlenecks identified through conversations with senior researchers. The first 100 proposals will receive detailed feedback, ensuring teams have the insights they need to refine and strengthen their approach. Evaluation will be based not solely on initial ideas, but also on how well teams respond to feedback, adapt their strategies, and continuously improve their work over time.

Demo Day

The program will culminate in a high-energy public poster session and virtual job fair: your chance to put your research on the map. In an interactive conference on GatherTown, each team will have a dedicated space to showcase their breakthroughs, answer tough questions, and defend their methods in front of peers and senior researchers. The best projects will be recognized through votes from leading alignment experts, giving standout teams the spotlight they deserve. Following the poster session, top research organizations, labs, and startups will host booths at the virtual job fair, offering participants direct access to career opportunities, collaborations, and future projects that could shape the next wave of AI alignment research.

Apply here by July 25

This program is hosted by AI Plans. You can apply to join our team here and/or support our work here.

If you have any questions about the program, please reach out to kabir@ai-plans.com
