
TL;DR: We present AI Safety Ideas, a platform for collecting and exploring AI safety research ideas, now in open alpha. Add and explore research ideas on the website here: aisafetyideas.com.

AI Safety Ideas has been accessible in an alpha state for a while (4 months of on-and-off development), and we are now publishing it in open alpha to receive feedback and develop it continuously with the community of researchers and students in AI safety. All of the projects come either from public sources (e.g. AlignmentForum posts) or are posted directly on the website.

The current website represents the first steps towards an accessible crowdsourced research platform for easier research collaboration and hypothesis testing.

The gap in AI safety

Research prioritization & development

Research prioritization is hard, and even more so in a pre-paradigmatic field like AI safety. We can grok the highest-karma posts on the AlignmentForum, but is there another way?

With AI Safety Ideas, we introduce a collaborative way to prioritize and work on specific agendas together through social features. We hope this can become a scalable research platform for AI safety.

Successful examples of similar, though less systematized, collaborative online projects with high-quality output include Discord communities such as EleutherAI, CarperAI, Stability AI, and Yannic Kilcher’s server, as well as hackathons and competitions such as the inverse scaling competition.

Additionally, we are missing empirically driven impact evaluation of AI safety projects. With the next development steps described further down, we hope to make this easier and more available while facilitating more iteration in AI safety research. Systematized hypothesis testing with bounties can help funders directly fund specific results and enable open evaluation of agendas and research projects.

Mid-career & student newcomers

Newcomer participation in AI safety research mostly takes two forms at the moment: 1) active or passive part-time course participation with a capstone project (AGISF, ML Safety), and 2) flying to London or Berkeley for three months of full-time paid study and research (MLAB, SERI MATS, PIBBSS, Refine).

Both are highly valuable, but a third option seems to be missing: 3) an accessible, scalable, low-time-commitment, open research opportunity. Very few people work in AI safety, and enabling decentralized, volunteer- or bounty-driven research will allow many more to contribute to this growing field.

Choices oh choices

By offering this flexible research opportunity, we can attract people who cannot participate in option (2) because of visa issues, school, life, or work commitments, location, rejection, or funding, while also attracting a more senior and active audience than option (1).

Next steps

Oct: Release and build up the user base and crowdsourced content. Create an insider build to test beta features. Apply to join the insider build here.
Nov: Implement hypothesis-testing features: creating hypotheses, linking ideas and hypotheses, and adding negative and positive results to hypotheses. Create an email notification system.
Dec: Collaboration features: contact others interested in the same idea and mentor ideas. A better commenting system with a results comment that can indicate whether the project has been finished, what the results are, and who did it.
Jan: Add moderation features: accepting results, moderating hypotheses, admin users. Add bounty features for the hypotheses and a simple user karma system.
Feb: Share with ML researchers and academics in EleutherAI and CarperAI. Implement the ability to create special pages with private and public ideas curated for a specific purpose (title and description included). This will help integrate with local events, e.g. the Alignment Jams.
Mar: Allow editing and save the editing history of hypotheses and ideas. Get DOIs for reviewed hypothesis result pages. Implement the EigenKarma karma system. Implement automatic auditing by NLP. Monitor progress on different clusters of hypotheses and research ideas (research agendas). Release meta-science research on the projects that have come out of the platform and on the general progress.

 

Risks

  1. Wrong incentives on the AI Safety Ideas platform lead to people working on others’ agendas instead of developing their own inside view.
  2. AI Safety Ideas does not gain traction and, by extension, becomes less useful than expected.
  3. Some users who do alignment research without a deep understanding of why alignment is important discover ideas that could advance AI capabilities, without being worried enough about info hazards to contain them properly.
  4. Project bounties on the AI Safety Ideas platform are taken over by capabilities-first agendas and mislead new researchers.

Risk mitigation

Several of these mitigations are not implemented yet but will be as we develop the platform further.

  1. Ensure that specific agendas do not get special attention compared to others, and implement incentives to work on new or updated projects and hypotheses. Hold structured meetings and feedback sessions with leaders in AI safety field-building, and conduct regular research on how the platform is used.
  2. Do regular, live user interviews and ensure that giving feedback is quick and easy. We have interviewed 18 users so far and have automated feedback monitoring on our server. We will embed a feedback form directly on the website. Evaluate the usefulness of features through an insider build.
  3. Restrict themes to within AI safety and nudge towards safety thinking in our communication. These capable, capabilities-savvy researchers would also pose a risk if they worked independently, and providing this platform might let us pivot their attitudes towards safety.
  4. Ensure vetting of the ideas and users. Make the purpose and policies of the website very clear. Invite admin users based on AlignmentForum karma, with the ability to downvote ideas so they are hidden until further evaluation.

Feedback

Give anonymous feedback on the website here or write your feedback in the comments. If you end up using the website, we also appreciate your in-depth feedback here (2-5 min). If you want any of your ideas removed or rephrased on the website, please send an email to operations@apartresearch.com.

PS: The platform is still very much in alpha and there might be mistakes in the research project descriptions. Please do point out any problems via "Report an issue".

Help out

The platform is open source, and we appreciate any pull requests on the insider branch. Add any bugs or feature requests on the issues page.

Apply to join the insider builds here to give feedback for the next versions. Join our Discord to discuss the development.

Thanks to Plex, Maris Sala, Sabrina Zaki, Nonlinear, Thomas Steinthal, Michael Chen, Aqeel Ali, JJ Hepburn, Nicole Nohemi, and Jamie Bernardi.

Comments



This looks like an interesting platform for sharing ideas about AI safety.

You mention that AI safety is a 'pre-paradigmatic field' -- however, to a newcomer like me, the safety ideas and projects on the AI Safety Ideas site so far look pretty 'paradigmatic', in the sense of closely following the standard EA AGI X-risk paradigm that's centered around the ideas of Yudkowsky, Bostrom, MIRI, utility maximization, instrumental convergence, deep learning, fast takeoff, etc.

I worry that this reflects & encourages a premature convergence onto a governing paradigm that may deter newcomers from contributing new ideas that fall outside the current 'Overton window' of AI alignment. For example: (1) a lot of AI alignment work seems based on a tacit assumption that AI won't become a global catastrophic risk until it reaches AGI level, but I can see reasonable arguments that even quite narrow AI could be severely risky long before AGI is reached. Also, (2) a lot of AI alignment seems focused much more on how to align specific AI systems with specific human users, rather than on how to align human groups that are using AI with other, potentially conflicting, groups that are using AI.

So, I guess the question arises: to what extent do you want AI Safety Ideas to elicit new ideas within the current paradigm, versus new ideas that stray outside the current paradigm?

You raise a very good point that I agree with. Right now, the platform is definitely biased towards the existing paradigm. This will probably be the case during the first few months, but we hope that it will help make the exploration of new directions and paradigms easier at the same time. 

This also raises the point that the current ideas play into the canon of AI safety instead of drawing on the vast literature outside of AI safety that concerns itself with the same topics under another framing.

So, to answer your question: we want AISI to make it easier to elicit new ideas in all paradigms and directions, with our personal bias moving more towards new perspectives as we implement better functionality.

Esben -- thanks very much for your reply. That all makes sense -- to develop a gradual broadening-out from the current paradigm to welcoming new perspectives from other existing research traditions.


Small note: the title made me think the platform is made by the organization Open AI

Same. I suggest "AI Safety Ideas: a collaborative AI safety research platform"

Very true! We have gratefully adopted your formulation. Thank you.

Yeah, I thought this too. 

The application form is showing up as private for me. Very cool idea, though; the success of Eleuther and Stability suggests that this is a viable model. Excited to see it unfold and hopefully contribute!

Thank you! It should be fixed now.

One way of doing automated AI safety research is for AI safety researchers to create AI safety ideas on aisafetyideas.com and then use the titles as prompts for a language model. Here is GPT-3 generating a response to one of the ideas:

Uuh, interesting! Maybe I'll do that as a weekend project for fun. An automatic comment based on the whole idea as a prompt.
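For anyone who wants to try something similar, here is a rough sketch of what such a script could look like, assuming the OpenAI Python client; the model name and the idea text below are illustrative placeholders, not anything taken from the platform:

```python
# Rough sketch: generate a draft comment for an AI safety idea with a language model.
# Assumes the OpenAI Python client (openai>=1.0) is installed and OPENAI_API_KEY is set.
# The idea text and model name are placeholders, not pulled from aisafetyideas.com.
import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Hypothetical idea copied by hand from an idea page.
idea_title = "Example: test whether language models can report their own uncertainty"
idea_description = "Example description of the research idea goes here."

prompt = (
    "The following is a research idea from an AI safety ideas platform.\n"
    f"Title: {idea_title}\n"
    f"Description: {idea_description}\n\n"
    "Write a short comment suggesting concrete first experiments for this idea."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any capable chat model works
    messages=[{"role": "user", "content": prompt}],
    max_tokens=300,
)

print(response.choices[0].message.content)
```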
