For the last 2.5 to 3 years, I have been trying to learn about and gain experience in AI Safety, so that when I finish my PhD in Physics I might be able to contribute to the AI Safety community's efforts. During these years I have noticed that:
- There seems to be a large pool of potential junior researchers, and the community has several programs in place for them, such as the AI Safety Camp or the AI Safety Research Program.
- Funding is growing, but it is still largely concentrated in a handful of places: CHAI, FHI, CSER, institutes not affiliated with universities (e.g. the Center for Long Term Risks), and a few companies (DeepMind, OpenAI, Anthropic). Yet it seems to me that there are still great places out there where research could happen but currently does not.
So, given that the shortage of senior researchers seems to be a bottleneck: what can the community do to get them more interested in this topic? I read somewhere that there are two ways to get people involved:
- Telling them this is an important problem.
- Telling them this is an interesting problem.
I think it may be worth trying a combination of both, e.g.: "hey, this is an interesting problem that could be important for making systems reliable". I really don't think one needs to convince them of longtermism as a prerequisite.
In any case, I wanted to use this question to generate concrete actions that people such as the EA Long-Term Future Fund managers could fund to address this bottleneck. The only similar example I have seen is the "Claudia Shi ($5,000): Organizing a 'Human-Aligned AI' event at NeurIPS" grant listed at https://funds.effectivealtruism.org/funds/payouts/september-2020-long-term-future-fund-grants.
There might also be other ways, but I don't know academic dynamics well enough to say. In any case, I am aware that publishing technical AI Safety work at these conferences does not seem to be an issue, so I believe the bottleneck is getting researchers genuinely interested in the topic.
Academics choose to work on things when they're doable, important, interesting, publishable, and fundable. Importance and interestingness seem to be the least bottlenecked parts of that list.
The root of the problem is the difficulty of evaluating the quality of work. There's no public benchmark for AI safety that people really believe in (nor do I think there can be one yet - AI safety is still pre-paradigmatic), so evaluating the quality of work actually requires trusted experts sitting down and thinking hard about a paper - much harder than just checking whether it beat the state of the art. This difficulty restricts doability, publishability, and fundability. It also makes un-vetted research even less useful to you than it is in other fields.
Perhaps the solution is to produce a lot more experts, but becoming an expert on this "weird" problem takes work - work that is not particularly important or publishable in itself, so working academics aren't going to take a year or two off to do it. At best we could sponsor outreach events/conferences/symposia aimed at giving academics some information and context to make somewhat better evaluations of the quality of AI safety work.
Thus I think we're stuck growing the ranks of experts not slowly per se (we could certainly grow faster), but at least gradually, and then we have to leverage that network of trust both to evaluate academic AI safety work for fundability and publishability, and to inform it so as to improve doability.