IMO it is harmful in expectation for a technical safety researcher to work at DeepMind, OpenAI, or Anthropic.
Four reasons:
- Interactive complexity. Safety researchers face an intractable game of catch-up: they try to invent general methods for AI corporations to somehow safely contain model interactions, while other engineers keep scaling up the models' combinatorial complexity and outside connectivity.
- Safety-capability entanglements
- Commercialisation. Model inspection and alignment techniques can support engineering and productisation of more generally useful automated systems.
- Infohazards. Researching capability risks within an AI lab can inspire researchers who hear about your findings to build new capabilities.
- Shifts under competitive pressure
- DeepMind merged with Google Brain to do commercialisable research, OpenAI set up a company and partnered with Microsoft to release ChatGPT, and Anthropic pitched to investors that it would build a model 10 times more capable.
- If you are an employee at one of these corporations, higher-ups can instruct you to do R&D you never signed up for.[1] You can comply, or get fired.
- Working long hours surrounded by others who, like you, are paid by a for-profit corporation is bad for keeping your bearings and your epistemics on safety.[2]
- Safety-washing. Looking serious about 'safety' helps labs to recruit idealistic capability researchers, lobby politicians, and market to consumers.
- 'let's build AI to superalign AI'
- 'look, pretty visualisations of what's going on inside AI'
This is my view. I want people to engage with these arguments and think for themselves about what would ensure that future AI systems are actually safe.
[1] I heard second-hand that Google managers are forcing DeepMind safety researchers to shift some of their hours to developing Gemini for a product-ready launch. I cannot confirm whether that's correct.
[2] For example, I was in contact with a safety researcher at an AGI lab who kindly offered to read my comprehensive outline on the AGI control problem, to consider whether to share it with colleagues. They also said they were low on energy, and suggested I remind them later. I did, but they never got back to me. They are simply too busy, it seems.
I did read that compilation of advice, and responded to it in an email (16 May 2023):
"Dear [a],
People will drop in and look at job profiles without reading your other materials on the website. I'd suggest just writing a do-your-research cautionary line about OpenAI and Anthropic in the job descriptions itself.
Also suggest reviewing whether to trust advice on whether to take jobs that contribute to capability research.
Totally up to you of course.
Warm regards,
Remmelt"
This is what the article says:
"All that said, we think it’s crucial to take an enormous amount of care before working at an organisation that might be a huge force for harm. Overall, it’s complicated to assess whether it’s good to work at a leading AI lab — and it’ll vary from person to person, and role to role."
So you are saying that people are making a decision about working for an AGI lab that might be (or actually is) a huge force for harm. And that whether it's good (or bad) to work at an AGI lab depends on the person – i.e. people need to figure this out for themselves.
Yet you are openly advertising various jobs at AGI labs on the job board. People are clicking through and applying. Do you know how many read your article beforehand?
~ ~ ~
Even if they did read the article, both the content and the framing of the advice seem misguided. Notice what is emphasised in your considerations.
Here is the first sentence of each consideration section (i.e. what readers are most likely to read, and what you might most want to convey):
"Labs also often don’t have enough staff... to figure out what they should be lobbying governments for (we’d guess that many of the top labs would lobby for things that reduce existential risks)."
~ ~ ~
After that, there is a new section titled "How can you mitigate the downsides of this option?"