IMO it is harmful in expectation for a technical safety researcher to work at DeepMind, OpenAI or Anthropic.
Four reasons:
- Interactive complexity. Safety work cannot catch up – researchers try to invent general methods for AI corporations to somehow safely contain model interactions, while other engineers keep scaling the models' combinatorial complexity and outside connectivity.
- Safety-capability entanglements
- Commercialisation. Model inspection and alignment techniques can support engineering and productisation of more generally useful automated systems.
- Infohazards. Researching capability risks within an AI lab can inspire researchers who hear about your findings to build new capabilities.
- Shifts under competitive pressure
- DeepMind merged with Google Brain to do commercialisable research,
OpenAI set up a company and partnered with Microsoft to release ChatGPT,
Anthropic pitched to investors that they'd build a model 10 times more capable.
- If you are an employee at one of these corporations, higher-ups can instruct you to do R&D you never signed up to do.[1] You can comply, or get fired.
- Working long hours, surrounded by colleagues paid as you are by a for-profit corporation, is bad for maintaining your bearings and your epistemics on safety.[2]
- Safety-washing. Looking serious about 'safety' helps labs recruit idealistic capability researchers, lobby politicians, and market to consumers.
- 'let's build AI to superalign AI'
- 'look, pretty visualisations of what's going on inside AI'
This is my view. I would want people to engage with the different arguments, and think for themselves what ensures that future AI systems are actually safe.
- ^
I heard secondhand that Google managers are forcing DeepMind safety researchers to shift some of their hours to developing Gemini for a product-ready launch. I cannot confirm whether that's correct.
- ^
For example, I was in contact with a safety researcher at an AGI lab who kindly offered to read my comprehensive outline on the AGI control problem, to consider whether to share it with colleagues. They also said they were low on energy. They suggested I remind them later, and I did, but they never got back to me. It seems they're simply too busy.
80,000 Hours handpicks jobs at AGI labs.
Some of those jobs don't even focus on safety – instead they look like policy lobbying roles or engineering support roles.
Nine months ago, I wrote my concerns to 80k staff:
80k removed one of the positions I flagged:
Software Engineer, Full-Stack, Human Data Team (reason given: it looked potentially more capabilities-focused than the original job posting that came into their system).
For the rest, little has changed:
Research Engineer (Product), Prompt Engineer, IT Support, Senior Software Engineer.
Two others in AI Safety also discovered the questionable job listings. They are disappointed in 80k.
Feeling exasperated about this. Thinking of putting out another post just to discuss this issue.
Hi Conor,
Thank you.
I’m glad to see that you already linked to clarifications before. And that you gracefully took the feedback, and removed the prompt engineer role. I feel grateful for your openness here.
It makes me feel less like I’m hitting a brick wall. We can have more of a conversation.
~ ~ ~
The rest is addressed to people on the team, and not to you in particular:
There are grounded reasons why 80k's approaches to recommending work at AGI labs – with the hope of steering their trajectory – have supported AI corporations to scale. While disabling effort...