Should you work at a leading AI lab? (including in non-safety roles)

Benjamin Hilton; 80000_Hours

This post is a (slightly) edited cross-post of a new 80,000 Hours career review on working at a leading AI lab. See LessWrong comments here.

Summary

Working at a leading AI lab is an important career option to consider, but the impact of any given role is complex to assess. It comes with great potential for career growth, and many roles could be (or lead to) highly impactful ways of reducing the chances of an AI-related catastrophe — one of the world’s most pressing problems. However, there’s a risk of doing substantial harm in some cases. There are also roles you should probably avoid.

Pros

Many roles have a high potential for impact by reducing risks from AI
Among the best and most robust ways to gain AI-specific career capital
Possibility of shaping the lab’s approach to governance, security, and standards

Cons

Can be extremely competitive to enter
Risk of contributing to the development of harmful AI systems
Stress and frustration, especially because of a need to carefully and frequently assess whether your role is harmful

Key facts on fit

Excellent understanding of the risks posed by future AI systems, and for some roles, comfort with a lot of quick and morally ambiguous decision making. You’ll also need to be a good fit for the specific role you’re applying for, whether you’re in research, comms, policy, or something else (see our related career reviews).

Overall recommendation: it's complicated

We think there are people in our audience for whom this is their highest impact option — but some of these roles might also be very harmful for some people. This means it's important to take real care figuring out whether you're in a harmful role, and, if not, whether the role is a good fit for you.

Review status

Based on a medium-depth investigation

This review is informed by two surveys of people with expertise about this path — one on whether you should be open to roles that advance AI capabilities (written up here), and a second follow-up survey. We also performed an in-depth investigation into at least one of our key uncertainties concerning this path. Some of our views will be thoroughly researched, though it's likely there are still some gaps in our understanding, as many of these considerations remain highly debated.

Why might it be high-impact to work for a leading AI lab?

We think AI is likely to have transformative effects over the coming decades. We also think that reducing the chances of an AI-related catastrophe is one of the world’s most pressing problems.

So it’s natural to wonder — if you’re thinking about your career — whether it would be worth working in the labs that are doing the most to build, and shape, these future AI systems.

Working at a top AI lab, like Google DeepMind, OpenAI, or Anthropic, might be an excellent way to build career capital to work on reducing AI risk in the future. Their work is extremely relevant to solving this problem, which suggests you’ll likely gain directly useful skills, connections, and credentials (more on this later).

In fact, we suggest working at AI labs in many of our career reviews; it can be a great step in technical AI safety and AI governance and coordination careers. We’ve also looked at working in AI labs in our career reviews on information security, software engineering, data collection for AI alignment, and non-technical roles in AI labs.

What’s more, the importance of these organisations to the development of AI suggests that they could be huge forces for either good or bad (more below). If the former, they might be high-impact places to work. And if the latter, there’s still a chance that by working in a leading lab you may be able to reduce the risks.

All that said, we think it’s crucial to take an enormous amount of care before working at an organisation that might be a huge force for harm. Overall, it’s complicated to assess whether it’s good to work at a leading AI lab — and it’ll vary from person to person, and role to role. But we think this is an important option to consider for many people who want to use their careers to reduce the chances of an existential catastrophe (or other harmful outcomes) resulting from the development of AI.

What relevant considerations are there?

Labs could be a huge force for good — or harm

We think that a leading — but careful — AI project could be a huge force for good, and crucial to preventing an AI-related catastrophe. Such a project could, for example:

Engage in defensive deployment: using early, safe, but powerful AI systems to make the situation safer — for example, by using AI systems to contribute to AI safety research, produce evidence and demonstrations of risks, contribute to information security, and help with monitoring the risks. (Note that whether this will be possible is debated.)
Perform valuable empirical research into making sure that AI systems are safe, using state-of-the-art systems (possibly ones far more advanced than would be available outside a leading AI project)
Put huge effort into designing tests for danger, and credibly warning others if it does see signs of danger in its own systems
Set examples for other projects on governance, security, and adherence to standards.
Coordinate effectively with other AI companies and projects, for example by sharing important safety findings and techniques and possibly, if needed, by acquiring other projects or otherwise gaining visibility, influence, and control with which to prevent the deployment of any dangerous systems
Credibly and effectively lobby the government for helpful measures to reduce the risk

But a leading and uncareful — or just unlucky — AI project could be a huge danger to the world. It could, for example, generate hype and acceleration (which we’d guess is harmful), make it more likely (through hype, open-sourcing or other actions) that incautious players enter the field, normalise disregard for governance, standards and security, and ultimately it could even produce the very systems that cause a catastrophe.

So, in order to successfully be a force for good, a leading AI lab would need to balance continuing its development of powerful AI (and possibly even retaining a leadership position) with appropriately prioritising the things that reduce the risk overall.

This tightrope seems difficult to walk, with constant tradeoffs to make between success and caution. And it seems hard to assess from the outside which labs are doing this well. The top labs — as of 2023, OpenAI, Google DeepMind, and Anthropic — seem reasonably inclined towards safety, and it’s plausible that any or all of these could be successfully walking the tightrope, but we’re not really sure.

We don’t feel confident enough to give concrete recommendations on which of these labs people should or should not work for. We can only really recommend that you put work into forming your own views about whether a company is a force for good. But the fact that labs could be such a huge force for good is part of why we think it’s likely there are many roles at leading AI labs that are among the world’s most impactful positions.

It’s often excellent career capital

Top AI labs are high-performing, rapidly growing organisations. In general, one of the best ways to gain career capital is to go and work with any high-performing team — you can just learn a huge amount about getting stuff done. They also have excellent reputations more widely (AI is one of the world’s most sought-after fields right now, and the top labs are top for a reason). So you get the credential of saying you’ve worked in a leading lab, and you’ll also gain lots of dynamic, impressive connections. So even if we didn’t think the development of AI was a particularly pressing problem, they’d already seem good for career capital.

But you will also learn a huge amount about and make connections within AI in particular, and, in some roles, gain technical skills which could be much harder to learn elsewhere.

We think that, if you’re early in your career, this is probably the biggest effect of working for a leading AI lab, and the career capital is (generally) a more important consideration than the direct impact of the work. You’re probably not going to be having much impact at all, whether for good or for bad, when you’re just getting started.

However, your character is also shaped by the jobs you take and matters a lot for your long-run impact, so it is one of the components of career capital. Some experts we’ve spoken to warn against working at leading AI labs because you should always assume that you are psychologically affected by the environment you work in. That is, there’s a risk you change your mind without ever encountering an argument that you’d currently endorse (for example, you could end up thinking that it’s much less important to ensure that AI systems are safe, purely because that’s the view of people around you). Our impression is that leading labs are increasingly concerned about the risks, which makes this consideration less important, but we still think it should be taken into account in any decision you make. There are ways of mitigating this risk, which we’ll discuss later.

Of course, it’s important to compare working at an AI lab with other ways you might gain career capital. For example, to get into technical AI safety research, you may want to go do a PhD instead. Generally, the best option for career capital will depend on a number of factors, including the path you’re aiming for longer term and your personal fit for the options in front of you.

You might advance AI capabilities, which could be (really) harmful

We’d guess that, all else equal, we’d prefer that progress on AI capabilities was slower.

This is because it seems plausible that we could develop transformative AI fairly soon (potentially in the next few decades). This suggests that we could also build potentially dangerous AI systems fairly soon — and the sooner this occurs the less time society has to successfully mitigate the risks. As a broad rule of thumb, less time to mitigate risks seems likely to mean that the risks are higher overall.

But that’s not necessarily the case. There are reasons to think that advancing at least some kinds of AI capabilities could be beneficial. Here are a few:

This distinction between ‘capabilities’ research and ‘safety’ research is extremely fuzzy, and we have a somewhat poor track record of predicting which areas of research will be beneficial for safety work in the future. This suggests that work that advances some (and perhaps many) kinds of capabilities faster may be useful for reducing risks.
Moving faster could reduce the risk that AI projects that are less cautious than the existing ones can enter the field.
Lots of work that makes models more useful, and so could be classified as capabilities (for example, work to align existing large language models), probably does so without increasing the risk of danger. This kind of work might allow us to use these models to reduce the risk overall, for example, through the kinds of defensive deployment discussed earlier.
It’s possible that the later we develop transformative AI, the faster (and therefore more dangerously) everything will play out, because other currently-constraining factors (like the amount of compute available in the world) could continue to grow independently of technical progress. Slowing down advances now could increase the rate of development in the future, when we’re much closer to being able to build transformative AI systems. This would give the world less time to conduct safety research with models that are very similar to ones we should be concerned about but which aren’t themselves dangerous. (When this is caused by a growth in the amount of compute, it’s often referred to as a hardware overhang.)

Overall, we think not all capabilities research is made equal — and that many roles advancing AI capabilities (especially more junior ones) will not be harmful, and could be beneficial. That said, our best guess is that the broad rule of thumb that there will be less time to mitigate the risks is more important than these other considerations — and as a result, broadly advancing AI capabilities should be regarded overall as probably harmful.

This raises an important question. In our article on whether it’s ever OK to take a harmful job to do more good, we ask whether it might be morally impermissible to do a job that causes serious harm, even if you think it’s a good idea on net.

It’s really unclear to us how jobs that advance AI capabilities fall into the framework proposed in that article.

This is made even more complicated by our view that a leading AI project could be crucial to preventing an AI-related catastrophe — and failing to prevent a catastrophe seems, in many value systems, similarly bad to causing one.

Ultimately, answering the question of moral permissibility is going to depend on ethical considerations about which we’re just hugely uncertain. Our guess is that it’s good for us to sometimes recommend that people work in roles that could harmfully advance AI capabilities — but we could easily change our minds on this.

For another article, we asked the 22 people we thought would be most informed about working in roles that advance AI capabilities — and who we knew had a range of views — to write a summary of their takes on the question: if you want to help prevent an AI-related catastrophe, should you be open to roles that also advance AI capabilities, or steer clear of them? There’s a range of views among the 11 responses we received, which we’ve published here.

You may be able to help labs reduce risks

As far as we can tell, there are many roles at leading AI labs where the primary effects of the roles could be to reduce risks.

Most obviously, these include research and engineering roles focused on AI safety. Labs also often don’t have enough staff in relevant teams to develop and implement good internal policies (like on evaluating and red-teaming their models and wider activity), or to figure out what they should be lobbying governments for (we’d guess that many of the top labs would lobby for things that reduce existential risks). We’re also particularly excited about people working in information security at labs to reduce risks of theft and misuse.

Beyond the direct impact of your role, you may be able to help guide internal culture in a more risk-sensitive direction. You probably won’t be able to influence many specific decisions, unless you’re very senior (or have the potential to become very senior), but if you’re a good employee you can just generally become part of the ‘conscience’ of an organisation. Just like anyone working at a powerful institution, you can also — if you see something really harmful occurring — consider organising internal complaints, whistleblowing, or even resigning. Finally, you could help foster good, cooperative working relationships with other labs as well as the public.

To do this well, you’d need the sorts of social skills that let you climb the organisational ladder and bring people round to your point of view. We’d also guess that you should spend almost all of your work time focused on doing your job well; criticism is usually far more powerful coming from a high performer.

There’s a risk that doing this badly could accidentally cause harm, for example, by making people think that arguments for caution are unconvincing.

How can you mitigate the downsides of this option?

There are a few things you can do to mitigate the downsides of taking a role in a leading AI lab:

Don’t work in certain positions unless you feel awesome about the lab being a force for good. This includes some technical work, like work that improves the efficiency of training very large models, whether via architectural improvements, optimiser improvements, improved reduced-precision training, or improved hardware. We’d also guess that roles in marketing, commercialisation, and fundraising tend to contribute to hype and acceleration, and so are somewhat likely to be harmful.
Think carefully, and take action if you need to. Take the time to think carefully about the work you’re doing, and how it’ll be disclosed outside the lab. For example, will publishing your research lead to harmful hype and acceleration? Who should have access to any models that you build? Be an employee who pays attention to the actions of the company you’re working for, and speaks up when you’re unhappy or uncomfortable.
Consult others. Don’t be a unilateralist. It’s worth discussing any role in advance with others. We can give you 1-1 advice, for free. If you know anyone working in the area who’s concerned about the risks, discuss your options with them. You may be able to meet people through our community, and our advisors can also help you make connections with people who can give you more nuanced and personalised advice.
Continue to engage with the broader safety community. To reduce the chance that your opinions or values will drift just because of the people you’re socialising with, try to find a way to spend time with people who more closely share your values. For example, if you’re a researcher or engineer, you may be able to spend some of your working time with a safety-focused research group.
Be ready to switch. Avoid being in a financial or psychological situation where it’s just going to be really hard for you to switch jobs into something more exclusively focused on doing good. Instead, constantly ask yourself whether you’d be able to make that switch, and whether you’re making decisions that could make it harder to do so in the future.

How to predict your fit in advance

In general, we think you’ll be a better fit for working at an AI lab if you have an excellent understanding of risks from AI. If the positive impact of your role comes from being able to persuade others to make better decisions, you’ll also need very good social skills. You’ll probably have a better time if you’re pragmatic and comfortable with making decisions that can, at times, be difficult, time-pressured, and morally ambiguous.

While a career in a leading AI lab can be rewarding and high impact for some, it’s not suitable for everyone. People who should probably not work at an AI lab include:

People who can’t follow tight security practices: AI labs often deal with sensitive information that needs to be handled responsibly.
People who aren’t able to keep their options open — that is, they aren’t (for a number of possible reasons) financially or psychologically prepared to leave if it starts to seem like the right idea. (In general, whatever your career path, we think it’s worth trying to build at least 6-12 months of financial runway.)
People who are more sensitive than average to incentives and social pressure: you’re just more likely to do things you wouldn’t currently endorse.

More specifically than that, predicting your fit will depend on the exact career path you’re following, and for that you can check out our other related career reviews.

How to enter

Some labs have internships (e.g. at Google DeepMind) or residency programmes (e.g. at OpenAI) — but the path to entering a leading AI lab can depend substantially on the specific role you’re interested in. So we’d suggest you look at our other career reviews for more detail, as well as plenty of practical advice.

Recommended organisations

We’re really not sure. It seems like OpenAI, Google DeepMind, and Anthropic are currently taking existential risk more seriously than other labs. Some people we spoke to have strong opinions about which of these is best, but they disagree with each other substantially.

Big tech companies like Apple, Microsoft, Meta, Amazon, and NVIDIA — which have the resources to potentially become rising stars in AI — are also worth considering, as there’s a need for more people in these companies who care about AI safety and ethics. Relatedly, plenty of startups can be good places to gain career capital, especially if they’re not advancing dangerous capabilities. However, the absence of teams focused on existential safety means that we’d guess these are worse choices for most of our readers.

Learn more

Learn more about making career decisions where there’s a risk of harm:

Relevant career reviews (for more specific and practical advice):

If you think you might be a good fit for this path and you’re ready to start looking at job opportunities that are currently accepting applications, see our list of opportunities for this path.

Want one-on-one advice?

If you think working at a leading AI lab might be a great option for you, but you need help deciding or thinking about what to do next, the 80,000 Hours team might be able to help.

We can help you compare options, make connections, and possibly even help you find jobs or funding opportunities.

Apply to speak to the 80,000 Hours team here.

The linked article is by Holden Karnofsky. Karnofsky co-founded Open Philanthropy, 80,000 Hours’ largest funder.

Yonatan Cale · Jul 27 2023 · 8

TL;DR: "which lab" seems important, no?

You wrote:

"Don’t work in certain positions unless you feel awesome about the lab being a force for good."

First of all I agree, thumbs up from me! 🙌

But you also wrote:

"Recommended organisations: We’re really not sure. It seems like OpenAI, Google DeepMind, and Anthropic are currently taking existential risk more seriously than other labs."

I assume you don't recommend people go work for whatever lab "currently [seems like they're] taking existential risk more seriously than other labs"?

Do you have further recommendations on how to pick a lab?

(Do you agree this is a really important part of an AI-Safety-Career plan, or does it seem sort-of-secondary to you?)

I'm asking in the context of an engineer considering working on capabilities (and if they're building skill - they might ask themselves "what am I going to use this skill for", which I think is a good question). Also, I noticed you wrote "broadly advancing AI capabilities should be regarded overall as probably harmful", which I agree with, and seems to make this question even more important.

Yonatan Cale · Jul 27 2023 · 6

For transparency: I'd personally encourage 80k to be more opinionated here, I think you're well positioned and have relevant abilities and respect and critical-mass-of-engineers-and-orgs. Or at least as a fallback (if you're not confident in being opinionated) - I think you're well positioned to make a high quality discussion about it, but that's a long story and maybe off topic.

Benjamin Hilton · Jul 28 2023 · 3

I don't currently have a confident view on this beyond "We’re really not sure. It seems like OpenAI, Google DeepMind, and Anthropic are currently taking existential risk more seriously than other labs."

But I agree that if we could reach a confident position here (or even just a confident list of considerations), that would be useful for people — so thanks, this is a helpful suggestion!

Greg_Colbourn ⏸️ · Jul 27 2023 · 7

I think at this point, we are not far off this being

"Should you work at a leading oil company? (including in non-renewables roles)".

Or even

"Hans Bethe has just calculated that the chance of the first A-bomb test igniting the atmosphere is 10%; should you work at the Manhattan Project? (including in non-shutting-it-down roles)".

EA has already contributed massively to the safety-washing of the big AI companies (not to mention kicking off and accelerating the race toward AGI in the first place!) I think EAs should be focusing more on applying external pressure now. There are ways to have higher leverage on existential safety by joining (not yet captured) AI governance, lobbying and public campaigning efforts.

Benjamin Hilton · Jul 28 2023 · 6

Thanks, this is an interesting heuristic, but I think I don't find it as valuable as you do.

First, while I do think it'd probably be harmful in expectation to work at leading oil companies / at the Manhattan project, I'm not confident in that view — I just haven't thought about this very much.

Second, I think that AI labs are in a pretty different reference class from oil companies and the development of nuclear weapons.

Why? Roughly:

Whether, in a broad sense, capabilities advances are good or bad is pretty unclear. (Note some capabilities advances in particular areas are very clearly harmful.) In comparison, I do think that, in a broad sense, the development of nuclear weapons, and the release of greenhouse gases are harmful.
Unlike with oil companies and the Manhattan Project, I think that there's a good chance that a leading, careful AI project could be a huge force for good, substantially reducing existential risk — and so it seems weird not to consider working at what could be one of the world's most (positively) impactful organisations. Of course, you should also consider the chance that the organisation could be one of the world's most negatively impactful organisations.

Because these issues are difficult and we don’t think we have all the answers, I also published a range of opinions about a related question in our anonymous advice series. Some of the respondents took a very sceptical view of any work that advances capabilities, but others disagreed.

Greg_Colbourn ⏸️ · Aug 1 2023 · 3

"I think that there's a good chance that a leading, careful AI project could be a huge force for good, substantially reducing existential risk"

I think the burden of proof should be on the big AI companies to show that this is actually a possibility. Because right now, the technology, as based on the current paradigm, looks like it's fundamentally uncontrollable.

Yonatan Cale · Aug 3 2023 · 1

TL;DR: I don't like talking about "burden of proof"

I prefer talking about "priors".

Seems like you ( @Greg_Colbourn ) have priors that AI labs will cause damage, and I'd assume @Benjamin Hilton would agree with that?

I also guess you both have priors that ~random (average) capabilities research will be net negative?

If so, I suggest we should ask if the AI lab (or the specific capabilities research) has overcome that prior somehow.

wdyt?

Greg_Colbourn ⏸️ · Aug 4 2023 · 4

I don't think any of the big AI labs have overcome that prior, but I also have the prior that their safety plans don't even make sense theoretically - hence the "burden of proof" is on them to show that it is possible to align the kind of AI they are building. Another thing pointing in the opposite direction.

Yonatan Cale · Aug 5 2023 · 2

Whoever downvoted this, I'd really prefer if you tell me why

You can do it anonymously:

https://docs.google.com/forms/d/e/1FAIpQLSca6NOTbFMU9BBQBYHecUfjPsxhGbzzlFO5BNNR1AIXZjpvcw/viewform

Yonatan Cale · Jul 26 2023 · 3

What do you think about the effect of many people (EAs) joining top AI labs - on the race dynamics between those labs?

Hard for me to make up my mind here

Adding [edit] :

This seems especially important as you're advising many people to consider entering the field, where one of the reasons to do it is "Moving faster could reduce the risk that AI projects that are less cautious than the existing ones can enter the field." (but you're sending people to many different orgs).

In other words: It seems maybe negative to encourage many people to enter a race, on many different competing "teams", if you want the entire field to move slowly, no?

When I talk to people, I sometimes explicitly say that this is a way of thinking that I hope most people WON'T use.

Hey, there is a common plan I hear that maybe you'd like to respond to directly.

It goes something like this: "I'll go work at a top AI lab as an engineer, build technical skills, and I care about safety so I can push a bit towards safe decisions, or push a lot if it's important, overall it seems good to have people there who care about safety like me. I don't have a good understanding of how to do alignment but there are some people I trust"

If you're willing to reply to this, I'll probably refer people directly to your answer sometimes

Benjamin Hilton · Jul 26 2023 · 5

Hi Yonatan,

I think that for many people (but not everyone) and for many roles they might work in (but not all roles), this is a reasonable plan.

Most importantly, I think it's true that working at a top AI lab as an engineer is one of the best ways to build technical skills (see the section above on "it's often excellent career capital").

I'm more sceptical about the ability to push towards safe decisions (see the section above on "you may be able to help labs reduce risks").

The right answer here depends a lot on the specific role. I think it's important to remember that not all AI capabilities work is necessarily harmful (see the section above on "you might advance AI capabilities, which could be (really) harmful"), and that top AI labs could be some of the most positive-impact organisations in the world (see the section above on "labs could be a huge force for good - or harm"). On the other hand, there are roles that seem harmful to me (see "how can you mitigate the downsides of this option").

I'm not sure of the relevance of "having a good understanding of how to do alignment" to your question. I'd guess that lots of knowing "how to do alignment" is being very good at ML engineering or ML research in general, and that working at a top AI lab is one of the best ways to learn those skills.

Yonatan Cale · Jul 26 2023 · 2

Hi! Thanks for your answer. TL;DR: I understand and don't have further questions on this point

What I mean by "having a good understanding of how to do alignment" is "being opinionated about (and learning to notice) which directions make sense, as opposed to only applying one's engineering skills towards someone else's plan".

I think this is important if someone wants to affect the situation from inside, because the alternative is something like "trust authority".

But it sounds like you don't count on "the ability to push towards safe decisions" anyway
