If you work for a frontier AI company, either because you think they care about saving the world or, especially, because you think you will be the one to influence them, you are deluded. Wake up and quit.
If you care about protecting the world, you will quit, even though it will be hard to give up the money, the prestige, and the hope that they will fix the problem. The actual path to reducing AI risk is not as glamorous or as clear at this point as following the instructions of a wealthy and well-organized corporation, but at least you will be going in the right direction.
The early 80k-style advice to work at an AI lab was mainly about making technical discoveries for safety that e.g. academia didn't have the resources for. When the labs were small, it also made some sense to try to influence the industry's culture. Now, this advice is crazy: there is no way one EA joining a 1,000-person company, with duties to its investors and locked in a death race, is going to "influence" it. The influence goes entirely the other way. If you weren't frogboiled, you would never have selected this path for influence.
There's a lot more to say on this, but I think this is the crux. Your chance for positive marginal impact on AI Safety is not with the labs. If you work for the labs, you're probably just a henchman for a supervillain megaproject, and you can have some positive counterfactual impact right now by quitting. Don't sell out.
I think there was a time when it seemed like a good idea, back when the companies were small and there was more of a chance of setting their standards and culture. Back in 2016, I thought that on balance we should try to place Safety people at OpenAI, for instance. OpenAI was supposed to be explicitly Safety-oriented, but even at any other company, stocking the safety division with Safety people seemed like it might pay off.
I think everything had clearly changed by the ChatGPT moment. The companies had a successful paradigm for making the models, the product was extremely valuable, and the race was very clearly on. At this time, EAs still believed that OpenAI and Anthropic were on their side, because those companies had Safety teams (including many EAs) and talked a lot about Safety, even claiming to be developing AGI for the sake of Safety. But any real influence EA employees had to push for safe choices that weren't also good for those companies' mission was already lost by this point, imo.
The ensuing two years proved that the Safety teams at OpenAI were expendable. Sam Altman has used up and thrown away EA, and he no longer feels any need to pretend OpenAI cares about Safety, despite having very fluently talked the talk for years before. He was happy to use the EA board members, and the entire movement, as scapegoats.
Anthropic is showing signs of going the same way. They do Safety research, but nothing stops them from developing further, not even their former promises not to advance the frontier. The main thing they do is develop bigger and bigger models. They want to be attractive to natsec, and whether or not the actual decision-makers at the top believe their agenda serves Safety, the outcome clearly doesn't hinge on the marginal Safety hire or their research results. Other AI companies don't even particularly claim to care about Safety.
So, I do not think it is effective to work at these places. But the real harm is that working for AI labs keeps EAs from speaking out about AI danger: because they are under NDA, because they want to stay hireable by a lab, because they want to cooperate with people working at labs, or because they defer to their friends and general social environment and so come to believe the labs (at least Anthropic) are good. Imo this price is unacceptably high, and EAs would have far more of the impact they hoped to get from being "in the room" at labs by speaking out and contributing to real external pressure and regulation.