Hide table of contents

We examine what factors predicted advancement in our engineering hiring round. We show two trends which seem common in EA hiring[1]: first, candidates with substantial experience (including at prestigious employers) were often unsuccessful, and second, candidates with limited experience and/or limited formal education were sometimes successful.

We sometimes hear of people being hesitant to apply to jobs out of a fear that they are hard to get. This post gives quantitative evidence that people can receive EA job offers even if their seemingly more qualified peers are rejected (and, indeed, traditional qualifications are almost uncorrelated with getting an offer).

In summary:

  • None of the factors we looked at were statistically significant.
  • Having previously worked at a Big Tech “FAANG” company was the only factor which had a consistently positive central estimate, although with confidence intervals that comfortably included both positive and negative effect sizes.
  • Years of experience, typographical errors, and the level of university qualification seemed to have little predictive power.

This builds on our previous post which found that participation in EA had limited ability to predict success in our hiring round.

Context

There were 85 applicants for the role. The success rates for candidates in each stage are shown below. Some candidates voluntarily withdrew between the screening interview and trial task, hence there are fewer people taking part in the trial task than passed the interview.        

StageNumber participatingNumber passingSuccess rate
Initial application sift854857%
Screening interview484594%
Trial task35823%

After the recruitment process was completed, we aggregated information about each applicant using the CVs and LinkedIn profiles they provided with their application. The metrics we were interested in were[2]:

  • Did any previous role include the word “senior” in its title?
  • Did any previous role include the word “manager” in its title?
  • How many years of experience did the candidate have?
  • How many typos were in the application?
  • What was the highest degree obtained by the candidate?
  • We coded this as: 1 for a bachelor-level degree, 2 for a master-level degree, and 3 for a doctoral degree.
  • Has the applicant ever worked at a FAANG company?

This is not rigorous analysis; a “proper” model would include as many explanatory factors as possible, and the factors should be independent. This is reflected in the eventual predictive power of the models.

Findings

We fitted logistic regression models to the data; with the dependent variable being whether a candidate passed a given stage, and the independent variables being the factors listed above.

We then calculated modelled odds ratios and probabilities associated with each "predictor". The results of this are shown below, with a data table in the appendix.

  • Binary variables include examples like “Has senior in title” or “Has FAANG company”. The odds ratio indicates how much more likely it is that people who were successful were “exposed” to that variable than not.
  • Continuous variables include examples like “Years of experience” or “Highest degree”. The odds ratio indicates how much more likely it is that people who were successful were “exposed” to one unit increase in the variable than not.

Predictors for passing an initial sift

This model predicts whether all submitted applicants (N=85) would pass an initial sift and be invited to the screening interview, with sensitivity 56% and specificity 76%.

Predictors for passing screening interview

This model predicts whether all invited applicants (N=85) would pass the screening interview, with sensitivity 56% and specificity 70%.

Predictors for passing trial task

This model predicts whether applicants who did not withdraw prior to this point (N=83) would pass the trial task, with sensitivity 88% and specificity 58%.

Commentary from Ben

Discourse about EA hiring is sometimes simplified to "EA jobs are hard to get" (and therefore you shouldn't bother applying unless you are very qualified) or "there is a big talent gap" (and therefore everyone should apply).

This post gives evidence that “hard versus easy” isn’t really the right axis: it's hard to get a job (in the sense that well-qualified applicants were rejected) but also easy (in the sense that applicants with limited qualifications were accepted). 

"When in doubt, just apply" continues to seem like good advice to me.

From the hiring manager’s perspective: This builds on our previous post which found that participation in EA had limited ability to predict success in our hiring rounds. Together, these posts make me pessimistic that simple automated screening criteria like “you need X years of experience” will be useful.

Appendix: Summary of modelled parameters

 

Passing initial sift

(N=85)

Passing screening interview

(N=85)

Passing trial task

(N=75)

PredictorsOdds RatiospOdds RatiospOdds Ratiosp
(Intercept)

1.06

(0.43 – 2.63)

0.893

1.10

(0.45 – 2.74)

0.829

0.36

(0.08 – 1.29)

0.135

years of experience

mean=9.4

1.01

(0.95 – 1.08)

0.781

1.01

(0.95 – 1.08)

0.795

0.90

(0.74 – 1.03)

0.181

has senior in title

n=17

2.23

(0.70 – 7.99)

0.189

2.63

(0.82 – 9.53)

0.116

0.81

(0.04 – 7.19)

0.860

has manager in title

n=14

0.50

(0.14 – 1.67)

0.263

0.44

(0.12 – 1.48)

0.197

0.90

(0.04 – 7.09)

0.925

number of typos

mean=0.9

0.93

(0.66 – 1.31)

0.648

0.99

(0.70 – 1.43)

0.968

0.90

(0.41 – 1.50)

0.738

has faang company

n=7

1.92

(0.37 – 14.55)

0.464

2.44

(0.46 – 18.88)

0.325

2.04

(0.09 – 20.83)

0.575

highest degree

mean=1.0

1.10

(0.52 – 2.36)

0.807

0.85

(0.39 – 1.80)

0.667

0.82

(0.19 – 2.85)

0.765

 

 

 

  1. ^

     They seem common in the authors’ experience; we would appreciate feedback  in the comments from other hiring managers about their own experience.

  2. ^

    We collected other factors but ultimately chose to exclude them from the analysis:
    - University rankings - we could not obtain these for enough candidates, which reduced the sample size considered and affected their accuracy.
    - Likely salaries in the candidate’s previous position - we used online sources to estimate the typical salary for the candidate’s most recent position and company, but again could not obtain this for enough candidates.
    - Whether candidates had worked in a company with more than 1000 employees - we excluded this in favour of looking at whether candidates had worked at a FAANG company; it was not possible to include both since the variables are not independent.

Comments19


Sorted by Click to highlight new comments since:

Assets aren't showing up:

Images should be fixed now, thanks for pointing this out.

Yep, images are broken. My guess is the document was copy-pasted from a Google Doc, with the images hosted in a way that isn't publicly accessible.

Thanks for this post! I'll be interested in data from CEA hiring overall, even with the obvious caveat that hiring across different roles will require different skillsets and experiences.

Thanks! In case you haven't already seen that: this post is part of a sequence about EA hiring; other posts have information about hiring different roles.

This was a great write up, interesting topic, informational and easy to follow.

One question I had is if below were the only words you were looking for in a CV and why so. For example, you did not list "Lead", which I'd think is frequently used for engineering roles.
I'm assuming either these were just examples (so not a complete list), or applicants only used these 2 terms?

Did any previous role include the word “senior” in its title? 
Did any previous role include the word “manager” in its title?

Thanks! Yeah, maybe we should also have looked for "lead" but we didn't. No strong reason why this is a bad idea, I just didn't think of it.

So uh you guys/girls have n=7 samples of people in this FAANG group, and you're using this to get coefficients for one of the regressions. Then for the next regression for the FAANG people making it a cut further, you probably only have 3 observations that regression?

 

So I think the norm here is to show "summary stats" style of data, e.g. a table that says "For the FAANG applicants, of these 7 made it). I think this table would be better. 

Basically, a regression model doesn't add a lot, with this level of data. 

 

Also, at this extremely low amount of data, I'm unsure, but there might be weird "degree of freedom" sort of things, where due to an interaction, the signs/magnitudes explode/implode.

 

Can you share your code for the regressions that made this table?

Basically, a regression model doesn't add a lot, with this level of data

Yes, I agree that this is the conclusion of the piece, but I feel like you are implying that this means the methodology was flawed?

We aren't trying to do some broad scientific analysis, we are just practically trying to identify ways that we can speed up our hiring process. And given that we do, in practice, have a relatively small number of people applying to each round, we are (apparently) not able to use automated methods to identify the most promising candidates with high accuracy.

(Maybe my stats/prob/econometrics is rusty, feel free to stomp this comment)

Yeah, you guys have a 94% pass rate for one dataset you use in one regression.

So you could only be getting any inference from the literally 3 people who failed for the screening interview.

So, like, in a logical, "Shannon information sense", that is all the info you have to go with, to get magnitudes and statistical power, for that particular regression. Right?

So how are you getting a whole column of coefficients for it? 

 

No, "This model predicts whether all invited applicants (N=85) would pass the screening interview." So it's 45/85.

Yes, understood, thanks, I was just confused.

94% pass rate

Also, it does seem that, at least ex post, they might benefit from raising the bar a bit on this round. 

Yeah, the point of the screening interview is mostly for the candidate to ask questions. I endorse the belief that we should be measuring programmers through programming tests instead of interviews (i.e. the pass rate of the screening interview should be very high), but I go back and forth on whether the screening interview should come first or second.

Yes, raising the bar would make the interviews more useful. This is a good thought that makes a lot of sense to me. 

I think what you said makes sense and is logical. 

 

Since I'm far away and uninformed, I think I'm more reluctant to say anything about the process and there could be other explanations.

For example, maybe Ben or his team wanted to meet with many applicants because he/they viewed them highly and cared about their EA activities beyond CEA, and this interview had a lot of value, like a sort of general 1on1.

The "vision" for the hiring process might be different. For example, maybe Ben's view was to pass anyone who met resume screening. For the interview, maybe he just wanted to use it to make candidates feel there was appropriate interest from CEA, before asking them to invest in a vigorous trial exercise.

Ben seems to think hard about issues of recruiting and exclusivity, and has used these two posts to express and show a lot of investment in making things fair.

- Whether candidates had worked in a company with more than 1000 employees - we excluded this in favour of looking at whether candidates had worked at a FAANG company; it was not possible to include both since the variables are not independent.

I'm confused, why can't you include two predictors if they are not independent? I'm assuming that with "independent" you mean correlation 0, if you instead mean no collinearity, i.e., linearly independent vectors of predictors, then feel free to ignore my comment.

Am I reading correctly that you made an offer to 8 developers and had 85 applicants?

So a 9% offer rate? That seems very high, am I missing something?

There is an additional on-site after this, and some people withdrew. We ended up making three offers from this round.

To highlight (from this comment and reply) the hire rate for this position was 3.5%

Curated and popular this week
 ·  · 22m read
 · 
The cause prioritization landscape in EA is changing. Prominent groups have shut down, others have been founded, and everyone’s trying to figure out how to prepare for AI. This is the third in a series of posts critically examining the state of cause prioritization and strategies for moving forward. Executive Summary * An increasingly common argument is that we should prioritize work in AI over work in other cause areas (e.g. farmed animal welfare, reducing nuclear risks) because the impending AI revolution undermines the value of working in those other areas. * We consider three versions of the argument: * Aligned superintelligent AI will solve many of the problems that we currently face in other cause areas. * Misaligned AI will be so disastrous that none of the existing problems will matter because we’ll all be dead or worse. * AI will be so disruptive that our current theories of change will all be obsolete, so the best thing to do is wait, build resources, and reformulate plans until after the AI revolution. * We identify some key cruxes of these arguments, and present reasons to be skeptical of them. A more direct case needs to be made for these cruxes before we rely on them in making important cause prioritization decisions. * Even on short timelines, the AI transition may be a protracted and patchy process, leaving many opportunities to act on longer timelines. * Work in other cause areas will often make essential contributions to the AI transition going well. * Projects that require cultural, social, and legal changes for success, and projects where opposing sides will both benefit from AI, will be more resistant to being solved by AI. * Many of the reasons why AI might undermine projects in other cause areas (e.g. its unpredictable and destabilizing effects) would seem to undermine lots of work on AI as well. * While an impending AI revolution should affect how we approach and prioritize non-AI (and AI) projects, doing this wisel
 ·  · 4m read
 · 
TLDR When we look across all jobs globally, many of us in the EA community occupy positions that would rank in the 99.9th percentile or higher by our own preferences within jobs that we could plausibly get.[1] Whether you work at an EA-aligned organization, hold a high-impact role elsewhere, or have a well-compensated position which allows you to make significant high effectiveness donations, your job situation is likely extraordinarily fortunate and high impact by global standards. This career conversations week, it's worth reflecting on this and considering how we can make the most of these opportunities. Intro I think job choice is one of the great advantages of development. Before the industrial revolution, nearly everyone had to be a hunter-gatherer or a farmer, and they typically didn’t get a choice between those. Now there is typically some choice in low income countries, and typically a lot of choice in high income countries. This already suggests that having a job in your preferred field puts you in a high percentile of job choice. But for many in the EA community, the situation is even more fortunate. The Mathematics of Job Preference If you work at an EA-aligned organization and that is your top preference, you occupy an extraordinarily rare position. There are perhaps a few thousand such positions globally, out of the world's several billion jobs. Simple division suggests this puts you in roughly the 99.9999th percentile of job preference. Even if you don't work directly for an EA organization but have secured: * A job allowing significant donations * A position with direct positive impact aligned with your values * Work that combines your skills, interests, and preferred location You likely still occupy a position in the 99.9th percentile or higher of global job preference matching. Even without the impact perspective, if you are working in your preferred field and preferred country, that may put you in the 99.9th percentile of job preference
 ·  · 6m read
 · 
I am writing this to reflect on my experience interning with the Fish Welfare Initiative, and to provide my thoughts on why more students looking to build EA experience should do something similar.  Back in October, I cold-emailed the Fish Welfare Initiative (FWI) with my resume and a short cover letter expressing interest in an unpaid in-person internship in the summer of 2025. I figured I had a better chance of getting an internship by building my own door than competing with hundreds of others to squeeze through an existing door, and the opportunity to travel to India carried strong appeal. Haven, the Executive Director of FWI, set up a call with me that mostly consisted of him listing all the challenges of living in rural India — 110° F temperatures, electricity outages, lack of entertainment… When I didn’t seem deterred, he offered me an internship.  I stayed with FWI for one month. By rotating through the different teams, I completed a wide range of tasks:  * Made ~20 visits to fish farms * Wrote a recommendation on next steps for FWI’s stunning project * Conducted data analysis in Python on the efficacy of the Alliance for Responsible Aquaculture’s corrective actions * Received training in water quality testing methods * Created charts in Tableau for a webinar presentation * Brainstormed and implemented office improvements  I wasn’t able to drive myself around in India, so I rode on the back of a coworker’s motorbike to commute. FWI provided me with my own bedroom in a company-owned flat. Sometimes Haven and I would cook together at the residence, talking for hours over a chopping board and our metal plates about war, family, or effective altruism. Other times I would eat at restaurants or street food booths with my Indian coworkers. Excluding flights, I spent less than $100 USD in total. I covered all costs, including international transportation, through the Summer in South Asia Fellowship, which provides funding for University of Michigan under