Announcing Athena - Women in AI Alignment Research

Claire Short

110

I'm really excited about this! :)

One further thought on pitching Athena: I think there is an additional, simpler, and possibly less contentious argument about why increasing diversity is valuable for AI safety research, which is basically "we need everyone we can get". If a large percentage of relevant people don't feel as welcome/able to work on AI safety because of, e.g., their gender, then that is a big problem. Moreover, it is a big problem even if one doesn't care about diversity intrinsically, or even if one is sceptical of the benefits of more diverse research teams.

To be clear, I think we should care about diversity intrinsically, but the argument above nicely sidesteps replies of the form "yes, diversity is important, but we need to prioritise reducing AI x-risk above that, and you haven't given me a detailed story for how diversity in-and-of-itself helps AI x-risk, e.g., one's gender does not, prima facie, seem very relevant to one's ability to conduct AI safety research". This also isn't to dispute any of your reasons in the post, by the way, merely to add to them :)

Larks

2y

92

Thanks for sharing these studies explaining why you are doing this. Unfortunately, in general I am very skeptical of the sort of studies you are referencing. The researchers typically have a clear agenda - they know what conclusions they want to come to ahead of time, and what conclusions will be most advantageous to their career - and the statistical rigour is often lacking, with small sample sizes, lack of pre-registration, p-hacking, and other issues. I took a closer look at the four sources you referenced to see if these issues applied.

When more women participate in traditionally male-dominated fields like the sciences, the breadth of knowledge in that area usually grows, a surge in female involvement directly correlates with advancements in understanding[1]. [emphasis added]

The link you provide here, to a 2014 article in National Geographic, has a lot of examples of cases where male researchers supposedly overlooked the needs of women (e.g. not adequately studying how women's biology affects how drugs and seat belts should work, or the importance of cleaning houses), and suggests that an increasing number of female scientists helped address this. But female scientists being better at understanding women seems less relevant to AI technical alignment work, because AIs are not female or male. Maybe it is useful for understanding what distinctly female values we want AIs to promote, but it doesn't seem particularly relevant for things like Interpretability or most other current research agendas. The article also suggests that women are more communal and emotionally aware, vs men who are more agentic. But it doesn't really make any claims about overall levels of understanding 'directly correlating' with female involvement, especially in more abstract, less biological fields, and the word 'correlate' literally does not appear in the text.

Cox & Fisher (2008) found that women in a single-sex environment in a software engineering course reported higher levels of enjoyment, fairness, motivation, support, and comfort and allowed them to perform at a level that exceeded that of the all-male groups in the class [1].

The first paper describes an n=7 study of a female group project, which apparently scored more highly than other group projects run by men. The study was not pre-registered, blinded or randomised, the researcher was an active participant, and there was no control. The author also obliquely references the need to avoid 'rigid marking schemes' if these might reveal the all-female group performing worse, which suggests a bias to me.

Kahveci (2008) explored a program for women in science, mathematics, and engineering and found that it helped marginalized women move towards legitimate participation in these fields and enhanced a sense of community and mutual engagement [2].

The second paper describes an n=74 study of a women-in-science program, where the positive result is basically that the participants gave positive reviews to the program and said it made them more likely to do science. The study was not pre-registered, blinded or randomised, the researcher was an active participant, and there was no control. The only concrete example provided of a student switching major was from Biology to Exercise Physiology, which seems like a move away from core science.

“It is not about men against women, but there is evidence to show through research that when you have more women in public decision-making, you get policies that benefit women, children and families in general. When women are in sufficient numbers in parliaments they promote women’s rights legislation, children’s rights and they tend to speak up more for the interests of communities, local communities, because of their close involvement in community life.” [2]

The link here goes to a web page with a quote from Oxfam. There are no links to the evidence or research that supposedly backs up the claim.

Overall, my opinion of the linked research is that it has very little scientific merit. The studies provide some interesting anecdotes, and the authors have some theories that someone else could test. But to the extent you are highlighting them because they are cruxes for your theory of change, they seem very weak. If your 'Why We Are Doing This' had been premised on 'well some women just like sex-segregated programs, so providing this option will help with recruitment' then I would have said fair enough. But if, as this post suggests, your theory of change is based on these sorts of dubious studies then that makes me significantly less optimistic about the project.

Neel Nanda

2y

59

I upvoted this comment, since I think it's a correct critique of poor-quality studies and adds important context, but I also wanted to flag that I broadly think Athena is a worthwhile initiative and I'm glad it's happening! (In line with Lewis' argument below.) I think it can create bad vibes for the highest voted comment on a post about promoting diversity to be critical.

[anonymous]

2y

24

Usually, if someone proposes something and then cites loads of weak literature supporting it, criticism is warranted. I think it is a good norm for people promoting anything to make good arguments for it and provide good evidence.

Neel Nanda

2y

4

Agreed!

Angelina Li

2y

6

I think it can create bad vibes for the highest voted comment on a post about promoting diversity to be critical

+1, I appreciate you for upvoting the parent comment and then leaving this reply :)

(Edit: for what it's worth, I am also excited Athena is happening)

Benevolent_Rain

2y

13

Maybe it is useful for understanding what distinctly female values we want AIs to promote, but it doesn't seem particularly relevant for things like Interpretability or most other current research agendas.

Could it be that the research agendas themselves could benefit from a more diverse set of perspectives? I have not thought this through as carefully as you have, but the seatbelt analogy seems perhaps appropriate - perhaps the issue there was exactly that the research agenda on seat belts did not include the impact on e.g. pregnant women (speculation from my side). Half the people affected by AI will be women, so maybe mostly-men teams could overlook considerations that apply less to men and more to women?

rachelAF

2y

43

I’m strongly in support of this initiative, and hope to help out as my schedule permits.

I agree with Larks that the linked studies have poor methodology and don’t provide sufficient support for their claims. I wish that there was better empirical research on this topic, but I think that’s unlikely to happen for various reasons (specifying useful outcome metrics is extremely difficult, political and researcher bias pushes hard toward a particular conclusion, human studies are expensive, etc.).

In lieu of reliable large-scale data, I’m basing my opinion on personal experiences and observations from my 5 years as a full time (cis female) AIS researcher, as well as several years of advising junior and aspiring researchers. I want to be explicit that I’d like better data than this, but am using it because it’s the best I have available.

I see two distinct ways that this initiative could be valuable for AIS research:

It could help us to recruit and retain more promising researchers. As Lewis commented, we need all the help we can get. While this community tries hard to be meritocratic, and is much less overtly hostile to women than neighboring communities I’ve experienced, I have personally noticed and experienced unintentional-yet-systemic patterns of behavior that can make it particularly difficult to remain and advance in this field as a woman. I’d prefer not to get into an in-depth discussion of that on here, though I have written about a bit of it in a related comment.^[1] I believe that a more gender-balanced environment, and particularly more accessible senior female researchers and mentors, would likely reduce this.

I also suspect that more balanced gender representation would make more people feel comfortable entering the field. I am often the only woman at lab meetings, research workshops, and other AIS events, and very often the only woman who isn’t married to a man who is also in attendance. This doesn’t bother me, but I think that’s just a random quirk of my personality. I think it’s totally reasonable and not uncommon for people to be reluctant to join a group where their demographics make them stand out, and we could be losing female entrants this way. (Though I have noticed much more gender diversity in AIS researchers who’ve joined in the past <2 years than in those who joined >=5 years ago, so it’s possible this problem is already going away!)

Women (or any member of an underrepresented group or background) could provide important perspective for some areas of AIS research. It's important to distinguish between different research areas here, so I’m gonna messily put AIS topics on a spectrum between “fundamental” and “applied”. By “fundamental”, I mean topics like interpretability, decision theory, science of deep learning, etc — work to understand, predict, and figure out how to control AI behavior at all. By “applied”, I mean topics like practical implications of RLHF when teachers have differing preferences, or constructing meaningful evaluations for foundation models — work to understand, predict, and dictate how AI interacts with the real world and groups of humans.

On the “fundamental” end of the spectrum, I don’t think that diversity in researcher background and life experience really matters either way. But in topics further toward the “applied” end of the spectrum, it can help a whole lot. There’s plausibly-important safety work happening all along this spectrum, especially now that surprisingly powerful AI systems are deployed in the real world, so there are areas where researchers with diverse backgrounds can be particularly valuable.

Overall, I think that this is an excellent thing to dedicate some resources to on the margin.

^[1] A relevant excerpt: "most of these interactions were respectful, and grew to be a problem only because they happened so systematically -- for a while, it felt like every senior researcher I tried to get project mentorship from tried to date me instead, then avoided me after I turned them down, which has had serious career consequences."

Andrea Murillo

2y

37

Wow. This is exactly the type of opportunity I was looking for. I'm excited to apply!

[anonymous]

2y

36

I'm sceptical that there are substantial benefits to generating AI safety research ideas from gender diversity. I haven't read the literature here, but my prior on these types of interventions is that the effect size is small.

I regardless think Athena is good for the same reasons Lewis put forward in his comment - the evidence that women are excluded from male-dominated work environments seems strong and it's very important that we get as many talented researchers into AI safety as possible. This also seems especially like a problem in the AIS community where anecdotal claims of difficulties from unwanted romantic/sexual advances are common.

I think claims about the intellectual benefits of gender diversity haven't been subjected to sufficient scrutiny because they are convenient to believe. For this kind of claim, I would need to see high-quality causal inference research to believe it, and I haven't seen this research and the article linked doesn't cite such research. The linked NatGeo article doesn't seem to me to bring relevant evidence to bear on the question. I completely buy that having more women in the life sciences leads to better medical treatment for women, but the causal mechanism at work there doesn't seem like it would apply to AI safety research.

[anonymous]

2y

37

Do you have evidence that women are excluded from male-dominated work environments? This large meta-analysis finds that in academia, women need far fewer citations in order to be hired in male dominated subjects, and received a much higher score on tests when their gender was unblinded vs blinded.

…In summary, all of the seven administrative reports reveal substantial evidence that women applicants were at least as successful as and usually more successful than male applicants were—particularly in GEMP fields.

(GEMP: geosciences, engineering, economics, mathematics/computer science, and physical science)

In a natural experiment, French economists used national exam data for 11 fields, focusing on PhD holders who form the core of French academic hiring (Breda & Hillion, 2016). They compared blinded and nonblinded exam scores for the same men and women and discovered that women received higher scores when their gender was known than when it was not when a field was male dominant (math, physics, philosophy), indicating a positive bias, and that this difference strongly increased with a field’s male dominance. Specifically, women’s rank in male-dominated fields increased by up to 40% of a standard deviation. In contrast, male candidates in fields dominated by women (literature, foreign languages) were given a small boost over expectations based on blind ratings, but this difference was small and rarely significant.⁶

Many organisations have an explicit bias in favour of hiring women. For example, according to this paper, for a given level of performance in econometrics, women are much more likely to be elected fellows of the Econometric Society. This is due to an explicit bias in favour of women in that society.

titotal

2y

7

I think you are bordering on cherry picking here. The meta-analysis studied 6 areas of bias, and found parity in 3, advantage for women in 1, and advantage for men in 2. It also makes no sense to link a particular study when you just linked a meta-analysis: presumably I could dig through and selectively find studies supporting the opposite position.

There is also no mention of the key issue of sexual harassment, which has been the source of most of the gender based complaints in EA, and is compounded in situations of large gender imbalance, just from the pure maths of potential predators vs potential targets.

[anonymous]

2y

17

I agree on the sexual harassment problem and have written about that at length on here.

I don't think I am cherry picking. The commenter's claim that I challenged was that women are excluded from male-dominated (eg STEM) environments. With respect to the main post, the question is whether women are excluded from computer science jobs in particular. The paper found that with respect to hiring, biases are in favour of women in a range of STEM disciplines. In computer science in particular, there is evidence either of equal treatment or of a large bias in favour of women.

In computer science, Way et al. (2016) found that more highly ranked departments hired women and men at comparable rates, holding constant publications, department prestige, geography, and postdoc experience.
The Computer Research Association commissioned a national audit of U.S. and Canadian computer-science hiring (Stankovic & Aspray, 2003). They found that new women recipients of PhDs applied for far fewer academic jobs than men: Women with PhDs applied for six positions, whereas men applied for 25 positions. However, female PhDs were offered twice as many interviews per application (0.77), whereas men received only 0.37. Further, women received 0.55 job offers per application, whereas men received only 0.19: “Obviously women were much more selective in where they applied, and also much more successful in the application process” (Stankovic & Aspray, 2003, p. 31).

The same is true for other STEM disciplines. Thus, it is more accurate to say that the evidence suggests that men are on balance excluded from jobs in male-dominated STEM disciplines, in academia at least. I haven't seen anything about hiring in STEM-related companies.

This is from the author's own analysis:

If men and women were treated equally in STEM in academia, we would expect the blue and orange lines to overlap, but the blue line consistently tracks above the orange line, indicating a bias towards hiring women.

In their systematic review, the authors cite a range of studies finding similar results in hiring in STEM:

Kessel and Nelson (2011) reported that female PhDs had similar or higher probabilities than men of entering assistant professorships in 100 top “highly quantitative” departments but not in other STEM fields. Ceci et al. (2014) compared the percentage of female PhDs with the percentage of female assistant professors 5 to 6 years later in GEMP fields and found similar results. And in philosophy—the humanities field most like GEMP in gender composition and quantitative emphasis—among 2008 to 2019 PhDs, women had a 10% to 17% greater likelihood than men of entering permanent academic placements (Allen-Hermanson, 2017; Kallens et al., 2022).
Among political scientists, Schröder et al. (2021) found that female political scientists had a 20% greater likelihood of obtaining a tenured position than comparably accomplished males in the same cohort after controlling for personal characteristics and accomplishments (publications, grants, children, etc.). Lutter and Schröder (2016) found that women needed 23% to 44% fewer publications than men to obtain a tenured job in German sociology departments

Data from the National Research Council shows that the fraction of women hired in STEM subjects in academia is higher than the fraction of female applicants across the board.

Data from individual universities confirms this picture:

At the University of Western Ontario, across departments in 1992 to 1999, women constituted 23.2% of applicants, 30.4% of interviewees, and 36.2% of hires for tenure-track jobs (The University of Western Ontario, Office of the Provost and Vice-President [Academic], 2001). At Simon Fraser University and University of British Columbia in 2001, of 4,525 applicants, women were more likely than men to be one of the 105 hired, comprising 38.9% of applicants but 41.0% of those hired (Kimura, 2002).
An analysis by Moratti (2020) of hiring for the decade from 2007 to 2017 at Norway’s largest university revealed no gender bias in hiring: Seventy-seven searches generated 1,009 applicants for new associate professorships, with women slightly more likely than men to be hired, leading Moratti to conclude that

[anonymous]

2y

2

Only anecdotal and not a peer-reviewed opinion, but after reading a lot of these comments I can see how women might feel excluded from male-dominated spaces. Totally agree that reasoning transparency is important and that it is good to flag such things, but I think there is a difference between flagging some issues and a few people writing essays about how these claims are super wrong, challenging her to bet on her success and that gender diversity actually works etc. At least I find such comments really discouraging and as a woman wouldn't want to start a project in the AI safety / EA space. Also, don't wanna give the whole classic talk about how it's so much easier to give snappy remarks on the internet than in real life, but maybe think about feedback norms?

Also, just stating that a lot of work that is being done in the AI safety space is based on lesswrong posts and arxiv papers, rather than peer-reviewed studies, so maybe we can give the project a bit of the benefit of the doubt that getting more female AI safety researchers into the space is actually good?

[anonymous]

2y

19

I disagree with this comment and others. A lot of people's take seems to be that as long as some people think the vibe of something is good, then we should suspend all standards of evidence and argument.

In this case, should we not mention the fact that 75% of the evidence presented making the case for a project is weak? The whole point of the discussion is about whether 'gender diversity works' in the sense of improving the performance of organisations in certain domains. It seems like you are saying that it does. If so, then you need to present some evidence and arguments.
If something is 'superwrong', doesn't that make it more important to point that out? Or are you denying that it is superwrong? If so, what are your arguments?
I wrote an 'essay' because someone said I was cherrypicking and so I wanted to present all of the evidence presented in the paper without anyone having to read the paper, which I assumed few people would do. Usually on the forum, presenting lots of high quality evidence for something is deemed a positive, not a negative. This is also consistent with my comments on other topics - I often post screenshots and quotes from relevant literature for the simple reason that I think this is good practice, and I don't see why we should suspend that practice on this topic because it has 'bad vibes'. The implicit idea behind your suggestion is that instead of citing high quality literature, we should be guided by a vague sense of moral outrage.
I think it is a good norm for nonprofits to precommit to achieving certain outcomes. Or do you disagree?

Regarding feedback norms, I think the reverse is true. I think for the most part people are scared of challenging this sort of thinking for fear of social censure. Notably, this is exactly what has happened here. I personally would express these views in person and have done so several times. I think most people who share these views don't want to be called bigoted and so don't bother. It is noteworthy that only around 2-4 people on the forum publicly criticise arguments for demographic favouritism, but that the constituency publicly for it on the forum is much larger. In broader society, the constituency in favour of demographic favouritism has taken over almost the entirety of the public, private and nonprofit sectors. In this context, I think it is strange to be upset by a minimal amount of niche online pushback.

I don't know why this would make you less keen to start an EA or AI project. If you have a good one, you should be able to get funding for it.

Linch

2y

19

Regarding feedback norms, I think the reverse is true. I think for the most part people are scared of challenging this sort of thinking for fear of social censure. Notably, this is exactly what has happened here. I personally would express these views in person and have done so several times. I think most people who share these views don't want to be called bigoted and so don't bother. It is noteworthy that only around 2-4 people on the forum publicly criticise arguments for demographic favouritism, but that the constituency publicly for it on the forum is much larger. In broader society, the constituency in favour of demographic favouritism has taken over almost the entirety of the public, private and nonprofit sectors. In this context, I think it is strange to be upset by a minimal amount of niche online pushback.

I think it might be helpful to take a step back. The default in political or otherwise charged discussions is to believe that your side is the unfairly persecuted and tiny minority^[1], and it's an act of virtue and courage to bravely speak up.

I think self-belief in this position correlates weakly at best with shared social reality; I expect many people on multiple sides will hold near-symmetric beliefs.

In this case, it's reasonable for you (and many upvoters) to believe that the anti-"demographic favoritism" position is unfairly marginalized and persecuted, in part because you can point to many examples of pro-"demographic favoritism" claims in . Likewise, I also think it's reasonable for detractors (like anon above and titotal) to believe that pro-"demographic favoritism" positions are unfairly marginalized and persecuted, in part because the very existence of your comments and (many) upvoters suggests that this is the majority position on the EA Forum, and people who disagree will be disadvantaged and are taking more of a "brave" stance in doing so.

For what it's worth, I do think local norms tilt more against people who disagree with you. Broadly I think it in fact is harder/more costly on the forum to argue for pro-diversity positions on most sub-issues, at least locally.

That said, I think it's overall helpful to reduce (but not necessarily abandon) a persecution framing for viewpoints, as it is rarely conducive to useful discussions.

See also SSC on Against Bravery Debates.^[2]

^[1] Or "moral majority" as the case may be, where your side is the long-suffering and silent majority, who doesn't deign to get into political disputes.
^[2] He also had an even better post about this exact phenomenon, but alas I couldn't find it after a more extensive search.

[anonymous]

2y

5

I'm not sure that local norms do tilt in favour of my position. Many EA orgs already have demographically-biased hiring, so it's fair to say I'm not winning the argument. And there just seem to be a lot more people willing to propose this stuff than criticise it. As I mentioned, the only public pushback comes from 2-4 people, and I do think it is personally costly for me to do this.

I think it is important to consider the social costs of discussing this. Demographic favouritism has ~completely taken over the public, private and nonprofit sector. I think recognising why this has happened requires one to analyse the social costs of opposing it, given that at least a significant fraction, if not a majority, of voters are opposed to demographic favouritism. Because people are scared to push back for fear of being called bigoted, weak evidence can be adduced in favour of demographic favouritism. E.g., people often share things like magazine articles allegedly showing that a gender diverse board increases your stock price.

In this case, weak evidence has been adduced and unsubstantiated claims have been made, and some people have criticised it. The response to this has not been that the criticism is wrong, but that it is wrong to criticise at all in this domain. Several commenters have basically tried to guilt trip the critics even though they don't disagree with what the critics said. This never happens for any other topic. No-one ever mockingly argues against citing high quality peer reviewed literature in any other domain. No-one ever says correctly criticising obviously bad literature has bad vibes in any other domain.

I think the correct response would not be to get annoyed about the vibes, but to get better evidence and arguments.

NunoSempere

2y

3

challenging her to bet on her success

Note that if she bets on her success and wins, she can extract money from the doubters, in a way which she couldn't if the doubters restricted themselves to mere talk. The reciprocal is also true, though.

JWS 🔸

2y

1

Edit: The original author has deactivated their account so have removed their username from the below. From my pov, the fact that they felt the need to do so is a signal that initiatives like Athena are valuable and worth supporting.

EA Forum, we need to talk. Why does this comment (at time of writing) have a negative vote score?^[1] We have separate downvote and disagreevotes for a reason.

Nothing they say in this comment is against Forum Norms. They state very clearly that these are their own thoughts from anecdotes. They're sharing their perspective on why the relatively strong scrutiny in this thread might be off-putting for the very people Athena is meant to target.

From someone who naturally sits more on the 'contextualising' side of the Decoupling v Contextualising framework, it's pretty clear why the response would make some people less keen on proposing or attending programs like this.

I think there's a broader debate about EA's relationship to ideas of gender diversity and social justice, as well as points of disagreement and agreement on both empirical and philosophical axes. I just don't think that the right place for such a discussion is the comments here.

^[1] -10 when I initially started this comment, -3 when I posted.

[anonymous]

2y

13

I can say why I downvoted it - the commenter argued that presenting good arguments and evidence, and asking for results is bad because they don't like the vibe and find those things upsetting.

[anonymous]

2y

1

My views here are just deferring to gender scholars I respect.

Quadratic Reciprocity

2y

10

Am also skeptical about the intellectual benefits directly from gender diversity.

However, I think one pretty plausible way it could happen is because women tend to specialise in different fields from men (more women in life sciences, biology, and psychology as opposed to computer science) and maybe the benefits result from the diversity of expertise in different fields. E.g., PIBBSS seems to have greater diversity in their fellows than other programmes and it seems like a good idea for it to exist.

[anonymous]

2y

5

Yes I agree with this - but if this is part of the theory of change then Athena should probably privilege applicants with these different backgrounds and I don't know if they intend to do this.

Claire Short

2y

26

I appreciate your feedback and comments! To clarify - my vision for this program is to emphasize that it's not a choice between x-risk and diversity; rather both can be pursued simultaneously and effectively. The core of this experiment is to integrate these two objectives. It's not about promoting diversity solely as a means to mitigate x-risk (though my intuition says that more diversity leads to a diverse range of ideas that will help with x-risk solutions). Instead, this program aims to address x-risk while concurrently supporting and helping retain qualified women who may otherwise consider leaving a predominantly male field. This approach is based primarily on recent interviews I conducted with women in the alignment field and in EA, which revealed specific patterns and issues.

It's important to note that this program isn't solely based on the research I mentioned here. In hindsight I regret emphasizing those papers, as it may have diverted attention from the program's primary objectives and reasoning for some. The cited studies were meant to provide additional insights rather than serve as definitive proof of the program's value, which I should have stated more clearly. I’m not very surprised that there are small sample sizes or limited studies on this, which should not be misinterpreted as a lack of such gender-based issues. Ultimately whether the program is a success or a learning experience, we'll just have to wait and let the results speak for themselves :)

Larks

2y

7

Ultimately whether the program is a success or a learning experience, we'll just have to wait and let the results speak for themselves :)

If you think the results will speak for themselves, would you like to pre-register what would count as a success and what would count as a failure?

Dawn Drescher

2y

3

(E.g., using markets on Manifold. :-D)

Chris Leong

2y

15

I’ll have a think about which women I know whom I should suggest apply for this program. Do you have any more details about the kinds of candidates that you’re looking for?

Shakeel Hashim

2y

9

This is really cool, thanks for organising it!
