
lukeprog

4138 karma

Comments
238

I am copying footnote 19 from the post above into this comment for easier reference/linking:

The "defense in depth" concept originated in military strategy (Chierici et al. 2016; Luttwak et al. 2016, ch. 3; Price 2010), and has since been applied to reduce risks related to a wide variety of contexts, including nuclear reactors (International Nuclear Safety Advisory Group 1996, 1999, 2017; International Atomic Energy Agency 2005; Modarres & Kim 2010; Knief 2008, ch. 13), chemical plants (see "independent protection layers" and "layers of protection analysis" in Center for Chemical Process Safety 2017), aviation (see "Swiss cheese model" in Shappell & Wiegmann 2000), space vehicles (Dezfuli 2015), cybersecurity and information security (McGuiness 2021; National Security Agency 2002 & 2010; Amoroso 2011; Department of Homeland Security 2016; Riggs 2003; Lohn 2019), software development (including for purposes beyond software security, e.g. software resilience; Adkins et al. 2020, ch. 8), laboratories studying dangerous pathogens (WHO 2020; CDC 2020; Rappert & McLeish 2007; National Academies 2006, which use different terms for "defense in depth"), improvised explosive devices (see "web of prevention" in Revill 2016), homeland security (Echevarria II & Tussing 2003), hospital security (see "layers of protection" in York & MacAlister 2015), port security (McNicholas 2016, ch. 10), physical security in general (Patterson & Fay 2017, ch. 11), control system safety in general (see "layers of protection" in Barnard 2013; Baybutt 2013), mining safety (Bonsu et al. 2016), oil rig safety (see "Swiss cheese model" in Ren et al. 2008), surgical safety (Collins et al. 2014), fire management (Okray & Lubnau II 2003, pp. 20-21), health care delivery (Vincent et al. 1998), and more. Related (and in some cases near-identical) concepts include the "web of prevention" (Rappert & McLeish 2007; Revill 2016), the "Swiss cheese model" (Reason 1990; Reason et al. 2006; Larouzee & Le Coze 2020), "layers of protection" (Center for Chemical Process Safety 2017), "multilayered defense" or "diversity of defense" (Chapple et al. 2018, p. 352), "onion skin" or "lines of defense" (Beaudry 2016, p. 388), or "layered defense" (May et al. 2006, p. 115). Example partially-overlapping "defense layers" for high-stakes AI development and deployment projects might include: (1) tools for blocking unauthorized access to key IP, e.g. secure hardware enclaves for model weights, (2) tools for blocking unauthorized use of developed/trained IP, akin to the PALs on nuclear weapons, (3) tools and practices for ensuring safe and secure behavior by the humans with access to key IP, e.g. via training, monitoring, better interfaces, etc., (4) methods for scaling human supervision and feedback during and after training high-stakes ML systems, (5) technical methods for gaining high confidence in certain properties of ML systems, and properties of the inputs to ML systems (e.g. datasets), at all stages of development (a la Ashmore et al. 2019), (6) background checks & similar for people being hired or promoted to certain types of roles, (7) legal mechanisms for retaining developer control of key IP in most circumstances, (8) methods for avoiding or detecting supply chain attacks, (9) procedures for deciding when and how to engage one's host government to help with security/etc., (10) procedures for vetting and deciding on institutional partners, investors, etc., (11) procedures for deciding when to enter into some kinds of cross-lab (and potentially cross-state) collaborations, tools for executing those collaborations, and tools for verifying another party's compliance with such agreements, (12) risk analysis and decision support tools specific to high-stakes AI system developers, (13) whistleblowing / reporting policies, (14) other features of high-reliability organizations, a la Dietterich (2018) and Shneiderman (2020), (15) procedures for balancing concerns of social preference / political legitimacy and ethical defensibility, especially for deployment of systems with a large and broad effect on society as a whole, e.g. see Rahwan (2018); Savulescu et al. (2021), (16) special tools for spot-checking / double-checking / cross-checking whether all of the above are being used appropriately, and (17) backup plans and automatic fail-safe mechanisms for all of the above.

Oops, my colleague checked again and the Future Perfect inclusions (Kelsey and Sigal) are indeed a mistake; OP hasn't funded Future Perfect. Thanks for the correction. (Though see e.g. this similar critical tweet from OP grantee Matt Reardon.)

Re: Eric Neyman. We've funded ARC before and would do so again depending on RFMF/etc.

I'm following up here with a convenience sample of examples of OP staff and grantees criticizing frontier AI companies, collected by one of my colleagues, since some folks seem to doubt how common this is:

Yudkowsky's message is "If anyone builds superintelligence, everyone dies." Zvi's version is "If anyone builds superintelligence under anything like current conditions, everyone probably dies."

Yudkowsky contrasts those framings with common "EA framings" like "It seems hard to predict whether superintelligence will kill everyone or not, but there's a worryingly high chance it will, and Earth isn't prepared," and seems to think the latter framing is substantially driven by concerns about what can be said "in polite company."

Obviously I can't speak for all of EA, or all of Open Phil, and this post is my personal view rather than an institutional one since no single institutional view exists, but for the record, my inside view since 2010 has been "If anyone builds superintelligence under anything close to current conditions, probably everyone dies (or is severely disempowered)," and I think the difference between me and Yudkowsky has less to do with social effects on our speech and more to do with differing epistemic practices, i.e. about how confident one can reasonably be about the effects of poorly understood future technologies emerging in future, poorly understood circumstances. (My all-things-considered view, which includes various reference classes and partial deference to many others who think about the topic, is more agnostic and hasn't consistently been above the "probably" line.)

Moreover, I think those who believe some version of "If anyone builds superintelligence, everyone dies" should be encouraged to make their arguments loudly and repeatedly; the greatest barrier to actually-risk-mitigating action right now is the lack of political will.

That said, I think people should keep in mind that:

  • Public argumentation can only get us so far when the evidence for the risks and their mitigations is this unclear, when AI has automated so little of the economy, when AI failures have led to so few deaths, etc.
  • Most concrete progress on worst-case AI risks — e.g. arguably the AISIs network, the draft GPAI code of practice for the EU AI Act, company RSPs, the chip and SME export controls, or some lines of technical safety work — comes from dozens of people toiling away mostly behind-the-scenes for years, not from splashy public communications (though many of the people involved were influenced by AI risk writings years before). Public argumentation is a small portion of the needed work to make concrete progress. It may be necessary, but it’s far from sufficient.

I'd rather not spend more time engaging here, but see e.g. this.

If you know people who could do good work in the space, please point them to our RFP! As for being anti-helpful in some cases, I'm guessing those were cases where we thought the opportunity wasn't a great one despite its being right-of-center (which is a point in favor, in my opinion), but I'm not sure.

Replying to just a few points…

I agree about tabooing "OP is funding…"; my team is undergoing that transition now, leading to some inconsistencies in our own usage, let alone that of others.

Re: "large negative incentive for founders and organizations who are considering working more with the political right." I'll note that we've consistently been able to help such work find funding, because (as noted here), the bottleneck is available right-of-center opportunities rather than available funding. Plus, GV can and does directly fund lots of work that "engages with the right" (your phrasing), e.g. Horizon fellows and many other GV grantees regularly engage with Republicans, and seem likely to do even more of that on the margin given the incoming GOP trifecta.

Re: "nothing has changed in the last year." No, a lot has changed, but my quick-take post wasn't about "what has changed," it was about "correcting some misconceptions I'm encountering."

Re: "De-facto GV was and is likely to continue to be 95%+ of the giving that OP is influencing." This isn't true, including specifically for my team ("AI governance and policy").

I also don't think this was ever true: "One was also able to roughly assume that if OP decides to not recommend a grant to GV, that most OP staff do not think that grant would be more cost-effective than other grants referred to GV." There's plenty of internal disagreement even among the AI-focused staff about which grants are above our bar for recommending, and funding recommendation decisions have never been made by majority vote.

Good Ventures did indicate to us some time ago that they don't think they're the right funder for some kinds of right-of-center AI policy advocacy, though (a) the boundaries are somewhat fuzzy and pretty far from the linked comment's claim about an aversion to opportunities that are "even slightly right of center in any policy work," (b) I think the boundaries might shift in the future, and (c) as I said above, OP regularly recommends right-of-center policy opportunities to other funders.

Also, I don't actually think this should affect people's actions much because: my team has been looking for right-of-center policy opportunities for years (and is continuing to do so), and the bottleneck is "available opportunities that look high-impact from an AI GCR perspective," not "available funding." If you want to start or expand a right-of-center policy group aimed at AI GCR mitigation, you should do it and apply here! I can't guarantee we'll think it's promising enough to recommend to the funders we advise, but there are millions (maybe tens of millions) available for this kind of work; we've simply found only a few opportunities that seem above-our-bar for expected impact on AI GCR, despite years of searching.

lukeprog

Recently, I've encountered an increasing number of misconceptions, in rationalist and effective altruist spaces, about what Open Philanthropy's Global Catastrophic Risks (GCR) team does or doesn't fund and why, especially re: our AI-related grantmaking. So, I'd like to briefly clarify a few things:

  • Open Philanthropy (OP) and our largest funding partner Good Ventures (GV) can't be or do everything related to GCRs from AI and biohazards: we have limited funding, staff, and knowledge, and many important risk-reducing activities are impossible for us to do, or don't play to our comparative advantages.
    • Like most funders, we decline to fund the vast majority of opportunities we come across, for a wide variety of reasons. The fact that we declined to fund someone says nothing about why we declined to fund them, and most guesses I've seen or heard about why we didn't fund something are wrong. (Similarly, us choosing to fund someone doesn't mean we endorse everything about them or their work/plans.)
    • Very often, when we decline to do or fund something, it's not because we don't think it's good or important, but because we aren't the right team or organization to do or fund it, or we're prioritizing other things that quarter.
    • As such, we spend a lot of time working to help create or assist other philanthropies and organizations who work on these issues and are better fits for some opportunities than we are. I hope in the future there will be multiple GV-scale funders for AI GCR work, with different strengths, strategies, and comparative advantages — whether through existing large-scale philanthropies turning their attention to these risks or through new philanthropists entering the space.
  • While Good Ventures is Open Philanthropy's largest philanthropic partner, we also regularly advise >20 other philanthropists who are interested in hearing about GCR-related funding opportunities. (Our GHW team also does similar work partnering with many other philanthropists.) On the GCR side, we have helped move tens of millions of dollars of non-GV money to GCR-related organizations in just the past year, including some organizations that GV recently exited. GV and each of those other funders have their own preferences and restrictions we have to work around when recommending funding opportunities.
    • Among the AI funders we advise, Good Ventures is among the most open and flexible funders.
    • We're happy to see funders enter the space even if they don’t share our priorities or work with us. When more funding is available, and funders pursue a broader mix of strategies, we think this leads to a healthier and more resilient field overall.
  • Many funding opportunities are a better fit for non-GV funders, e.g. due to funder preferences, restrictions, scale, or speed. We've also seen some cases where an organization can have more impact if they're funded primarily or entirely by non-GV sources. For example, it’s more appropriate for some types of policy organizations outside the U.S. to be supported by local funders, and other organizations may prefer support from funders without GV/OP’s past or present connections to particular grantees, AI companies, etc. Many of the funders we advise are actively excited to make use of their comparative advantages relative to GV, and regularly do so.
  • We are excited for individuals and organizations that aren't a fit for GV funding to apply to some of OP’s GCR-related RFPs (e.g. here, for AI governance). If we think the opportunity is strong but a better fit for another funder, we'll recommend it to other funders.
    • To be clear, these other funders remain independent of OP and decline most of our recommendations, but in aggregate our recommendations often lead to target grantees being funded.
  • We believe reducing AI GCRs via public policy is not an inherently liberal or conservative goal. Almost all the work we fund in the U.S. is nonpartisan or bipartisan and engages with policymakers on both sides of the aisle. However, at present, it remains the case that most of the individuals in the current field of AI governance and policy (whether we fund them or not) are personally left-of-center and have more left-of-center policy networks. Therefore, we think AI policy work that engages conservative audiences is especially urgent and neglected, and we regularly recommend right-of-center funding opportunities in this category to several funders.
  • OP's AI teams spend almost no time directly advocating for specific policy ideas. Instead, we focus on funding a large ecosystem of individuals and organizations to develop policy ideas, debate them, iterate them, advocate for them, etc. These grantees disagree with each other very often (a few examples here), and often advocate for different (and sometimes ~opposite) policies.
  • We think it's fine and normal for grantees to disagree with us, even in substantial ways. We've funded hundreds of people who disagree with us in a major way about fundamental premises of our GCRs work, including about whether AI poses GCR-scale risks at all (example).
  • I think frontier AI companies are creating enormous risks to humanity, I think their safety and security precautions are inadequate, and I think specific reckless behaviors should be criticized. AI company whistleblowers should be celebrated and protected. Several of our grantees regularly criticize leading AI companies in their official communications, as do many senior employees at our grantees, and I think this happens too infrequently.
  • Relatedly, I think substantial regulatory guardrails on frontier AI companies are needed, and organizations we've directed funding to regularly propose or advocate policies that ~all frontier AI companies seem to oppose (alongside some policies they tend to support).
  • I'll also take a moment to address a few misconceptions that are somewhat less common in EA or rationalist spaces, but seem to be common elsewhere:
    • Discussion of OP online and in policy media tends to focus on our AI grantmaking, but AI represents a minority of our work. OP has many focus areas besides AI, and has given far more to global health and development work than to AI work.
    • We are generally big fans of technological progress. See e.g. my post about the enormous positive impacts from the industrial revolution, or OP's funding programs for scientific research, global health R&D, innovation policy, and related issues like immigration policy. Most technological progress seems to have been beneficial, sometimes hugely so, even though there are some costs and harms along the way. But some technologies (e.g. nuclear weapons, synthetic pathogens, and superhuman AI) are extremely dangerous and warrant extensive safety and security measures rather than a "move fast and break [the world, in this case]" approach.
    • We have a lot of uncertainty about how large AI risk is, exactly which risks are most worrying (e.g. loss of control vs. concentration of power), on what timelines the worst-case risks might materialize, and what can be done to mitigate them. As such, most of our funding in the space has been focused on (a) talent development, and (b) basic knowledge production (e.g. Epoch AI) and scientific investigation (example), rather than work that advocates for specific interventions.

I hope these clarifications are helpful, and lead to fruitful discussion, though I don't expect to have much time to engage with comments here.

Re: why our current rate of spending on AI safety is "low." At least for now, the main reason is lack of staff capacity! We're putting a ton of effort into hiring (see here) but are still not finding as many qualified candidates for our AI roles as we'd like. If you want our AI safety spending to grow faster, please encourage people to apply!
