(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.
(Speaking for myself, as someone who has also recommended donating to Horizon; not speaking for Julian or OP.)
I basically think the public outputs of the fellows are not a good proxy for the effectiveness of the program (or of basically any talent program). The main impact of talent programs, including Horizon, seems better measured by where participants wind up shortly after the program (on which Horizon seems objectively strong), plus a subjective assessment of how good the participants are. There just isn't a lot of shareable data/info on the latter, so I can't do much better than saying "I've spent some time on this (rather than taking for granted that they're good) and I think they're good on average." (I acknowledge that this is not an especially epistemically satisfying answer.)
I appreciate these analyses, but given the very high sensitivity of the bottom lines to parameters like how welfare ranges relate to neuron counts and other facts about the animals in question, I find it implausible that the best donation option is to fund the intervention with the highest mean cost-effectiveness estimate rather than either 1) fund more research into those parameters or 2) save/invest until such research has happened. Maybe future posts could examine the tradeoff between funding/waiting for such research versus funding the direct interventions now?
I think this is comparing apples and oranges: biological capabilities on benchmarks (AFAIK not that helpful in real-world lab settings yet) versus actual economic impact. The question is whether real-world bio capabilities will outstrip real-world broad economic capabilities.
It's certainly possible that an AI will trigger a biorisk if-then commitment before it has general capabilities capable of 10% cumulative GDP growth. But I would be pretty surprised if we get a system so helpful that it could counterfactually enable laypeople to dramatically surpass the current state of the art in the specific domain of bio-offense without having previously gotten systems that are pretty helpful at counterfactually enabling professionals to do their jobs somewhat better and at automating some routine tasks. I think your claim implies something like: as AI automates things, it will hit "making a bioweapon that ends the world, which no one can currently do" before it hits "the easiest ~15% of the stuff we already do, weighted by market value" (assuming labor is ~2/3 of GDP; see the arithmetic below). This seems unlikely, especially since making a bioweapon involves a bunch of physical processes where AIs seem likely to struggle mightily for a while, though again I concede it's not impossible.
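To spell out the arithmetic behind that ~15% figure (a rough back-of-the-envelope, assuming the 10% cumulative GDP growth would come entirely from automating existing labor tasks):

$$\text{share of labor tasks automated} \approx \frac{10\% \text{ GDP growth}}{2/3 \text{ labor share of GDP}} = 15\%$$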
In terms of whether "most AI safety people" believe this, consider that the great takeoff speeds debate was operationalized in terms of whether AI would produce cumulative 100% growth over four years before it produced 100% growth in one year. To the extent that this debate loosely tracked a debate within the community more broadly, it seems to imply a large constituency for the view that we will see much more than 10% cumulative growth before AI becomes existentially scary.
I think this picture of EA ignoring stable totalitarianism is missing the longtime focus on China.
Also, see this thread on Open Phil's ability to support right-of-center policy work.
It feels like there's an obvious trade between the EA worldview on AI and Thiel's, where the strategy is "laissez faire for the kinds of AI that cause late-90s-internet-scale effects (~10% cumulative GDP growth), aggressive regulation for the kinds of AI that inspire the 'apocalyptic fears' that he agrees should be taken seriously, and require evaluations of whether a given frontier AI poses those risks at the pre-deployment stage so you know which of these you're dealing with."
Indeed, this is pretty much the "if-then" policy structure Holden proposes here, seemingly with the combination of skepticism of capabilities and distrust of regulation very much in mind.
Obviously the devil (as it were) is in the details. But it feels like there are a bunch of design features that would move in this direction: very little regulation of AI systems that don't trigger very high capability thresholds (i.e., nothing currently available), plus low-cost and accurate risk evaluations for specific threat models like very powerful scheming, self-improvement, and bioterrorism uplift. Idk, maybe I'm failing the ideological Turing test here and Thiel would say this is already a nanny-state proposal or would lapse into totalitarianism, but, like, there's a huge gulf between capabilities that can get you ~10% cumulative GDP growth and capabilities that can kill billions of people; it really feels like there's some governance structure that allows/promotes the former and regulates the latter.
I notice a pattern in my conversations with people making career decisions: the most helpful parts are often prompted by "what are your strengths and weaknesses?" and "what kinds of work have you historically enjoyed or not enjoyed?"
I can think of a couple of cases (one where I was the recipient of career decision advice, another where I was the advice-giver) where we were kinda spinning our wheels, going over the same considerations, and then we brought up those topics >20 minutes into the conversation and immediately made more progress than we had in the rest of the call up to that point.
Maybe this is because in EA circles people have already put a ton of thought into considerations like "which of these jobs would be more impactful conditional on me doing an 8/10 job or better in them" and "which of these is generally better for career capital (including skill development, networks, and prestige)," so those questions are the conversational direction with the most low-hanging fruit. Another frame: this is yet another case of people underrating personal fit relative to the more abstract/generally applicable characteristics of a job.
Yeah, interesting. To be clear, I'm not saying e.g. Manifund/Manival are net negative because of adverse selection. I do think additional grant evaluation capacity seems useful, and the AI tooling here seems at least more useful than feeding grants into ChatGPT. I suppose I agree that adverse selection is a smaller problem in general than those issues, though once you consider tractability, it seems deserving of some attention.
Cases where I'd be more worried about adverse selection, and where I'd therefore more strongly encourage potential donors to check in with professional grantmakers:
In those cases, especially for six-figure-and-up donations, people should feel free to supplement their own evaluation (via Manival or otherwise!) by checking in with professional grantmakers; Open Phil now has a donor advisory function that you can contact at donoradvisory@openphilanthropy.org.
(For some random feedback: I picked an applicant I was familiar with, was surprised by its low score, ran it through the "Austin config," and it turns out it was losing a bunch of points for not having any information about the team's background; the only problem is, it had plenty of information about the team's background! Not sure what's going on there. Also, weakly held, but I think when you run a config it should probably open a new tab rather than taking you away from the main page?)
Can you say more about how this / your future plans solve the adverse selection problems? (I imagine you're already familiar with this post, but in case other readers aren't, I recommend it!)
Weak-downvoted; I think it's fair game to say an org acted in an untrustworthy way, but I think it's pretty essential to actually sketch the argument rather than screenshotting their claims without specifying what they've done that contradicts them. It seems bad to leave the reader in a position of being like, "I don't know what the author means, but I guess Epoch must have done something flagrantly contradictory to these goals and I shouldn't trust them," rather than elucidating the evidence so the reader can actually "form their own judgment." Ben_West then asked in two comments for these specifics, and I still don't know what you mean (and I think I'm pretty high-percentile among forum readers on the dimension of "familiar with drama/alleged bad behavior of AI safety orgs").
Would remove the downvote if you fill in the implicit part of the argument here: what information/explanation would a reader need in order to understand what you mean by "it certainly seems to me that the AI Safety community was too ready to trust Epoch" in the context of these screenshots?