
tlevin

Senior Program Associate, AI Governance and Policy @ Open Philanthropy
2367 karma · Joined · Working (6-15 years)

Bio

(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.

Comments (134)

(Speaking for myself as someone who has also recommended donating to Horizon, not Julian or OP)

I basically think the public outputs of the fellows are not a good proxy for the effectiveness of the program (or of basically any talent program). The main impact of talent programs, including Horizon, seems better measured by where participants wind up shortly after the program (on which Horizon seems objectively strong), plus a subjective assessment of how good the participants are. There just isn't a lot of shareable data/info on the latter, so I can't do much better than saying "I've spent some time on this (rather than taking for granted that they're good) and I think they're good on average." (I acknowledge that this is not an especially epistemically satisfying answer.)

I appreciate these analyses, but given the very high sensitivity of the bottom lines to parameters like how welfare ranges correspond to neuron counts or other facts about the animals in question, I find it implausible that the best donation option is to fund the intervention with the highest mean estimate rather than either 1) funding more research into those parameters or 2) saving/investing until such research has happened. Maybe future posts could examine the tradeoff between funding/waiting for such research versus funding the direct interventions now?
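To make the sensitivity point concrete, here's a toy sketch in Python. Every number and intervention in it is invented purely for illustration (nothing here is taken from the analyses being discussed); the point is just that the ranking flips depending on which welfare-range assumption you plug in.

```python
# Toy illustration (all numbers are made up, not from the analyses above):
# the ranking of two hypothetical interventions flips depending on the
# assumed welfare range, which is exactly the sensitivity being described.

def welfare_per_dollar(individuals_helped_per_dollar: float, welfare_range: float) -> float:
    """Crude cost-effectiveness proxy: individuals helped per dollar,
    weighted by an assumed welfare range (relative to a human, 0-1)."""
    return individuals_helped_per_dollar * welfare_range

# Hypothetical comparison intervention with its welfare-range assumption held fixed.
CHICKEN = welfare_per_dollar(individuals_helped_per_dollar=2, welfare_range=0.3)

# Two assumptions within a plausible-looking spread for the uncertain parameter.
for shrimp_welfare_range in (0.0001, 0.01):
    shrimp = welfare_per_dollar(individuals_helped_per_dollar=1000,
                                welfare_range=shrimp_welfare_range)
    best = "shrimp intervention" if shrimp > CHICKEN else "chicken intervention"
    print(f"assumed shrimp welfare range {shrimp_welfare_range}: best = {best}")
```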

I think this is comparing apples and oranges: biological capabilities on benchmarks (AFAIK not that helpful in real-world lab settings yet) versus actual economic impact. The question is whether real-world bio capabilities will outstrip real-world broad economic capabilities.

It's certainly possible that an AI will trigger a biorisk if-then commitment before it has general capabilities sufficient for 10% cumulative GDP growth. But I would be pretty surprised if we get a system so helpful that it could counterfactually enable laypeople to dramatically surpass the current state of the art in the specific domain of bio-offense without having previously gotten systems that are pretty helpful at counterfactually enabling professionals to do their jobs somewhat better and automate some routine tasks. I think your claim implies something like: as AI automates things, it will hit "making a bioweapon that ends the world, which no one can currently do" before it hits "the easiest ~15% of the stuff we already do, weighted by market value" (assuming labor is ~2/3 of GDP). This seems unlikely, especially since making bioweapons involves a bunch of physical processes where AIs seem likely to struggle mightily for a while, though again I concede it's not impossible.
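(Spelling out the ~15% arithmetic implicit above, under the stated labor-share assumption and ignoring second-order effects: 10% of GDP ÷ (2/3 labor share of GDP) ≈ 15% of current labor tasks by market value.)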

In terms of whether "most AI safety people" believe this, consider that the great takeoff speeds debate was operationalized in terms of whether AI would produce a cumulative 100% growth in four years before it produced 100% growth in one year. To the extent that this debate loosely tracked a debate within the community more broadly, it seems to imply a large constituency for the view that we will see much more than 10% cumulative growth before AI becomes existentially scary.

Why does the high generality of AI capabilities imply that a similar level of capabilities produces both 10% cumulative GDP growth and extinction?

I think this picture of EA ignoring stable totalitarianism is missing the longtime focus on China.

Also, see this thread on Open Phil's ability to support right-of-center policy work.

It feels like there's an obvious trade between the EA worldview on AI and Thiel's, where the strategy is "laissez faire for the kinds of AI that cause late-90s-internet-scale effects (~10% cumulative GDP growth), aggressive regulation for the kinds of AI that inspire the 'apocalyptic fears' that he agrees should be taken seriously, and require evaluations of whether a given frontier AI poses those risks at the pre-deployment stage so you know which of these you're dealing with."

Indeed, this is pretty much the "if-then" policy structure Holden proposes here, seemingly with the combination of skepticism of capabilities and distrust of regulation very much in mind.

Obviously the devil (as it were) is in the details. But it feels like there are a bunch of design features that would move in this direction: very little regulation of AI systems that don't trigger very high capability thresholds (i.e. nothing currently available), plus low-cost and accurate risk evaluations for specific threat models like very powerful scheming, self-improvement, and bioterrorism uplift. Idk, maybe I'm failing the ideological Turing test here and Thiel would say this is already a nanny state proposal or would lapse into totalitarianism, but like, there's a huge gulf between capabilities that can get you ~10% cumulative GDP growth and capabilities that can kill billions of people -- really feels like there's some governance structure that allows/promotes the former and regulates the latter.

I notice a pattern in my conversations where someone is making a career decision: the most helpful parts are often prompted by "what are your strengths and weaknesses?" and "what kinds of work have you historically enjoyed or not enjoyed?"

I can think of a couple cases (one where I was the recipient of career decision advice, another where I was the advice-giver) where we were kinda spinning our wheels, going over the same considerations, and then we brought up those topics >20 minutes into the conversation and immediately made more progress than we had in the rest of the call up to that point.

Maybe this is because in EA circles people have already put a ton of thought into considerations like "which of these jobs would be more impactful conditional on me doing an 8/10 job or better in them" and "which of these is generally better for career capital (including skill development, networks, and prestige)," so it's the conversational direction with the most low-hanging fruit. Another frame is that this is another case of people underrating personal fit relative to the more abstract/generally applicable characteristics of a job.

Yeah interesting. To be clear, I'm not saying e.g. Manifund/Manival are net negative because of adverse selection. I do think additional grant evaluation capacity seems useful, and the AI tooling here seems at least more useful than feeding grants into ChatGPT. I suppose I agree that adverse selection is a smaller problem in general than those issues, though once you consider tractability, it seems deserving of some attention.

Cases where I'd be more worried about adverse selection, and where I'd therefore more strongly encourage potential donors to get outside input:

  • The amount you're planning to give is big. Downside risks from funding one person to do a project are usually pretty low; empowering them to run an org is a different story. (Also, smaller grants are more likely to have totally flown under the radar of the big funders.)
  • The org/person has been around for a while.
  • The project is risky.

In those cases, especially for six-figure-and-up donations, people should feel free to supplement their own evaluation (via Manival or otherwise!) by checking in with professional grantmakers; Open Phil now has a donor advisory function that you can contact at donoradvisory@openphilanthropy.org.

(For some random feedback: I picked an applicant I was familiar with, was surprised by its low score, ran it through the "Austin config," and it turns out it was losing a bunch of points for not having any information about the team's background; only problem is, it had plenty of information about the team's background! Not sure what's going on there. Also, weakly held, but I think when you run a config it should probably open a new tab rather than taking you away from the main page?)

Can you say more about how this / your future plans solve the adverse selection problems? (I imagine you're already familiar with this post, but in case other readers aren't, I recommend it!)

Having a savings target seems important. (Not financial advice.)

I sometimes hear people in/around EA rule out taking jobs due to low salaries (sometimes implicitly, sometimes a little embarrassedly). Of course, it's perfectly understandable not to want to take a significant drop in your consumption. But in theory, people with high salaries could be saving up so they can take high-impact, low-paying jobs in the future; it just seems like, by default, this doesn't happen. I think it's worth thinking about how to set yourself up to be able to do it if you do find yourself in such a situation; you might find it harder than you expect.

(Personal digression: I also notice my own brain paying a lot more attention to my personal finances than I think is justified. Maybe some of this traces back to some kind of trauma response to being unemployed for a very stressful ~6 months after graduating: the sense that I could just always be a little more financially secure. A couple weeks ago, while meditating, it occurred to me that my brain is probably reacting to not knowing how I'm doing relative to my goal, because 1) I didn't actually know what my goal is, and 2) I didn't really have a sense of what I was spending each month. In IFS terms, I think the "social and physical security" part of my brain wasn't trusting that the rest of my brain was competently handling the situation.)

So, I think people in general would benefit from having an explicit target: once I have X in savings, I can feel financially secure. This probably means explicitly tracking your expenses, both now and in a "making some reasonable, not-that-painful cuts" budget, and gaming out the most likely scenarios where you'd need to use a large amount of your savings, beyond the classic 3 or 6 months of expenses in an emergency fund. For people motivated by EA principles, the most likely scenarios might be for impact reasons: maybe you take a public-sector job that pays half your current salary for three years, or maybe you'd need to self-fund a new project for a year; how much would it cost to maintain your current level of spending, or a not-that-painful budget-cut version? Then you could target that amount (in addition to the emergency fund, so you'd still have that at the end of the period); once you have that, you could feel more secure/spend less brain space on money, donate more of your income, and be ready to jump on a high-impact, low-paying opportunity.
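To make that arithmetic concrete, here's a minimal sketch of the target calculation. Every number is a hypothetical placeholder you'd swap for your own figures; this is an illustration of the idea above, not financial advice.

```python
# Rough sketch of a savings target: emergency fund plus the shortfall from one
# "impact scenario." All numbers below are hypothetical placeholders.

monthly_spend_now = 4000      # current monthly spending (hypothetical)
monthly_spend_lean = 3200     # after not-that-painful cuts (hypothetical)

# Classic emergency fund: 6 months of lean spending.
emergency_fund = 6 * monthly_spend_lean

# Impact scenario: e.g. a public-sector job paying roughly half your current
# take-home for three years. Cover the monthly gap for the scenario's length.
scenario_monthly_income = 3000    # hypothetical: half of current take-home
scenario_months = 36
scenario_gap = max(0, monthly_spend_lean - scenario_monthly_income) * scenario_months

# Target = emergency fund (still intact at the end of the period) + scenario shortfall.
savings_target = emergency_fund + scenario_gap
print(f"Emergency fund: ${emergency_fund:,}")
print(f"Scenario shortfall: ${scenario_gap:,}")
print(f"Savings target: ${savings_target:,}")
```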

Of course, you can more easily hit that target if you can bring down your expenses -- you both lower the required amount in savings and you save more each month. So, maybe some readers would also benefit from cutting back a bit, though I think most EAs are pretty thrifty already.

(This is hardly novel -- Ben Todd was publishing related stuff on 80k in 2015. But I guess I had to rediscover it, so posting here in case anyone else could use the refresher.)
