tlevin

AI Governance Program Associate @ Open Philanthropy
1534 karma · Working (0-5 years)

Bio

(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil's resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I'm also a proud GWWC pledger and vegan.

Comments
108

Fwiw, I think the main thing getting missed in this discourse is that if even 3 out of your 50 speakers (especially if they're near the top of the bill) are mostly known for a cluster of edgy views that are not welcome in most similar spaces, people who really want to gather to discuss those edgy and typically unwelcome views will be a seriously disproportionate share of attendees, and this will have significant repercussions for the experience of the attendees who were primarily interested in the other 47 speakers.

I recommend the China sections of this recent CNAS report as a starting point for discussion (it's definitely from a relatively hawkish perspective, and I don't think of myself as having enough expertise to endorse it, but I did move in this direction after reading). 

From the executive summary:

Taken together, perhaps the most underappreciated feature of emerging catastrophic AI risks from this exploration is the outsized likelihood of AI catastrophes originating from China. There, a combination of the Chinese Communist Party’s efforts to accelerate AI development, its track record of authoritarian crisis mismanagement, and its censorship of information on accidents all make catastrophic risks related to AI more acute.

From the "Deficient Safety Cultures" section:

While such an analysis is of relevance in a range of industry- and application-specific cultures, China’s AI sector is particularly worthy of attention and uniquely predisposed to exacerbate catastrophic AI risks [footnote]. China’s funding incentives around scientific and technological advancement generally lend themselves to risky approaches to new technologies, and AI leaders in China have long prided themselves on their government’s large appetite for risk—even if there are more recent signs of some budding AI safety consciousness in the country [footnote, footnote, footnote]. China’s society is the most optimistic in the world on the benefits and risks of AI technology, according to a 2022 survey by the multinational market research firm Institut Public de Sondage d’Opinion Secteur (Ipsos), despite the nation’s history of grisly industrial accidents and mismanaged crises—not least its handling of COVID-19 [footnote, footnote, footnote, footnote]. The government’s sprint to lead the world in AI by 2030 has unnerving resonances with prior grand, government-led attempts to accelerate industries that have ended in tragedy, as in the Great Leap Forward, the commercial satellite launch industry, and a variety of Belt and Road infrastructure projects [footnote, footnote, footnote]. China’s recent track record in other high-tech sectors, including space and biotech, also suggests a much greater likelihood of catastrophic outcomes [footnote, footnote, footnote, footnote, footnote].

From "Further Considerations":

In addition to having to grapple with all the same safety challenges that other AI ecosystems must address, China’s broader tech culture is prone to crisis due to its government’s chronic mismanagement of disasters, censorship of information on accidents, and heavy-handed efforts to force technological breakthroughs. In AI, these dynamics are even more pronounced, buoyed by remarkably optimistic public perceptions of the technology and Beijing’s gigantic strategic gamble on boosting its AI sector to international preeminence. And while both the United States and China must reckon with the safety challenges that emerge from interstate technology competitions, historically, nations that perceive themselves to be slightly behind competitors are willing to absorb the greatest risks to catch up in tech races [footnote]. Thus, even while the United States’ AI edge over China may be a strategic advantage, Beijing’s self-perceived disadvantage could nonetheless exacerbate the overall risks of an AI catastrophe.

Yes, but it's kind of incoherent to talk about the dollar value of something without having a budget and an opportunity cost; it has to be your willingness-to-pay, not some dollar value in the abstract. Like, it's not the case that the EA funding community would pay $500B even for huge wins like malaria eradication, an end to factory farming, a robust AI alignment solution, etc., because it's impossible: we don't have $500B.

And I haven't thought about this much but it seems like we also wouldn't pay, say, $500M for a 1-in-1000 chance for a "$500B win" because unless you're defining "$500B win" with respect to your actual willingness-to-pay, you might wind up with many opportunities to take these kinds of moonshots and quickly run out of money. The dollar size of the win still has to ultimately account for your budget.

Well, it implies you could change the election with those amounts if you knew exactly how close the election would be in each state and spent optimally. But if you figure the estimates are off by an OOM, and half of your spending goes to states that turn out not to be useful (which matches a ~30 min analysis I did a few months ago), and you have significant diminishing returns such that $10M-$100M is 3x less impactful than the first $10M and $100M-$1B is another 10x less impactful, you still get:

  • First $10M is ~$10k per key vote = 1,000 votes (enough to swing 2000)
  • Next $90M is ~$30k per key vote = 3,000 votes
  • Next $900M is ~$90k per key vote = 10,000 votes

I think if you think there's a major difference between the candidates, you might put a value on the election in the billions -- let's say $10B for the sake of calculation; so the first $10M would be worth it if there's a 0.1% chance the election is decided by <1000 votes (which of course happened 6 elections ago!), the next $90M is worth it if there's a 0.9% chance the election is decided by >1000 but <4000 votes, and the next $900M is worth it if there's a 9% chance the election is decided by >4000 but <14000 votes. IMO the first two probably pass and the last one probably doesn't, but idk.
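To make the arithmetic above easy to check or tweak, here is a minimal sketch in Python. It takes the tier sizes and cost-per-key-vote figures exactly as written in the bullets and the $10B placeholder value, and computes the break-even probability for each tier as spend divided by election value. The names `TIERS`, `ELECTION_VALUE`, and `break_even_table` are just illustrative, and treating each tier as covering its own band of vote margins is an assumption carried over from the paragraph above, not a real model of elections.

```python
# Tiers as stated in the bullets above: (spend in dollars, assumed cost per key vote).
TIERS = [
    (10_000_000, 10_000),    # first $10M at ~$10k per key vote
    (90_000_000, 30_000),    # next $90M at ~$30k per key vote
    (900_000_000, 90_000),   # next $900M at ~$90k per key vote
]
ELECTION_VALUE = 10_000_000_000  # the $10B "for the sake of calculation" figure


def break_even_table(tiers, election_value):
    """Return (votes, margin_low, margin_high, break_even_probability) per tier."""
    rows = []
    margin_low = 0
    for spend, cost_per_vote in tiers:
        votes = spend // cost_per_vote
        margin_high = margin_low + votes
        # A tier pays off in expectation when
        # P(margin lands in this band) * election_value >= spend.
        break_even = spend / election_value
        rows.append((votes, margin_low, margin_high, break_even))
        margin_low = margin_high
    return rows


for votes, low, high, p in break_even_table(TIERS, ELECTION_VALUE):
    print(f"{votes:,} votes covers margins {low:,}-{high:,}; break-even probability {p:.1%}")
```

Changing `ELECTION_VALUE` or the per-vote costs shows how sensitive the conclusion is to those guesses: halving the assumed value of the election doubles every break-even probability.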

It seems like you might be under-weighing the cumulative amount of resources - even if you have some pretty heavy decay rate (which it's unclear you should -- usually we think of philanthropic investments compounding over time), avoiding nuclear war was a top global priority for decades, and it feels like we have a lot of intellectual and policy "legacy infrastructure" from that.

Yeah, this is all pretty compelling, thanks!

I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable.

I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions. Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies, as executed in practice, do more harm than good: the negative effects of associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who lean strongly on signals of credibility and consensus when quickly evaluating policy options, may be larger than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from the psych, political science, sociology, econ, etc. literatures, rather than specific case studies, due to reference-class-tennis-type issues).

Yes, some regulations backfire, and this is a good flag to keep in mind when designing policy, but to actually make the reference-class argument here work, you'd have to show that this is what we should expect from AI policy, which would include showing that failures like NEPA are either much more relevant for the AI case or more numerous than other, more successful regulations, like (in my opinion) the Clean Air Act, Sarbanes-Oxley, bans on CFCs or leaded gasoline, etc. I know it's not quite as simple as "I would simply design good regulations instead of bad ones," but it's also not as simple as "some regulations are really counterproductive, so you shouldn't advocate for any." Among other things, this assumes that nobody else will be pushing for really counterproductive regulations!

This post correctly identifies some of the major obstacles to governing AI, but ultimately makes an argument for "by default, governments will not regulate AI well," rather than the claim implied by its title, which is that advocating for (specific) AI regulations is net negative -- a type of fallacious conflation I recognize all too well from my own libertarian past.

Interesting! I actually wrote a piece on "the ethics of 'selling out'" in The Crimson almost 6 years ago (jeez) that was somewhat more explicit in its EA justification, and I'm curious what you make of those arguments.

I think randomly selected Harvard students (among those who have the option to do so) deciding to take high-paying jobs and donate double-digit percentages of their salary to places like GiveWell is very likely better for the world than the random-ish other things they might have done, and for that reason I strongly support this op-ed. But I think for undergrads who are really committed to doing the most good, there are two things I would recommend instead. Both route through developing a solid understanding of the most important and tractable problems in the world, via reading widely, asking good questions of knowledgeable people, doing their own writing and seeking feedback, and probably aggressively networking among the people working on these problems.

This enables much more effective earning to give — I think very plugged-in and reasonably informed donors can outperform even top grantmaking organizations in various ways, including helping organizations diversify their funding, moving faster, spotting opportunities that the grantmakers don't, etc. 

And it's also basically necessary for doing direct work on the world's most important problems. I think the generic advice to earn to give misses the huge variation in performance between individuals in direct work; if I understand correctly, 80k agrees with this and thinks this should have been much more emphasized in their early writing and advice. Many Harvard students, in my view, could relatively quickly become excellent in roles like think tank research in AI policy or biosecurity or operations at very impactful organizations. A smaller but nontrivial number could be excellent researchers on important philosophical or technical questions. I think it takes a lot of earning potential to beat those.
