
Dwarkesh's summary:

Paul Christiano is the world’s leading AI safety researcher. My full episode with him is out!

We discuss:

  • Does he regret inventing RLHF, and is alignment necessarily dual-use?
  • Why he has relatively modest timelines (40% by 2040, 15% by 2030),
  • What do we want the post-AGI world to look like (do we want to keep gods enslaved forever)?
  • Why he’s leading the push to get labs to develop responsible scaling policies, and what it would take to prevent an AI coup or bioweapon,
  • His current research into a new proof system, and how this could solve alignment by explaining a model's behavior,
  • and much more.
