3398 karma | Joined | Working (6-15 years) | Reykjavik, Iceland


Now: TYPE III AUDIO; Independent study.

Previously: 80,000 Hours (2014-15; 2017-2021) Worked on web development, product management, strategy, internal systems, IT security, etc. Read my CV.

Also: Inbox When Ready; Radio Bostrom; The Valmy; Comment Helper for Google Docs.



I also don't see any evidence for the claim of EA philosophers having "eroded the boundary between this kind of philosophizing and real-world decision-making".

Have you visited the 80,000 Hours website recently?

I think that effective altruism centrally involves taking the ideas of philosophers and using them to inform real-world decision-making. I am very glad we’re attempting this, but we must recognise that this is an extraordinarily risky business. Even the wisest humans are unqualified for this role. Many of our attempts are 51:49 bets at best—sometimes worth trying, rarely without grave downside risk, never without an accompanying imperative to listen carefully for feedback from the world. And yes—diverse, hedged experiments in overconfidence also make sense. And no, SBF was not hedged anything like enough to take his 51:49 bets—to the point of blameworthy, perhaps criminal negligence.

A notable exception to the “we’re mostly clueless” situation is: catastrophes are bad. This view passes the “common sense” test, and the “nearly all the reasonable takes on moral philosophy” test too (negative utilitarianism is the notable exception). But our global resource allocation mechanisms are not taking “catastrophes are bad” seriously enough. So, EA—along with other groups and individuals—has a role to play in pushing sensible measures to reduce catastrophic risks up the agenda (as well as the sensible disaster mitigation prep).

(Derek Parfit’s “extinction is much worse than 99.9% wipeout” claim is far more questionable—I put some of my chips on this, but not the majority.)

As you suggest, the transform function from “abstract philosophical idea” to “what do” is complicated and messy, and involves a lot of deference to existing norms and customs. Sadly, I think that many people with a “physics and philosophy” sensibility underrate just how complicated and messy the transform function really has to be. So they sometimes make bad decisions on principle instead of good decisions grounded in messy common sense.

I’m glad you shared the J.S. Mill quote.

…the beliefs which have thus come down are the rules of morality for the multitude, and for the philosopher until he has succeeded in finding better

EAs should not be encouraged to grant themselves practical exception from “the rules of morality for the multitude” if they think of themselves as philosophers. Genius, wise philosophers are extremely rare (cold take: Parfit wasn’t one of them).

To be clear: I am strongly in favour of attempts to act on important insights from philosophy. I just think that this is hard to do well. One reason is that there is a notable minority of “physics and philosophy” folks who should not be made kings, because their “need for systematisation” is so dominant as to be a disastrous impediment for that role.

In my other comment, I shared links to Karnofsky, Beckstead and Cowen expressing views in the spirit of the above. From memory, Carl Shulman is in a similar place, and so are Alexander Berger and Ajeya Cotra.

My impression is that more than half of the most influential people in effective altruism are roughly where they should be on these topics, but some of the top “influencers”, and many of the “second tier”, are not.

(Views my own. Sword meme credit: the artist currently known as John Stewart Chill.)

Bret Taylor and Larry Summers (members of the current OpenAI board) have responded to Helen Toner and Tasha McCauley in The Economist.

The key passages:

Helen Toner and Tasha McCauley, who left the board of OpenAI after its decision to reverse course on replacing Sam Altman, the CEO, last November, have offered comments on the regulation of artificial intelligence (AI) and events at OpenAI in a By Invitation piece in The Economist.

We do not accept the claims made by Ms Toner and Ms McCauley regarding events at OpenAI. Upon being asked by the former board (including Ms Toner and Ms McCauley) to serve on the new board, the first step we took was to commission an external review of events leading up to Mr Altman’s forced resignation. We chaired a special committee set up by the board, and WilmerHale, a prestigious law firm, led the review. It conducted dozens of interviews with members of OpenAI's previous board (including Ms Toner and Ms McCauley), OpenAI executives, advisers to the previous board and other pertinent witnesses; reviewed more than 30,000 documents; and evaluated various corporate actions. Both Ms Toner and Ms McCauley provided ample input to the review, and this was carefully considered as we came to our judgments.

The review’s findings rejected the idea that any kind of AI safety concern necessitated Mr Altman’s replacement. In fact, WilmerHale found that “the prior board’s decision did not arise out of concerns regarding product safety or security, the pace of development, OpenAI's finances, or its statements to investors, customers, or business partners.”

Furthermore, in six months of nearly daily contact with the company we have found Mr Altman highly forthcoming on all relevant issues and consistently collegial with his management team. We regret that Ms Toner continues to revisit issues that were thoroughly examined by the WilmerHale-led review rather than moving forward.

Ms Toner has continued to make claims in the press. Although perhaps difficult to remember now, OpenAI released ChatGPT in November 2022 as a research project to learn more about how useful its models are in conversational settings. It was built on GPT-3.5, an existing AI model which had already been available for more than eight months at the time.

Andrew Mayne points out that “the base model for ChatGPT (GPT 3.5) had been publicly available via the API since March 2022”.

On (1): it's very unclear how ownership could be compatible with no financial interest.

Maaaaaybe (2) explains it. That is: while ownership does legally entail financial interest, it was agreed that this was only a pragmatic stopgap measure, such that in practice Sam had no financial interest.

For context:

  1. OpenAI claims that while Sam owned the OpenAI Startup Fund, there was “no personal investment or financial interest from Sam”.
  2. In February 2024, OpenAI said: “We wanted to get started quickly and the easiest way to do that due to our structure was to put it in Sam's name. We have always intended for this to be temporary.”
  3. In April 2024 it was announced that Sam no longer owns the fund.

Sam didn't inform the board that he owned the OpenAI Startup Fund, even though he constantly was claiming to be an independent board member with no financial interest in the company.

Sam has publicly said he has no equity in OpenAI. I've not been able to find public quotes where Sam says he has no financial interest in OpenAI (does anyone have a link?).

From the interview:

When ChatGPT came out, November 2022, the board was not informed in advance about that. We learned about ChatGPT on Twitter.

Several sources have suggested that the ChatGPT release was not expected to be a big deal. Internally, ChatGPT was framed as a “low-key research preview”. From The Atlantic:

The company pressed forward and launched ChatGPT on November 30. It was such a low-key event that many employees who weren’t directly involved, including those in safety functions, didn’t even realize it had happened. Some of those who were aware, according to one employee, had started a betting pool, wagering how many people might use the tool during its first week. The highest guess was 100,000 users.

If that's true, then perhaps it wasn’t ex ante above the bar to report to the board.

You wrote:

[OpenAI do] very little public discussion of concrete/specific large-scale risks of their products and the corresponding risk-mitigation efforts (outside of things like short-term malicious use by bad API actors, where they are doing better work).

This doesn't match my impression.

For example, Altman signed the CAIS AI Safety Statement, which reads:

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

The “Preparedness” page—linked from the top navigation menu on their website—starts:

The study of frontier AI risks has fallen far short of what is possible and where we need to be. To address this gap and systematize our safety thinking, we are adopting the initial version of our Preparedness Framework. It describes OpenAI’s processes to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models.

The page mentions “cybersecurity, CBRN (chemical, biological, radiological, nuclear threats), persuasion, and model autonomy”. The framework itself goes into more detail, proposing scorecards for assessing risk in each category. They define "catastrophic risk" as "any risk which could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals—this includes, but is not limited to, existential risk". The phrase "millions of deaths" appears in one of the scorecards.

Their “Planning for AGI & Beyond” blog post describes the risks as "existential"; I quote the relevant passage in another comment.

On their “Safety & Alignment” blog they highlight recent posts called Reimagining secure infrastructure for advanced AI and Building an early warning system for LLM-aided biological threat creation.

My sense is that there are many other examples, but I'll stop here for now.

OpenAI's “Planning for AGI & Beyond” blog post includes the following:

As our systems get closer to AGI, we are becoming increasingly cautious with the creation and deployment of our models. Our decisions will require much more caution than society usually applies to new technologies, and more caution than many users would like. Some people in the AI field think the risks of AGI (and successor systems) are fictitious; we would be delighted if they turn out to be right, but we are going to operate as if these risks are existential.

At some point, the balance between the upsides and downsides of deployments (such as empowering malicious actors, creating social and economic disruptions, and accelerating an unsafe race) could shift, in which case we would significantly change our plans around continuous deployment.


The first AGI will be just a point along the continuum of intelligence. We think it’s likely that progress will continue from there, possibly sustaining the rate of progress we’ve seen over the past decade for a long period of time. If this is true, the world could become extremely different from how it is today, and the risks could be extraordinary. A misaligned superintelligent AGI could cause grievous harm to the world; an autocratic regime with a decisive superintelligence lead could do that too.


Successfully transitioning to a world with superintelligence is perhaps the most important—and hopeful, and scary—project in human history. Success is far from guaranteed, and the stakes (boundless downside and boundless upside) will hopefully unite all of us.

Altman signed the CAIS AI Safety Statement, which reads:

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

In 2015 he wrote a blog post which begins:

Development of superhuman machine intelligence (SMI) is probably the greatest threat to the continued existence of humanity.

Yes. (4) and (11) are also very much "citation needed". My sense is that they would need to be significantly moderated to fit the facts (e.g. the profit cap is still a thing).
