
My own take on the AI Safety Classic arguments is that o3/Sonnet 3.7 have convinced me that the "alignment is very easy" hypothesis is looking a lot shakier than it used to, and I suspect future capabilities progress is likely to be at best neutral, and probably worse, for alignment being very easy.

I do think you can still remain optimistic based on other cases, but a pretty core crux is that I think alignment does need to be solved if AIs are able to automate the economy, and this is pretty robust to variations on what happens with AI.

The big reason for this is that once your labor is valueless, but your land/capital isn't, you have fundamentally knocked out a load-bearing pillar of the argument that expropriation is less useful than trade.

This is, to a first approximation, why we enslave or kill most non-human species rather than trading with them.

(For farm animals, their labor is useful, but the stuff lots of humans want from animals fundamentally requires expropriation/violating farm animal property rights)

A good scenario for what happens if we fail is at minimum the intelligence curse scenario elaborated on by Rudolf Laine and Luke Drago below:

https://intelligence-curse.ai/defining/


Epoch AI alumni launch Mechanize to "automate the whole economy"

by Henry Stanley 🔸

Apr 18 · 1 min read · 52 comments · 103 reactions

AI safety, Community, Existential risk, Announcements and updates, Public communication on AI safety, Twitter

Frontpage


Three Epoch employees – Matthew Barnett, Tamay Besiroglu, and Ege Erdil – have left to launch Mechanize, an AI startup aiming for broad automation of ordinary labour:

Today we’re announcing Mechanize, a startup focused on developing virtual work environments, benchmarks, and training data that will enable the full automation of the economy.
We will achieve this by creating simulated environments and evaluations that capture the full scope of what people do at their jobs. ...
Currently, AI models have serious shortcomings that render most of this enormous value out of reach. They are unreliable, lack robust long-context capabilities, struggle with agency and multimodality, and can’t execute long-term plans without going off the rails.
To overcome these limitations, Mechanize will produce the data and evals necessary for comprehensively automating work. Our digital environments will act as practical simulations of real-world work scenarios, enabling agents to learn useful abilities through RL. ...
The explosive economic growth likely to result from completely automating labor could generate vast abundance, much higher standards of living, and new goods and services that we can’t even imagine today. Our vision is to realize this potential as soon as possible.

Tweet from Matthew Barnett:

I started a new company with @egeerdil2 and @tamaybes that's focused on automating the whole economy. We're taking a big bet on our view that the main value of AI will come from broad automation rather than from "geniuses in a data center".

The Mechanize website is scant on detail. It seems broadly bad that the alumni from a safety-focused AI org have left to form a company which accelerates AI timelines (and presumably is based on/uses evals built at Epoch).

It seems noteworthy that Epoch AI retweeted the announcement, wishing the departing founders best of luck – which feels like a tacit endorsement of the move.

Habryka wonders whether payment would have had to be given to Epoch for use of their benchmarks suite.


JWS 🔸 · Apr 19 · 46

I'm not sure I feel as concerned about this as others. tl;dr - They have different beliefs from Safety-concerned EAs, and their actions are a reflection of those beliefs.

It seems broadly bad that the alumni from a safety-focused AI org

Was Epoch ever a 'safety-focused' org? I thought they were trying to understand what's happening with AI, not taking a position on Safety per se.

...have left to form a company which accelerates AI timelines

I think Matthew and Tamay think this is positive, since they think AI is positive. As they say, they think explosive growth can be translated into abundance. They don't think that the case for AI risk is strong, or significant, especially given the opportunity cost they see from leaving abundance on the table.

Also important to note is what Epoch boss Jaime says in this very comment thread.

As I learned more and the situation unfolded I have become more skeptical of AI Risk.

The same thing seems to be happening with me, for what it's worth.

People seem to think that there is an 'EA Orthodoxy' on this stuff, but there either isn't as much as people think, or people who disagree with it are no longer EAs. I really don't think it makes sense to clamp down on 'doing anything to progress AI' as being a hill for EA to die on.

Lukas_Gloor · Apr 19* · 43

I think there are two competing failure modes:

(1) The epistemic community around EA, rationality, and AI safety, should stay open to criticism of key empirical assumptions (like the level of risks from AI, risks of misalignments, etc.) in a healthy way.

(2) We should still condemn people who adopt contrarian takes with unreasonable-seeming levels of confidence and then take actions based on them that we think are likely doing damage.

In addition, there's possibly also a question of "how much do people who benefit from AI safety funding and AI safety association have an obligation to not take unilateral actions that most of the informed people in the community consider negative." (FWIW I don't think the obligation here would be absolute even if Epoch had been branded as centrally 'AI safety,' and I acknowledge that the branding issue seems contested; also, it wasn't Jamie [edit: Jaime] the founder who left in this way, and of the people who went off to found this new org, Matthew Barnett, for instance, has been really open about his contrarian takes, so insofar as Epoch's funders had concerns about the alignment of employees at Epoch, it was also -- to some degree, at least -- on them to ask for more information or demand some kind of security guarantee if they felt worried. And maybe this did happen -- I'm just flagging that I don't feel like we onlookers necessarily have the info, and so it's not clear whether anyone has violated norms of social cooperation here or whether we're just dealing with people getting close to the boundaries of unilateral action in a way that is still defensible because they've never claimed to be more aligned than they were, never accepted funding that came with specific explicit assumptions, etc.)

Jason · Apr 19* · 21

or whether we're just dealing with people getting close to the boundaries of unilateral action in a way that is still defensible because they've never claimed to be more aligned than they were, never accepted funding that came with specific explicit assumptions, etc.)

Caveats up front: I note the complexity of figuring out what Epoch's own views are, as opposed to Jaime's [corrected spelling] view or the views of the departing employees. I also do not know what representations were made. Therefore, I am not asserting that Epoch did something or needs to do something, merely that the concern described below should be evaluated.

People and organizations change their opinions all the time. One thing I'm unclear on is whether there was a change in position here that created an obligation to offer to return and/or redistribute unused donor funds.

I note that, in February 2023, Epoch was fundraising through September 2025. I don't know its cash flows, but I cite that to show it is plausible they were operating on safety-focused money obtained before a material change to less safety-focused views. In other words, the representations to donors may have been appropriate when the money was raised but outdated by the time it was spent.

I think it's fair to ask whether a donor would have funded a longish runway if it had known the organization's views would change by the time the monies were spent. If the answer is "no," that raises the possibility that the organization may be ethically obliged to refund or regrant the unspent grant monies.

I can imagine circumstances in which the answers are no and yes: for instance, suppose the organization was a progressive political advocacy organization that decided to go moderate left instead. It generally will not be appropriate for that org to use progressives' money to further its new stance. On the other hand, any shift here was less pronounced, and there's a stronger argument that the donors got the forecasting/information outputs they paid for.

Anyway, for me all this ties into post-FTX discussions about giving organizations a healthy financial runway. People in those discussions did a good job flagging the downsides of short-term grants without confidence in renewal, as well as the high degree of power funders hold in the ecosystem. But AI is moving fast; this isn't something more stable like anti-malarial work. So the chance of organizational drift seems considerably higher here.

How do we deal with the possibility that honest organizational changes will create an inconsistency with the implicit donor-recipient understanding at the time of grant? I don't claim to have the answer, or to know how to apply it here.

Pablo · Apr 19 · 10

By the way, the name is ‘Jaime’, not ‘Jamie’. The latter doesn't exist in Spanish and the two are pronounced completely differently (they share one phoneme out of five, when aligned phoneme by phoneme).

(I thought I should mention it since the two names often look indistinguishable in written form to people who are not aware that they differ.)

Jaime Sevilla · Apr 19 · 8

Thank you Pablo for defending the integrity of my name -- literally 😆

TFD · Apr 20* · 6

How common is it for such repayments to occur, what do you think the standard would be for the level of clarity of the commitment, and to whom would that commitment have to be made? For example, is there a case that 80k hours should refund payments in light of their pivot to focus on AI? I know there are differences, their funder could support the move etc., but in the spirit of the thing, where is the line here?

Editing to add: One of my interests in this topic is that EA/rationalists seem to have some standards/views that diverge somewhat from what I would characterize as more "mainstream" approaches to these kinds of things. Re-reading the OP, I noticed a detail I initially missed:

Habryka wonders whether payment would have had to be given to Epoch for use of their benchmarks suite.

To me, this does seem like it implicates a more mainstream view of a potential conflict of interest.

JWS 🔸 · Apr 25 · 13

Note - this was written kinda quickly, so might be a bit less tactful than I would write if I had more time.

Making a quick reply here after binge-listening to three Epoch-related podcasts in the last week, and I basically think my original perspective was vindicated. It was kinda interesting to see which points were repeated or phrased a different way - would recommend if you're interested in the topic.

The initial podcast with Jaime, Ege, and Tamay. This clearly positions the Epoch brain trust as between traditional academia and the AI Safety community (AISC). tl;dr - academia has good models but doesn't take AI seriously, and AISC the opposite (from Epoch's PoV)
The 'debate' between Matthew and Ege. This should have clued people in, because while full of good content, by the last hour/hour and a half it almost seemed to turn into 'openly mocking and laughing' at AISC, or at least the traditional arguments. I also don't buy those arguments, but I feel like the reaction Matthew/Ege have shows that they just don't buy the root AISC claims.
The recent Dwarkesh podcast with Ege & Tamay. This is the best of the 3, but probably also best listened to after the first two, since Dwarkesh actually pushes back on quite a few claims, which means Ege & Tamay flesh out their views more - personal highlight was what the reference class for AI Takeover actually means.

Basically, the Mechanize cofounders don't agree at all with 'AI Safety Classic'. I am very confident that they don't buy the arguments at all and that they don't identify with the community, and somewhat confident that they don't respect the community or its intellectual output that much.

Given that their views are: a) AI will be a big deal soon (~a few decades), b) returns to AI will be very large, c) Alignment concerns/AI risks are overrated, and d) Other people/institutions aren't on the ball, then starting an AI Start-up seems to make sense.

What is interesting to note, and something I might look into in the future, is just how much these differences in expectation of AI depend on differences in worldview, rather than differences in technical understanding of ML or understanding of how the systems work on a technical level.

So why are people upset?

Maybe they thought the Epoch people were more part of the AISC than they actually were? Seems like the fault of the people who believed this, not Epoch or the Mechanize founders.
Maybe people are upset that Epoch was funded by OpenPhil, and this seems to have led to 'AI acceleration'? I think that's plausible, but Epoch has still produced high-quality reports and information, which OP presumably wanted them to do. But I don't think equating EA with OP, or with anyone funded by OP, is a useful concept.
Maybe people are upset at any progress in AI capabilities. But that assumes that Mechanize will be successful in its aims, which is not guaranteed. It also seems to reify the concept of 'capabilities' as one big thing, which I don't think makes sense. Making a better Stockfish, or a better AI for FromSoft bosses, does not increase x-risk, for instance.
Maybe people think that the AI Safety Classic arguments are just correct, and therefore that people taking actions contrary to them are doing something bad. But then many actions seem bad by this criterion all the time, so it's odd that this would provoke such a reaction. I also don't think EA should hang its hat on 'AI Safety Classic' arguments being correct anyway.

Probably some mix of these. I personally remain not that upset because a) I didn't really class Epoch as 'part of the community', b) I'm not really sure I'm 'part of the community' either and c) my views are at least somewhat similar to the Epoch set above, though maybe not as far in their direction, so I'm not as concerned object-level either.

Jason · Apr 25 · 6

To steelman this:

Even assuming OP funding != EA, one still might consider OP funding to count as funding from the AI Safety Club (TM), and for the Mechanize critics to be speaking in their capacity as members of the AISC rather than of EA. Being upset that AISC money supported development of people who are now working to accelerate AI seems understandable to me.
Epoch fundraised on the Forum in early 2023 and solicited applications for employment on the Forum as recently as December 2024. Although I don't see any specific references to the AISC in those posts, it wouldn't be unreasonable to assume some degree of alignment from its posting of fundraising and recruitment asks on the Forum without any disclaimer. (However, I haven't heard a good reason to impute Epoch's actions to the Mechanize trio specifically.)

Sharmake · Apr 27 · 3

This is, to a first approximation, why we enslave or kill most non-human species rather than trading with them.

(For farm animals, their labor is useful, but the stuff lots of humans want from animals fundamentally requires expropriation/violating farm animal property rights)

A good scenario for what happens if we fail is at minimum the intelligence curse scenario elaborated on by Rudolf Laine and Luke Drago below:

https://intelligence-curse.ai/defining/

MichaelDickens · Apr 19 · 15

I think Matthew and Tamay think this is positive, since they think AI is positive.

I don't see how this alleviates concern. Sure they're acting consistently with their beliefs*, but that doesn't change the fact that what they're doing is bad.

*I assume, I don't really know

Pablo · Apr 19 · 8

Intuitively, it seems we should respond differently depending on which of these three possibilities is true:

(1) They think that what they are doing is negative for the world, but do it anyway, because it is good for themselves personally.
(2) They do not think that what they are doing is negative for the world, but they believe this due to motivated cognition.
(3) They do not think that what they are doing is negative for the world, and this belief was not formed in a way that seems suspect.

From an act consequentialist perspective, these differences do not matter intrinsically, but they are still instrumentally relevant.[1]

[1] I don't mean to suggest that any one of these possibilities is particularly likely, or that they are all plausible. I haven't followed this incident closely. FWIW, my vague sense is that the Mechanize founders had all expressed skepticism about the standard AI safety arguments for a while, in a way that seems hard to reconcile with (1) or (2).

TFD · Apr 20 · 2

It suggests the concern is an object-level one, not a meta one. The underlying "vibe" I am getting from a lot of these discussions is that the people in question have somehow betrayed EA/the community/something else. That is a meta concern, one of norms. You could "betray" the community even if you are on the AI deceleration side of things. If the people in question or Epoch made a specific commitment that they violated, that would be a "meta" issue, and would be one regardless of their "side" on the deceleration question. Perhaps they did do such a thing, but I haven't seen convincing information suggesting this. I think that really the main explanatory variable here is in fact what "side" this suggests they are on. If that is the case, I think it is worth having clarity about it. People can do a bad thing because they are just wrong in their analysis of a situation or their decision-making. That doesn't mean their actions constitute a betrayal.

Chris Leong · Apr 18 · 9

I've written a short-form here as well.

mal_graham🔸 · Apr 19 · 5

Responding here for greater visibility -- I'm responding to the idea in your short-form that the lesson from this is to hire for greater value alignment.

Epoch's founder has openly stated that their company culture is not particularly fussed about most AI risk topics [edit: they only stated this today, making the rest of my comment here less accurate; see thread]. Key quotes from that post:

"on net I support faster development of AI, so we can benefit earlier from it."
"I am not very concerned about violent AI takeover. I am concerned about concentration of power and gradual disempowerment."

So I'm not sure this is that much of a surprise? It's at least not totally obvious that Mechanize's existence is contrary to those values.

As a result, I'm not sure the lesson is "EA orgs should hire for value alignment." I think most EAs just didn't understand what Epoch's values were. If that's right, the lesson is that the EA community shouldn't assume that an organization that happens to work adjacent to AI safety actually cares about it. In part, that's a lesson for funders to not just look at the content of the proposal in front of you, but also what the org as a whole is doing.

Jaime Sevilla · Apr 19* · 37

Epoch's founder has openly stated that their company culture is not particularly fussed about most AI risk topics

To be clear, my personal views are different from those of my employees or our company. We have a plurality of views within the organisation (which I think is important for our ability to figure out what will actually happen!)

I co-started Epoch to get more evidence on AI and AI risk. As I learned more and the situation unfolded I have become more skeptical of AI Risk. I tried to be transparent about this, though I've changed my mind often and it is time-consuming to communicate every update.

I also strive to make Epoch work relevant and useful to people regardless of their views. E.g., both AI 2027 and Situational Awareness rely heavily on Epoch work, even though I disagree with their perspectives. You don't need to agree with what I believe to find our work useful!

Jason · Apr 19 · 6

That post was written today though -- I think the lesson to be learned depends on whether those were always the values vs. a change from what was espoused at the time of funding.

mal_graham🔸 · Apr 19 · 9

Oh whoops, I was looking for a tweet they wrote a while back and confused it with the one I linked. I was thinking of this one, where he states that "slowing down AI development" is a mistake. But I'm realizing that this was also only in January, when the OpenAI funding thing came out, so doesn't necessarily tell us much about historical values.

I suppose you could interpret some tweets like this or this in a variety of ways but it now reads as consistent with "don't let AI fear get in the way of progress" type views. I don't say this to suggest that EA funders should have been able to tell ages ago, btw, just trying to see if there's any way to get additional past data.

Another fairly relevant thing to me is that their work is on benchmarking and forecasting potential outcomes, something that doesn't seem directly tied to safety and which is also clearly useful to accelerationists. As a relative outsider to this space, it surprises me much less that Epoch would be mostly made up of folks interested in AI acceleration or at least neutral towards it, than if I found out that some group researching something more explicitly safety-focused had those values. Maybe the takeaway there is that if someone is doing something that is useful both to acceleration-y people and safety people, check the details? But perhaps that's being overly suspicious.

mal_graham🔸 · Apr 19 · 5

And I guess also more generally, again from a relatively outside perspective, it's always seemed like AI folks in EA have been concerned with both gaining the benefits of AI and avoiding X risk. That kind of tension was at issue when this article blew up here a few years back and seems to be a key part of why the OpenAI thing backfired so badly. It just seems really hard to combine building the tool and making it safe into the same movement; if you do, I don't think stuff like Mechanize coming out of it should be that surprising, because your party will have guests who only care about one thing or the other.

Henry Stanley 🔸 · Apr 18 · 2

Interesting that you chose not to name the org in question - I guess you wanted to focus on the meta-level principle rather than this specific case

Chris Leong · Apr 18 · 4

Maybe I should have. I honestly don't know. I didn't think deeply about it.

Marcus Abramovitch 🔸 · Apr 25 · 8

I think people are, generally speaking, being too simplistic about the distinction between "capabilities" and "alignment". I assume most people on the forum use ChatGPT/Claude or other LLM apps and don't think they pose, in their current form, much of a safety concern.

I am far more concerned about "geniuses in a data center", which Dario/Sam seem to be pushing for, than I am about more economically useful AI.

I furthermore think that Matthew and, to a lesser extent, Tamay and Ege have engaged significantly more with AI risk arguments than most people.

Disclosure: I'm one of the investors in Mechanize

Yadav · Apr 18 · 8

EDIT: I did not read the entire thing and now realise the author of this post said the same. I will still keep my feelings around this public.

Hmm. This seems like a strange thing to work towards? Perhaps even harmful. Is this not just trying to push SOTA?

(Perhaps strange is not the right word to use here. I could see many reasons why you would want to do this, but I guess I had the intuition that people at Epoch would not want to do this).

Sharmake · Apr 19* · -3

To be honest, I don't necessarily think it's as bad as people claim, though I still don't think it was a great action relative to available alternatives, and is at best not the best thing you could decide on for making AI safe, relative to other actions.

One of my core issues, and a big crux here is that I don't really believe that you can succeed at the goal of automating the whole economy with cheap robots without also allowing actors to speed up the race to superintelligence/superhuman AI researchers a lot.

And if we put any weight on misalignment, we should be automating AI safety, not AI capabilities, so this is quite bad.

Jaime Sevilla admits that the reason he supports Mechanize's effort is for selfish reasons:

https://x.com/Jsevillamol/status/1913276376171401583

I selfishly care about me, my friends and family benefitting from AI. For some of my older relatives, it might make a big difference to their health and wellbeing whether AI-fueled explosive growth happens in 10 vs 20 years.

Edit: @Jaime Sevilla has stated that he won't go to Mechanize, and will stay at Epoch, sorry for any confusion.

[This comment is no longer endorsed by its author]

Jaime Sevilla · Apr 19 · 45

Saying that I personally support faster AI development because I want people close to me to benefit is not the same as saying I'm working at Epoch for selfish reasons.

I've had opportunities to join major AI labs, but I chose to continue working at Epoch because I believe the impact of this work is greater and more beneficial to the world.

That said, I’m also frustrated by the expectation that I must pretend not to prioritize those closest to me. I care more about the people I love, and I think that’s both normal and reasonable—most people operate this way. That doesn’t mean I don’t care about broader impacts too.

Rebecca · Apr 19 · 27

I don't think people are expecting you to pretend to not hold the values that you do, rather they're disappointed that you hold those values, as welfare impartiality is a core value for a lot of EAs.

Jeff Kaufman 🔸 · Apr 19 · 46

I don't think impartiality to the extent of not caring more about the people one loves is a core value for very many EAs? Yes, it's pretty central to EA that most people are excessively partial, but I don't recall ever seeing someone advocate full impartiality.

Jason · Apr 19 · 38

Some of the reaction here may be based on Jaime acting in a professional, rather than a personal, capacity when working in AI.

There are a number of jobs and roles that expect your actions in a professional capacity to be impartial in the sense of not favoring your loved ones over others. For instance, a politician should not give any more weight to the effects of proposed legislation on their own mother than the effect on any other constituent. Government service in general has this expectation. One could argue that (like serving as a politician), working in AI involves handing out significant risks and harms to non-consenting others -- and that should trigger a duty of impartiality.

Government workers and politicians are free to favor their own mother in their personal life, of course.

TFD · Apr 20 · 11

It seems like the view expressed reduces to an existing-person-affecting view. Is there any plausible mechanism by which an action by Epoch is supposed to impact Sevilla's friends/relatives specifically? I seriously doubt it. The only plausible mechanism would be that AI goes well instead of poorly, which would benefit all existing people. This makes the politician comparison, as stated, disanalogous. Would you say that if a politician said their motivation to become a politician was to make a better world for their children, for example, that would somehow violate their duties? Seems like a lot of politicians might have an issue if that were the case.

I think this suggests a risk that the real infraction here is honestly stating the consideration about friends and family. Is it really the case that no one leading AI safety orgs that are aiming for deceleration is motivated, at least partly, by the desire to protect their own friends and family from the consequences of AI going poorly? I will confess that is a big part of my own reasons for being interested in this topic. I would be very surprised if the standard being suggested here was really as ubiquitous as these comments suggest.

Rebecca · Apr 20 · 27

I’d agree that a lot of people who care about AI safety do so because they want to leave the world a better place for their children (which encompasses their children’s wellbeing related to being parents themselves and having to worry about their own children’s future). But there’s no trade off between personal and impartial preferences there. That seems to me to be quite different from saying you’re prioritising eg your parents and grandparents getting to have extended lifespans over other people’s children’s wellbeing.

The discussion also isn’t about the effects of Epoch’s specific work, so I’m a bit confused by your argument relying on that.

From Jaime:

“But I want to be clear that even if you convinced me somehow that the risk that AI is ultimately bad for the world goes from 15% to 1% if we wait 100 years I would not personally take that deal. If it reduced the chances by a factor of 100 I would consider it seriously. But 100 years has a huge personal cost to me, as all else equal it would likely imply everyone I know [italics mine] being dead. To be clear I don't think this is the choice we are facing or we are likely to face.”

TFD · Apr 20 · 4

But there’s no trade off between personal and impartial preferences there. That seems to me to be quite different from saying you’re prioritising eg your parents and grandparents getting to have extended lifespans over other people’s children’s wellbeing.

I can see why you would interpret it this way given the context, but I read the statement differently. Based on my read of the thread, the comment was in response to a question about benefiting people sooner rather than later. This is why I say it reduces to an existing-person-affecting view (which, at least as far as I am aware, is not an unacceptable position to hold in EA). The question is functionally about current vs future people, not literally Sevilla's friends and family specifically. I think this matches the "making the world better for your children" idea. You can channel a love of friends and family into an altruistic impulse, so long as there isn't some specific conflict-of-interest where you're benefiting them specifically. I think the statement in question is consistent with that.

The discussion also isn’t about the effects of Epoch’s specific work, so I’m a bit confused by your argument relying on that.

I'm bringing this up because I think it's implausible that anything that is being discussed here has some specific relevance to Sevilla's friends and family as individuals (in support of my point above). In other words, due to the nature of the actions being taken

there’s no trade off between personal and impartial preferences there

In what way are any concrete actions that are relevant here prioritizing Sevilla's family over other people's children? Although I can see how it might initially seem that way I don't think that's what the statement was intended to communicate.

Rebecca · Apr 20 · 6

Have you read the whole Twitter thread including Jaime’s responses to comments? He repeatedly emphasises that it’s about his literal friends, family and self, and hypothetical moderate but difficult trade offs with the welfare of others.

TFD · Apr 20 · 4

When I click the link I see three posts that go Sevilla, Lifland, Sevilla. I based my comments above on those. I haven't read through all the other replies by others or posts responding to them. If there is context in those or elsewhere that is relevant, I'm open to changing my mind based on that.

He repeatedly emphasises that it’s about his literal friends, family and self, and hypothetical moderate but difficult trade offs with the welfare of others.

Can you say what statements lead you to this conclusion? For example, you quote him saying something I haven't seen, perhaps part of the thread I didn't read.

“But I want to be clear that even if you convinced me somehow that the risk that AI is ultimately bad for the world goes from 15% to 1% if we wait 100 years I would not personally take that deal. If it reduced the chances by a factor of 100 I would consider it seriously. But 100 years has a huge personal cost to me, as all else equal it would likely imply everyone I know [italics mine] being dead. To be clear I don't think this is the choice we are facing or we are likely to face.”

To me, this seems to confirm what I said above:

Based on my read of the thread, the comment was in response to a question about benefiting people sooner rather than later. This is why I say it reduces to an existing-person-affecting view (which, at least as far as I am aware, is not an unacceptable position to hold in EA). The question is functionally about current vs future people, not literally Sevilla's friends and family specifically.

Yes, Sevilla is motivated specifically by considerations about those he loves, and yes, there is a trade-off, but that trade-off is really about current vs future people. People who aren't longtermists, for example, would also implicate this same trade-off. I don't think Sevilla would be getting the same reaction here if he just said he isn't a longtermist. Because of the nature of the available actions, the interests of Sevilla's loved-ones are aligned with those of current people (but not necessarily future people). The reason why "everyone [he] know[s]" will be dead is because everyone will be dead, in that scenario.

You might think that having loved-ones as a core motivation above other people is inherently a problem. I think this is answered above by Jeff Kaufman:

I don't think impartiality to the extent of not caring more about the people one loves is a core value for very many EAs? Yes, it's pretty central to EA that most people are excessively partial, but I don't recall ever seeing someone advocate full impartiality.

I agree with this statement. Therefore my view is that simply stating that you're more motivated by consequences to your loved-ones is not, in and of itself, a violation of a core EA idea.

Jason offers a refinement of this view. Perhaps what Kaufman says is true, but what if there is a more specific objection?

There are a number of jobs and roles that expect your actions in a professional capacity to be impartial in the sense of not favoring your loved ones over others. For instance, a politician should not give any more weight to the effects of proposed legislation on their own mother than the effect on any other constituent.

Perhaps the issue is not necessarily that Sevilla has the motivation itself, but that his role comes with a specific conflict-of-interest-like duty, which the statement suggests he is violating. My response was addressing this argument. I claim that the duty isn't as broad as Jason seems to imply:

It seems like the view expressed reduces to an existing-person-affecting view. Is there any plausible mechanism by which an action by Epoch is supposed to impact Sevilla's friends/relatives specifically? I seriously doubt it. The only plausible mechanism would be that AI goes well instead of poorly, which would benefit all existing people. This makes the politician comparison, as stated, disanalogous. Would you say that if a politician said their motivation to become a politician was to make a better world for their children, for example, that would somehow violate their duties? Seems like a lot of politicians might have an issue if that were the case.

Does a politician who votes for a bill and states they are doing so to "make a better world for their children" violate a conflict-of-interest duty? Jason's argument seems to suggest they would. Let's assume they are being genuine: they really are significantly motivated by care for their children, more than for a random citizen. They apply more weight to the impact of the legislation on their children than to others, violating Jason's proposed criterion.

Yet I don't think we would view such statements as disqualifying for a politician. The reason is that the mechanism by which they benefit their children really only operates by also helping everyone else. Most legislation won't have any different impact on their children compared to any other person. So while the statement nominally suggests a conflict-of-interest, in practice the politician's incentives are aligned: the only way that voting for this legislation helps their children is that it helps everyone, and that includes their children. If the legislation plausibly did have a specific impact on their child (for example impacting an industry their child works in), then that really could be a conflict-of-interest. My claim is there needs to be some greater specificity for a conflict to exist. Sevilla's case is more like the first case than the second, or at least that is my claim:

Is there any plausible mechanism by which an action by Epoch is supposed to impact Sevilla's friends/relatives specifically? I seriously doubt it. The only plausible mechanism would be that AI goes well instead of poorly, which would benefit all existing people.

So, what has Sevilla done wrong? My analysis is this. It isn't simply that he is more motivated to help his loved-ones (Kaufman argument). Nor is it something like a conflict-of-interest (my argument). In another comment on this thread I said this:

People can do a bad thing because they are just wrong in their analysis of a situation or their decision-making.

I think, at bottom, the problem is that Sevilla makes a mistake in his analysis and/or decision-making about AI. His statements aren't norm-violating, they are just incorrect (at least some of them are, in my opinion). I think it's worth having clarity about what the actual "problem" is.

RebeccaApr 202

The reason why "everyone [he] know[s]" will be dead is because everyone will be dead, in that scenario.

We are already increasing maximum human lifespan, so I wouldn't be surprised if many people who are babies now are still alive in 100 years. And even if they aren't, there's still the element of their wellbeing while they are alive being affected by concerns about the world they will be leaving their own children to.

TFDApr 202

Although I haven't thought deeply about the issues you raise, you could definitely be correct, and I think they are reasonable things to discuss. But I don't see their relevance to my arguments above. The quote you reference is itself discussing a quote from Sevilla that analyzes a specific hypothetical. I don't necessarily think Sevilla had the issues you raise in mind when he was addressing that hypothetical. I don't think his point was that based on forecasts of life extension technology he had determined that acceleration was the optimal approach in light of his weighing of 1-year-olds vs 50-year-olds. I think his point is more similar to what I mention above about current vs future people. I took a look at more of the X discussion, including the part where that quote comes from, and I think it is pretty consistent with this view (although of course others may disagree). Maybe he should factor in the things you mention, but to the extent his quote is being used to determine his views, I don't think the issues you raise are relevant unless he was considering them when he made the statement. On the other hand, I think discussing those things could be useful in other, more object-level discussions. That's kind of what I was getting at here:

I think, at bottom, the problem is that Sevilla makes a mistake in his analysis and/or decision-making about AI. His statements aren't norm-violating, they are just incorrect (at least some of them are, in my opinion). I think it's worth having clarity about what the actual "problem" is.

I know I've been commenting here a lot, and I understand my style may seem confrontational and abrasive in some cases. I also don't want to ruin people's day with my self-important rants, so, having said my piece, I'll drop the discussion for now and let you get on with other things.

(Although if you would like to respond you are of course welcome to; I just mean to say I won't continue the back-and-forth after, so as not to create a pressure to keep responding.)

RebeccaApr 206

I don’t think you’re being confrontational, I just think you’re over-complicating someone saying they support things that might bring AGI forward to 2035 instead of 2045 because otherwise it will be too late for their older relatives. And it’s not that motivating to debate things that feel like over-complications.

JasonApr 207

I agree that there are no plausible circumstances in which anyone's relatives will benefit in a way not shared with a larger class of people. However, I do think groups of people differ in ways that are relevant to how important fast AI development vs. more risk-averse AI development is to their interests. Giving undue weight to the interests of a group of people because one's friends or family are in that group would still raise the concern I expressed above.

One group that -- if they were considering their own interests only -- might be rationally expected to accept somewhat more risk than the population as a whole are those who are ~50-55+. As Jaime wrote:

For some of my older relatives, it might make a big difference to their health and wellbeing whether AI-fueled explosive growth happens in 10 vs 20 years.

A similar outcome could also happen if (e.g.) the prior generation of my family has passed on, I had young children, and as a result of prioritizing their interests I didn't give enough weight to older individuals' desire to have powerful AI soon enough to improve and/or extend their lives.

TFDApr 201

the prior generation of my family has passed on, I had young children

This seems to suggest that you think the politician's "making the world better for my children" statement would then also be problematic. Do you agree with that?

I'll be honest, this argument seems a bit too clever. Is the underlying problem with the statement really that it implies a set of motivations that might slightly up-weight a certain age group? One of the comments speaks of "core values" for EA. Is that really a core value? I'm pretty sure I recall reading an argument by MacAskill about how actually we should more heavily weight young people in various ways (I think it was voting), for example. I seriously doubt most EAs could claim that they literally are distributionally exact in weighting all morally relevant entities in every decision they make. I think the "core value" that exists probably isn't really this demanding, although I could be wrong.

RebeccaApr 20*2

Prioritising young people often makes sense from an impartial welfare standpoint, because young people have more years left, so there is more welfare to be affected. With voting in particular, it’s the younger people who have to deal with the longer term consequences of any electoral outcome. You see this in climate change related critiques of the Baby Boomer generation.

See eg

“Effective altruism can be defined by four key values: …

2. Impartial altruism: all people count equally — effective altruism aims to give everyone’s interests equal weight, no matter where or when they live. When combined with prioritisation, this often results in focusing on neglected groups…”

https://80000hours.org/2020/08/misconceptions-effective-altruism/

TFDApr 201

Prioritising young people often makes sense from an impartial welfare standpoint

Sure, I think you can make a reasonable argument for that, but if someone disagreed with that, would you say they lack impartiality? To me it seems like something that is up for debate, within the "margin-of-error" of what is meant by impartiality. Two EAs could come down on different sides of that issue and still be in good standing in the community, and wouldn't be considered to not believe in the general principle of impartiality. Likewise, I think we can interpret Jeff Kaufman's argument above as expressing a similar view about an individual's loved-ones. It is within the "margin-of-error" of impartiality to still have a higher degree of concern for loved-ones, even if that might not be living up to the platonic ideal of impartiality.

My point in bringing this up is, the exact reason why the statement in question is bad seems to be shifting a bit over the conversation. Is the core reason that Sevilla's statement is objectionable really that it might up-weight people in a certain age group?

Yarrow🔸Apr 218

TFD, I think your analysis is correct and incisive. I’m grateful to you for writing these comments on this post.

It seems clear that if Jaime had different views about the risk-reward of hypothetical 21st century AGI, nobody would be complaining about him loving his family.

Accusing Jaime of "selfishness", even though he used that term himself in (what I interpret to be) a self-deprecating way, seems really unfair and unreasonable, and just excessively mean. As you and Jeff Kaufman pointed out, many people who are accepted into the EA movement have the same or similar views as Jaime on who to prioritize and so on. These criticisms would not be levied against Jaime if he were not an AI risk skeptic.

The social norms of EA or at least the EA Forum are different today than they were ten years ago. Ten years ago, if you said you only care about people who are either alive today or who will be born in the next 100 years, and you don’t think much about AGI because global poverty seems a lot more important, then you would be fully qualified to be the president of a university EA group, get a job at a meta-EA organization, or represent the views of the EA movement to a public audience.

Today, it seems like there are a lot more people who self-identify as EAs who see focusing on global poverty as more or less a waste of time relative to the only thing that matters, which is that the Singularity is coming in about 2-5 years (unless we take drastic action), and all our efforts should be focused on making sure the Singularity goes good and not bad — including trying to delay it if that helps. People who disagree with this view have not yet been fully excluded from EA but it seems like some people are pretty mean to people who disagree. (I am one of the people who disagrees.)

As a side note, it’s also strange to me that people are treating the founding of Mechanize as if it has a realistic chance to accelerate AGI progress more than a negligible amount — enough of a chance of enough of an acceleration to be genuinely concerning. AI startups are created all the time. Some of them state wildly ambitious goals, like Mechanize. They typically fail to achieve these goals. The startup Vicarious comes to mind.

There are many startups trying to automate various kinds of physical and non-physical labour. Some larger companies like Tesla and Alphabet are also working on this. Why would Mechanize be particularly concerning or be particularly likely to succeed?

Jeff Kaufman 🔸Apr 23*8

The social norms of EA or at least the EA Forum are different today than they were ten years ago. Ten years ago, if you said you only care about people who are either alive today or who will be born in the next 100 years, and you don’t think much about AGI because global poverty seems a lot more important, then you would be fully qualified to be the president of a university EA group, get a job at a meta-EA organization, or represent the views of the EA movement to a public audience.

This isn't just a social thing, it's also a response to a lot of changes in AI timelines over the past ten years. Back then a lot of us had views like "most experts think powerful AI is far off, I'm not going to sink a bunch of time into how it might affect my various options for doing good", but as expert views have shifted that makes less sense. While "don’t think much about AGI because global poverty seems a lot more important" is still a reasonable position to hold (ex: people who think we can't productively influence how AI goes and so we should focus on doing as much good as we can in areas we can affect), I think it requires a good bit more reasoning and thought than it did ten years ago.

(On the other hand, I see "only care about people who are either alive today or who will be born in the next 100 years" as still within the range of common EA views (ex).)

Yarrow🔸Apr 246

I see it primarily as a social phenomenon because I think the evidence we have today that AGI will arrive by 2030 is less compelling than the evidence we had in 2015 that AGI would arrive by 2030. In 2015, it was a little more plausible that AGI could arrive by 2030 because that was 15 years away and who knows what can happen in 15 years.

Now that 2030 is a little less than 5 years away, AGI by 2030 is a less plausible prediction than it was in 2015 because there's less time left and it's more clear it won't happen.

I don't think the reasons people believe AGI will arrive by 2030 are primarily based on evidence but are primarily a sociological phenomenon. People were ready to believe this regardless of the evidence going back to Ray Kurzweil's The Age of Spiritual Machines in 1999 and Eliezer Yudkowsky's "End-of-the-World Bet" in 2017. People don't really pay attention to whether the evidence is good or bad, they ignore obvious evidence and arguments against near-term AGI, and they mostly make a choice to ignore or attack people who express disagreement and instead tune into the relentless drumbeat of people agreeing with them. This is sociology, not epistemology.

Don't believe me? Talk to me again in 5 years and send me a fruit basket. (Or just kick the can down the road and say AGI is coming in 2035...)

Expert opinion has changed? First, expert opinion is not itself evidence, it's people's opinions about evidence. What evidence are the experts basing their beliefs on? That seems way more important than someone just saying a number based on an intuition.

Second, expert opinion does not clearly support the idea of near-term AGI.

As of 2023, the expert opinion on AGI was... well, first of all, really confusing. The AI Impacts survey found that the experts believed there is a 50% chance by 2047 that "unaided machines can accomplish every task better and more cheaply than human workers." And also that there's a 50% chance that by 2116 "machines could be built to carry out the task better and more cheaply than human workers." I don't know why these predictions are 69 years apart.

Regardless, 2047 is sufficiently far away that it might as well be 2057 or 2067 or 2117. This is just people generating a number using a gut feeling. We don't know how to build AGI and we have no idea how long it will take to figure out how to. No amount of thinking of numbers or saying numbers can escape this fundamental truth.

We actually won't have to wait long to see that some of the most attention-catching near-term AI predictions are false. Dario Amodei, the CEO of Anthropic (a company that is said to be "literally creating God"), has predicted that by some point between June 2025 and September 2025, 90% of all code will be written by AI rather than humans. In late 2025 and early 2026, when it's clear Dario was wrong about this (when, not if), maybe some people will start to be more skeptical of attention-grabbing expert predictions. But maybe not.

There are already strong signs of AGI discourse being irrational and absurd. On April 16, 2025, Tyler Cowen claimed that OpenAI's o3 model is AGI and asked, "is April 16th AGI day?". In a follow-up post on April 17, seemingly in response to criticism, he said, "I don’t mind if you don’t want to call it AGI", but seemed to affirm he still thinks o3 is AGI.

On one hand, I hope that in 5 years the people who promoted the idea of AGI by 2030 will lose a lot of credibility and maybe will do some soul-searching to figure out how they could be so wrong. On the other hand, there is nothing preventing people from being irrational indefinitely, such as:

Defining whatever exists in 2030 as AGI (Tyler Cowen already did it in 2025, and Ray Kurzweil innovated the technique years ago).
Kicking the can down the road a few years, and repeating as necessary (similar to how Elon Musk has predicted that the Tesla fleet will achieve Level 4/5 autonomy in a year every year from 2015 to 2025 and has not given up the game despite his losing streak).
Telling a story in which AGI didn't happen only because effective altruists or other good actors successfully delayed AGI development.

I think part of the sociological problem is that people are just way too polite about how crazy this all is and how awful the intellectual practices of effective altruists have been on this topic. (Sorry!) So, I'm being blunt about this to try to change that a little.

LarksApr 2415

I see it primarily as a social phenomenon because I think the evidence we have today that AGI will arrive by 2030 is less compelling than the evidence we had in 2015 that AGI would arrive by 2030.

The evidence we have today that there will be AGI by 2030 is clearly dramatically stronger than the evidence we had in 2015 that there would be AGI by 2020, and that is surely the relevant comparison. This is not EA specific - we have been ahead of the curve in thinking AI would be a big deal, but the whole world has updated in this direction, and it would be strange if we hadn't as well.

TFDApr 243

My personal take is that there are pretty reasonable arguments that what we have seen in AI/ML since 2015 suggests AI will be a big deal. I like the way I have seen Yoshua Bengio talk about it: "over the next few years, or a few decades". I share the view that either of those possibilities is reasonable. People who are highly confident that something like AGI is going to arrive over the next few years are more confident in this than I am, but I think that view is within the bounds of reasonable interpretation of the evidence. I think it is also within the bounds of reasonableness to have the opposite view, that something like AGI is most likely further than a few years away.

Don't believe me? Talk to me again in 5 years and send me a fruit basket. (Or just kick the can down the road and say AGI is coming in 2035...)

I think this is a healthy attitude, and one that is worth appreciating. We may get answers to these questions over the next few years. That seems pretty positive to me. We will be able to resolve some of these disagreements productively by observing what happens. I hope people who have different views now keep this in mind and that the environment is still in a good place for people who disagree now to work together in the future if some of these disagreements get resolved.

I will offer the EA Forum internet-points equivalent of a fruit basket to anyone who would like one in the future if we disagree now and in the future they are proven right and I am proven wrong.

I think part of the sociological problem is that people are just way too polite about how crazy this all is and how awful the intellectual practices of effective altruists have been on this topic.

Can you say what view it is you think is crazy? It seems quite reasonable to me to think that AI is going to be a massive deal and therefore that it would be highly useful to influence how it goes. On the other hand, I think people often over-estimate the robustness of the arguments for any given strategy for how to actually do that influencing. In other words, it's reasonable to prioritize AI, but people's AI takes are often very over-confident.

SharmakeApr 231

For what it's worth, I basically agree with the view that Mechanize is unlikely to be successful at its goals:

As a side note, it’s also strange to me that people are treating the founding of Mechanize as if it has a realistic chance to accelerate AGI progress more than a negligible amount — enough of a chance of enough of an acceleration to be genuinely concerning. AI startups are created all the time. Some of them state wildly ambitious goals, like Mechanize. They typically fail to achieve these goals. The startup Vicarious comes to mind.
There are many startups trying to automate various kinds of physical and non-physical labour. Some larger companies like Tesla and Alphabet are also working on this. Why would Mechanize be particularly concerning or be particularly likely to succeed?

TFDApr 24-1

I appreciate your comment.

It seems clear that if Jaime had different views about the risk-reward of hypothetical 21st century AGI, nobody would be complaining about him loving his family.

I do think this is substantially correct, but I also want to acknowledge that these can be difficult subjects to navigate. I don't think anyone has done anything wrong; I'm sure I myself have done something similar to this many times. But I do think it's worth trying to understand where the central points of disagreement lie, and I think this really is the central disagreement.

On the question of changing EA attitudes towards AI over the years, although I personally think AI will be a big deal, could be dangerous, and those issues are worthy of significant attention, I also can certainly see reasons why people might disagree and why those people would have reasonable grievances with decisions by certain EA people and organizations.

An idea that I have pondered for a while about EA is a theory about which "boundaries" a community emphasizes. Although I've only ever interacted with EA by reading related content online, my perception is that EA really emphasizes the boundary around the EA community itself, while de-emphasizing the boundaries around individual people or organizations. The issues around Epoch I think demonstrate this. The feeling of betrayal comes from viewing "the community" as central. I think a lot of other cultures that place more emphasis on those other boundaries might react differently. For example, at most companies I have worked at, although certainly they would never be happy to see an employee leave, they wouldn't view moving to another job as a betrayal, even if an employee went to work for a direct competitor. I personally think placing more emphasis on orgs/individuals rather than the community as a whole could have some benefits, such as with the issue you raise about how to navigate changing views on AI.

Although emphasizing "the community" might seem like it's ideal for cooperation, I think it can actually harm cooperation in the presence of substantial disagreements, because it generates dynamics like what is going on here. People feel like they can't cooperate with people across the disagreement. We will probably see some of these disagreements resolved over the next few years as AI progresses. I for one hope that even if I am wrong I can take any necessary corrections on-board and still work with people who I disagreed with to make positive contributions. Likewise, I hope that if I am right, people who I disagreed with still feel like they can work with me despite that.

As a side note, it’s also strange to me that people are treating the founding of Mechanize as if it has a realistic chance to accelerate AGI progress more than a negligible amount — enough of a chance of enough of an acceleration to be genuinely concerning. AI startups are created all the time. Some of them state wildly ambitious goals, like Mechanize. They typically fail to achieve these goals. The startup Vicarious comes to mind.

I admit I had a similar thought, but I am of two minds about it. On the one hand, I think intentions do matter. I think it is reasonable to point out if you think someone is making a mistake, even if you think ultimately that mistake is unlikely to have a substantial impact because the person is unlikely to succeed in what they are trying to do.

On the other hand, I do think the degree of the reaction and the way that people are generalizing seems like people are almost pricing in the idea that the actions in question have already had a huge impact. So I do wonder if people are kind of over-updating on this specific case for similar reasons to what you mention.

RebeccaApr 193

Yeah that sounds right to me as a gloss

MichaelDickensApr 1910

I think it's a good thing that you're open about your motivations and I appreciate it.

RebeccaApr 197

I think Sharmake might be thinking you are one of the people that left Epoch to start Mechanize? (He says "admits that the reason he is working on this" in response to the main post, about Mechanize)

Jaime SevillaApr 1912

Ah, in case there is any confusion about this I am NOT leaving Epoch nor joining Mechanize. I will continue to be director of Epoch and work in service of our public benefit mission.

SharmakeApr 193

I incorrectly thought that you had also left; I have edited my comment.

Greg_Colbourn ⏸️ Apr 226

For what it's worth, I think you are woefully miscalibrated about what the right course of action is if you care about the people you love. Preventing ASI from being built for at least a few years should be a far bigger priority (and Mechanize's goal is ~the opposite of that). Would be interested to hear more re why you think violent AI takeover is unlikely.