
Crossposted from AI Lab Watch. Subscribe on Substack.

Introduction

Anthropic has an unconventional governance mechanism: an independent "Long-Term Benefit Trust" elects some of its board. Anthropic sometimes emphasizes that the Trust is an experiment, but mostly points to it to argue that Anthropic will be able to promote safety and benefit-sharing over profit.[1]

But the Trust's details have not been published and some information Anthropic has shared is concerning. In particular, Anthropic's stockholders can apparently overrule, modify, or abrogate the Trust, and the details are unclear.

Anthropic has not publicly demonstrated that the Trust would be able to actually do anything that stockholders don't like.

The facts

There are three sources of public information on the Trust: Anthropic's own announcement, an analysis by Morley et al., and reporting by Vox.

These sources say there's a new class of stock, held by the Trust/Trustees. This stock allows the Trust to elect some board members, and by 2027 it will allow the Trust to elect a majority of the board.

But:

  1. Morley et al.: "the Trust Agreement also authorizes the Trust to be enforced by the company and by groups of the company’s stockholders who have held a sufficient percentage of the company’s equity for a sufficient period of time," rather than the Trustees.
    1. I don't know what this means.
  2. Morley et al.: the Trust and its powers can be amended "by a supermajority of stockholders. . . . [This] operates as a kind of failsafe against the actions of the Voting Trustees and safeguards the interests of stockholders." Anthropic: "the Trust and its powers [can be changed] without the consent of the Trustees if sufficiently large supermajorities of the stockholders agree."
    1. It's impossible to assess this "failsafe" without knowing the thresholds for these "supermajorities." Also, a small number of investors—currently, perhaps Amazon and Google—may control a large fraction of shares. It may be easy for profit-motivated investors to reach a supermajority. (For example, if the threshold were 75% and two investors together held 60% of the shares, they would need support from holders of only 15% more.)
  3. Maybe there are other issues with the Trust Agreement — we can't see it and so can't know.
  4. Vox: the Trust "will elect a fifth member of the board this fall," viz. Fall 2023.
    1. Anthropic has not said whether that happened nor who is on the board these days (nor who is on the Trust these days).

Conclusion

Public information is consistent with the Trust being quite subordinate to stockholders, and likely to lose its powers if it does anything stockholders dislike. (Even if stockholders' formal powers over the Trust are never used, that threat could prevent the Trust from acting contrary to the stockholders' interests.)

Anthropic knows this and has decided not to share the information that the public needs to evaluate the Trust. This suggests that Anthropic benefits from ambiguity because the details would be seen as bad. I basically fail to imagine a scenario where publishing the Trust Agreement is very costly to Anthropic—especially just sharing certain details (like sharing percentages rather than saying "a supermajority")—except that the details are weak and would make Anthropic look bad.[2]

Maybe it would suffice to let an auditor see the Trust Agreement and publish their impression of it. But I don't see why Anthropic won't publish it.

Maybe the Trust gives Anthropic strong independent accountability — or rather, maybe it will by default after (unspecified) time- and funding-based milestones. But only if Anthropic's board and stockholders have substantially less power over it than they might—or if they will exercise great restraint in using their power—and the Trust knows this.

Unless I'm missing something, Anthropic should publish the Trust Agreement (and other documents if relevant) and say whether and when the Trust has elected board members. Especially vital is (1) publishing information about how the Trust or its powers can change, (2) committing to publicly announce changes, and (3) clarifying what's going on with the Trust now.

Note: I don't claim that maximizing the Trust's power is correct. Maybe one or more other groups should have power over the Trust, whether to intervene if the Trust collapses or does something illegitimate or just to appease investors. I just object to the secrecy.


Thanks to Buck Shlegeris for suggestions. He doesn't necessarily endorse this post.

  1. ^

    E.g. 1, 2, and 3, and almost every time Anthropic people talk about Anthropic's governance.

  2. ^

    Unlike with some other policies, the text of the Trust Agreement is crucial; it is a legal document that dictates actors' powers over each other.


Comments

Thanks for investigating!

From my understanding of boards and governance structures, I think that few are actually very effective, and it's often very difficult to tell this from outside the organization. 

So, I think that the prior should be to expect these governance structures to be quite mediocre, especially in extreme cases, and wait for a significant amount of evidence otherwise. 

I think some people think, "Sure, but it's quite hard to provide a lot of public evidence, so instead we should give these groups the benefit of the doubt." I don't think this makes sense as an epistemic process.

If the prior is bad, then you should expect it to be bad. If it's really difficult to convince a good epistemic process otherwise, don't accept a worse epistemic process in order to make it seem "more fair for the evaluee". 

From my understanding of boards and governance structures, I think that few are actually very effective, and it's often very difficult to tell this from outside the organization.

It seems valuable to differentiate between "ineffective by design" and "ineffective in practice". Which do you think is more the cause for the trend you're observing?

OP is concerned that Anthropic's governance might fall into the "ineffective by design" category. Like, it's predictable in advance that something could maybe go wrong here.

If yours is more of an "ineffective in practice" argument -- that seems especially concerning, if the "ineffective in practice" point applies even when the governance appeared to be effective by design, ex ante.


In any case, I'd really like to see dedicated efforts to argue for ideal AI governance structures and documents. It feels like EA has overweighted the policy side of AI governance and underweighted the organizational founding documents side. Right now we're in the peanut gallery, criticizing how things are going at OpenAI and now Anthropic, without offering much in the way of specific alternatives.

Events at OpenAI have shown that this issue deserves a lot more attention, in my opinion. Some ideas:

  • A big cash prize for best AI lab governance structure proposals. (In practice you'd probably want to pick and choose the best ideas across multiple proposals.)

  • Subsidize red-teaming novel proposals and testing them out in lower-stakes situations at non-AI organizations. (All else equal, it seems better for AGI to be developed using an institutional template that's battle-tested.) We could dogfood proposals by using them for non-AI EA startups or EA organizations focused on e.g. community-building.

  • Governance lit reviews to gather and summarize both empirical findings and theoretical models from e.g. economics. Cross-national comparisons might be especially fruitful if we don't think the right structures are battle-tested in a US legal context.

At this point, I'm embarrassed that if someone asked me how to fix OpenAI's governance docs, I wouldn't really have a suggestion. On the other hand, if we had some really solid suggestions, it feels doable to either translate them into policy requirements, or convince groups like Anthropic's trustees to adopt them.

If yours is more of an "ineffective in practice" argument -- that seems especially concerning,

Good point here. You're right that I'm highlighting "ineffective in practice".

In any case, I'd really like to see dedicated efforts to argue for ideal AI governance structures and documents.

Yep, I'd also like to see more work here! A while back I helped work on this project, which investigated this area a bit, but I think it's obvious there's more work to do.

I think "powerless" is a huge overstatement of the claims you make in this piece (many of which I agree with). Having powers that are legally and politically constrained is not the same thing as the nonexistence of those powers.

I agree though that additional information about the Trust and its relationship to Anthropic would be very valuable.

I claim that public information is very consistent with the following: the investors hold an axe over the Trust; maybe the Trust will cause the Board to be slightly better, or the investors will abrogate the Trust, or the Trustees will loudly resign at some point; regardless, the Trust is very subordinate to the investors and won't be able to do much.

And if so, I think it's reasonable to describe the Trust as "maybe powerless."

I think people should definitely consider and assign non-trivial probability to the LTBT being powerless (probably >10%), which feels like the primary point of the post. Do you disagree with that assessment of probabilities? (If so, I would probably be open to bets.)

How are you defining "powerless"? See my previous comment: I think the common meaning of "powerless" implies not just significant constraints on power but rather the complete absence thereof.

I would say that the LTBT is powerless iff it can be trivially prevented from accomplishing its primary function—overriding the financial interests of the for-profit Anthropic investors—by those investors, such as with a simple majority (which is the normal standard of corporate control). I think this is very unlikely to be true, p<5%.

Would you say the OpenAI board was powerless to remove Altman? They had some legal powers that were legally and politically constrained, and in practice I think it's fair to describe them as effectively powerless.

I definitely would not say that the OpenAI Board was powerless to remove Sam in general, for the exact reason you say: they had the formal power to do so, but it was politically constrained. That formal power is real and, unless it can be trivially overruled in any instance in which it is exercised for the purpose for which it exists, sufficient to not be "powerless."

It turns out that they were maybe powerless to remove him in that instance and in that way, but I think there are many nearby fact patterns on which the Sam firing could have worked. This is evident from the fact that, in the period of days after November 17, prediction markets gave much less than 90% odds—and for many periods of time much less than 50%—that Sam would shortly come back as CEO.

As an intuition pump: Would we say that the President is powerless just because the other branches of government can constrain her (e.g., through the impeachment power or ability to override her veto) in many cases? I think not.

"Powerless" under its normal meaning is a very high bar, meaning completely lacking power. Taking all of Anthropic's statements as true, I think we have evidence that the LTBT has significant powers (the ability to appoint an increasing number of board members), with unclear but significant legal and (an escalating supermajority requirement) and political constraints on those powers. I think it's good to push for both more transparency on what those constraints are and for more independence. But unless a simple majority of shareholders are able to override the LTBT—which seems to be ruled out by the evidence—I would not describe them as powerless.

... I think there are many nearby fact patterns on which the Sam firing could have worked. This is evident from the fact that, in the period of days after November 17, prediction markets gave much less than 90% odds—and for many periods of time much less than 50%—that Sam would shortly come back as CEO.

This seems confused to me, because the market is reflecting epistemic uncertainty, not counterfactual resilience. It could be the case that the board would reliably fail in all nearby fact patterns but that market participants simply did not know this, because there were important and durable but unknown facts about e.g. the strength of the MSFT relationship or players' BATNAs.

Would we say that the President is powerless just because the other branches of government can constrain her (e.g., through the impeachment power or ability to override her veto) in many cases?

I think it would be fair to describe some Presidents as being effectively powerless with regard to their veto, yes, if the other party controls a supermajority of the legislature and has good internal discipline.

In any case I think the impact and action-relevance of this post would not be very much changed if the title was instead a more wordy "Maybe Anthropic's Long-Term Benefit Trust is as powerless as OpenAI's was".

It could be the case that the board would reliably fail in all nearby fact patterns but that market participants simply did not know this, because there were important and durable but unknown facts about e.g. the strength of the MSFT relationship or players' BATNAs.

I agree this is an alternative explanation. But my personal view is also that the common wisdom that it was destined to fail ab initio is incorrect. I don't have much more knowledge than other people do on this point, though.

I think it would be fair to describe some Presidents as being effectively powerless with regard to their veto, yes, if the other party controls a supermajority of the legislature and has good internal discipline.

(Emphasis added.) I think this is the crux of the argument. I agree that the OpenAI board may have been powerless to accomplish a specific result in a specific situation. Similarly, in this hypo, the President may be powerless to accomplish a specific result (vetoing legislation) in a specific situation.

But I think this is very far away from saying a specific institution is "powerless" simpliciter, which is what I disagreed with in Zach's headline. (And so I would similarly disagree that the President was "powerless" simpliciter in your hypo.)

An institution's powers will almost always be constrained significantly by both law and politics, so showing significant constraints on an institution's ability to act unilaterally is very far from showing it overall completely lacks power.

I basically fail to imagine a scenario where publishing the Trust Agreement is very costly to Anthropic—especially just sharing certain details (like sharing percentages rather than saying "a supermajority")—except that the details are weak and would make Anthropic look bad.

Anthropic might be worried that the details are strong, and would make Anthropic look vulnerable to governance chaos similar to what happened at OpenAI during the board turnover saga. A large public conversation on this could be bad for Anthropic's reputation among its investors, team, or other stakeholders, who have concerns other than long-term safety, or might think that Anthropic's non profit-motivated governance is opaque or bad for whatever other reason. To put this another way: Anthropic is probably reputation-managing, but it might not be their safety reputation that they are trying to manage. It might be their reputation -- to potential investors, say -- as a reliable actor with predictable decision-making that won't be upturned at the whims of the Trust.

I would expect, though, that Anthropic's major investors know the details of the governance structure and mechanics.

Maybe. Note that they sometimes brag about how independent the Trust is and how some investors dislike it, e.g. Dario:

Every traditional investor who invests in Anthropic looks at this. Some of them are just like, whatever, you run your company how you want. Some of them are like, oh my god, this body of random people could move Anthropic in a direction that's totally contrary to shareholder value.

And I've never heard anyone from Anthropic suggest this.

This sounds like a job for someone in mechanism design.

Did you reach out to Anthropic for comments/answers?

Anthropic has a general policy that employees are not allowed to respond to inquiries about its policies or commitments with anything that could be perceived as an official response from Anthropic. (I have reached out to people at Anthropic many times to get clarification on various commitments and was universally told that they are unable to clarify and don't generally respond to that kind of inquiry, and that if I want to know what Anthropic committed to, I should look purely at its official publications, which will generally not be written with anyone's specific concerns in mind.)

Ah I see, thanks.

That seems like... a bad policy?

Yes, I've previously made some folks at Anthropic aware of these concerns, e.g. associated with this post.

In response to this post, Zac Hatfield-Dodds told me he expects Anthropic will publish more information about its governance in the future.

the Trust Agreement also authorizes the Trust to be enforced by the company and by groups of the company’s stockholders who have held a sufficient percentage of the company’s equity for a sufficient period of time

...

It's impossible to assess this "failsafe" without knowing the thresholds for these "supermajorities." Also, a small number of investors—currently, perhaps Amazon and Google—may control a large fraction of shares. It may be easy for profit-motivated investors to reach a supermajority.

Just speculating here, perhaps the "sufficient period of time" is meant to deter activist shareholders/corporate raiders? Without that clause, you can imagine an activist who believes that Anthropic is underperforming due to its safety commitments. The activist buys sufficient shares to reach the supermajority threshold, then replaces Anthropic management with profit-seekers. Anthropic stock goes up due to increased profits. The activist sells their stake and pockets the difference.

Having some sort of check on the trustees seems reasonable, but the point about Amazon and Google owning a lot of shares is concerning. It seems better to require consensus from many independent decisionmakers in order to overrule the trustees.

Maybe it would be better to weight shareholders according to the log or the square root of the number of shares they hold. That would give increased weight to minor shareholders like employees and ex-employees. That could strike a better balance between concern for safety and concern for profit. Hopefully outside investors would trust employees and ex-employees to have enough skin in the game to do the right thing.

With my proposed reweighting, you would need some mechanism to prevent Amazon and Google from splitting their stake across a bunch of tiny shell entities. Perhaps shares could lose their voting rights if they're transferred away from their original owner. That also seems like a better way to deter patient activists. But maybe it could cause problems if Anthropic wants to go public? I guess the trustees could change the rules if necessary at that point.
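To make the tradeoff concrete, here is a minimal sketch in Python. The cap table, holder names, and share counts are entirely made up for illustration (they are not Anthropic's actual ownership); the point is just how square-root weighting shifts voting power toward small holders, and why it invites the shell-entity splitting described above:

```python
import math


def voting_power(holdings, weight=math.sqrt):
    """Return each holder's fraction of total voting weight under a weighting rule."""
    weights = {name: weight(shares) for name, shares in holdings.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical cap table: two large strategic investors plus 50 small holders.
cap_table = {"InvestorA": 4_000_000, "InvestorB": 3_000_000}
cap_table.update({f"employee_{i}": 20_000 for i in range(50)})

raw = voting_power(cap_table, weight=lambda s: s)  # one share, one vote
sqr = voting_power(cap_table)                      # square-root weighting

big_raw = raw["InvestorA"] + raw["InvestorB"]
big_sqr = sqr["InvestorA"] + sqr["InvestorB"]
print(f"Large investors' voting power: raw={big_raw:.0%}, sqrt={big_sqr:.0%}")
# With these numbers: raw ~ 88%, sqrt ~ 35%.

# The splitting attack from the paragraph above: because sqrt is concave,
# dividing a stake across shell entities increases its total weight.
# 100 shells of 40_000 shares carry 100 * sqrt(40_000) = 20_000 weight,
# versus sqrt(4_000_000) = 2_000 for the same stake held as one block.
```

Under these made-up numbers, splitting a large stake across shells multiplies its voting weight tenfold, which is why some restriction on transfers, like the voting-rights forfeiture suggested above, would be load-bearing.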
