Crossposted from AI Lab Watch. Subscribe on Substack.
Introduction
Anthropic has an unconventional governance mechanism: an independent "Long-Term Benefit Trust" elects some of its board. Anthropic sometimes emphasizes that the Trust is an experiment, but mostly points to it to argue that Anthropic will be able to promote safety and benefit-sharing over profit.[1]
But the Trust's details have not been published and some information Anthropic has shared is concerning. In particular, Anthropic's stockholders can apparently overrule, modify, or abrogate the Trust, and the details are unclear.
Anthropic has not publicly demonstrated that the Trust would be able to actually do anything that stockholders don't like.
The facts
There are three sources of public information on the Trust:
- The Long-Term Benefit Trust (Anthropic 2023)
- Anthropic Long-Term Benefit Trust (Morley et al. 2023)
- The $1 billion gamble to ensure AI doesn't destroy humanity (Vox: Matthews 2023)
They say there's a new class of stock, held by the Trust/Trustees. This stock allows the Trust to elect some board members and will allow it to elect a majority of the board by 2027.
But:
- Morley et al.: "the Trust Agreement also authorizes the Trust to be enforced by the company and by groups of the company’s stockholders who have held a sufficient percentage of the company’s equity for a sufficient period of time," rather than the Trustees.
- I don't know what this means.
- Morley et al.: the Trust and its powers can be amended "by a supermajority of stockholders. . . . [This] operates as a kind of failsafe against the actions of the Voting Trustees and safeguards the interests of stockholders." Anthropic: "the Trust and its powers [can be changed] without the consent of the Trustees if sufficiently large supermajorities of the stockholders agree."
- It's impossible to assess this "failsafe" without knowing the thresholds for these "supermajorities." Also, a small number of investors—currently, perhaps Amazon and Google—may control a large fraction of shares. It may be easy for profit-motivated investors to reach a supermajority.
- Maybe there are other issues with the Trust Agreement — we can't see it and so can't know.
- Vox: the Trust "will elect a fifth member of the board this fall," viz. Fall 2023.
- Anthropic has not said whether that happened, nor who is currently on the board (or on the Trust).
Conclusion
Public information is consistent with the Trust being quite subordinate to stockholders, likely to lose its powers if it does anything stockholders dislike. (Even if stockholders' formal powers over the Trust are never used, that threat could prevent the Trust from acting contrary to the stockholders' interests.)
Anthropic knows this and has decided not to share the information that the public needs to evaluate the Trust. This suggests that Anthropic benefits from ambiguity because the details would be seen as bad. I basically fail to imagine a scenario where publishing the Trust Agreement is very costly to Anthropic—especially just sharing certain details (like sharing percentages rather than saying "a supermajority")—except that the details are weak and would make Anthropic look bad.[2]
Maybe it would suffice to let an auditor see the Trust Agreement and publish their impression of it. But I don't see why Anthropic won't publish it.
Maybe the Trust gives Anthropic strong independent accountability — or rather, maybe it will by default after (unspecified) time- and funding-based milestones. But that holds only if Anthropic's board and stockholders have substantially less power over it than they might, or will exercise great restraint in using the power they have, and only if the Trust knows this.
Unless I'm missing something, Anthropic should publish the Trust Agreement (and other documents if relevant) and say whether and when the Trust has elected board members. It is especially vital that Anthropic (1) publish information about how the Trust or its powers can change, (2) commit to publicly announcing any such changes, and (3) clarify what's going on with the Trust now.
Note: I don't claim that maximizing the Trust's power is correct. Maybe one or more other groups should have power over the Trust, whether to intervene if the Trust collapses or does something illegitimate or just to appease investors. I just object to the secrecy.
Thanks to Buck Shlegeris for suggestions. He doesn't necessarily endorse this post.
- ^
- ^ Unlike with some other policies, the text of the Trust Agreement is crucial; it is a legal document that dictates actors' powers over each other.
It seems valuable to differentiate between "ineffective by design" and "ineffective in practice". Which do you think is the bigger cause of the trend you're observing?
OP is concerned that Anthropic's governance might fall into the "ineffective by design" category. Like, it's predictable in advance that something could maybe go wrong here.
If yours is more of an "ineffective in practice" argument, that seems especially concerning: the point would apply even when the governance appeared effective by design, ex ante.
In any case, I'd really like to see dedicated efforts to argue for ideal AI governance structures and documents. It feels like EA has overweighted the policy side of AI governance and underweighted the organizational founding documents side. Right now we're in the peanut gallery, criticizing how things are going at OpenAI and now Anthropic, without offering much in the way of specific alternatives.
Events at OpenAI have shown that this issue deserves a lot more attention, in my opinion. Some ideas:
- A big cash prize for the best AI lab governance structure proposals. (In practice you'd probably want to pick and choose the best ideas across multiple proposals.)
- Subsidize red-teaming novel proposals and testing them out in lower-stakes situations, for non-AI organizations. (All else equal, it seems better for AGI to be developed using an institutional template that's battle-tested.) We could dogfood proposals by using them for non-AI EA startups or EA organizations focused on e.g. community-building.
- Governance lit reviews to gather and summarize both empirical findings and theoretical models from e.g. economics. Cross-national comparisons might be especially fruitful if we don't think the right structures are battle-tested in a US legal context.
At this point, I'm embarrassed that if someone asked me how to fix OpenAI's governance docs, I wouldn't really have a suggestion. On the other hand, if we had some really solid suggestions, it feels doable to either translate them into policy requirements, or convince groups like Anthropic's trustees to adopt them.