I do independent research on EA topics. I write about whatever seems important, tractable, and interesting (to me).
I have a website: https://mdickens.me/. Much of the content on my website gets cross-posted to the EA Forum, but I also write about some non-EA stuff there.
My favorite things that I've written: https://mdickens.me/favorite-posts/
I used to work as a software developer at Affirm.
Not saying Bregman is wrong (I don't really have a belief on the matter), but this is not what I'd call a "strong case". He says:
As a historian, I've studied some of the major consumer boycotts in history. Which ones changed history, and which ones fizzled out?
The answer is surprisingly consistent: the successful ones didn't try to fight everything at once. They picked a single target – one that was both symbolically powerful and genuinely vulnerable – and went all in.
But then he provides only one example (Montgomery Bus Boycott), and doesn't provide any evidence that the Montgomery Bus Boycott was an important causal factor in ending segregation.
FWIW I certainly wouldn't tell anyone not to boycott ChatGPT. Decreasing OpenAI's revenue is good for the world.
I have many issues with this change. Commenters on LessWrong have already said almost everything I wanted to say, but it hasn't been said on the EA Forum yet, so I'll quote a couple of excerpts from comments that captured my biggest concerns.
But it’s been easy to get the impression that the RSP is “binding ourselves to the mast” and committing to unilaterally pause AI development and deployment under some conditions, and Anthropic is responsible for that.
Yes, Anthropic employees on more than a dozen occasions told me that the RSP binds them to a mast. I had many very explicit conversations with many Anthropic employees about this, because I was following up on what I thought was Anthropic violating what I perceived to be a promise to not push forward the state of AI capabilities, which many employees disputed had happened.
[Habryka gives a public example of one such conversation]
This was, in my experience, routine[1]. I therefore do see this switch from "RSP as concrete if-then commitments" to "RSP as positive milestone setting" as constituting a meaningful breaking of a promise. Yes, the RSP always said in its exact words that Anthropic could revise it, but people who said that this condition would trigger were frequently dismissed and insulted, as in the comment above.
And to be clear, I think this is a huge deal! My experience interfacing with Anthropic on RSP-adjacent topics has been pretty universally terrible, with a very routine experience of being gaslit (with the exception of interfacing specifically with you, Holden, on this topic, where your comments have seemed clear and reasonable and consistent across time to me).
I am glad to see this post as a kind of reckoning with many of these bad implicit promises, but at some point Anthropic has failed so many times to set reasonable expectations, and has acted so many times adversarially to people trying to get clarity on commitments, that it becomes very hard to have any kind of non-adversarial relationship to it. I do think this post helps, and I hope it might open up better and less adversarial future communications.
If there were strong and broad political will for treating AI like nuclear power and slowing it down arbitrarily much to keep risks low, the situation might be different. But that isn’t the world we’re in now, and I fear that “overreaching” can be costly.
I think it would make a nontrivial contribution to that 'strong and broad political will' if Dario were to come out and say "actually, sorry about all that deliberate Overton-window-closing I did in previous writings. In fact, political will is not a totally exogenous oh-well thing, but it is the responsibility of frontier developers to inculcate that political will by telling the public that a pause is possible and desirable, instead of a dumb lame thing not even worth considering. So now we're saying loud and clear: a pause is possible and desirable, and the world should work toward it as a Plan A!"
I'm being deliberately cartoonish here, but you get the point. If incentives are forcing Anthropic to abandon things that are good for human survival––which occurrence was, no offense, completely obvious from day one––Anthropic should be screaming from the rooftops, Help!! Incentives are forcing us to abandon things that are good for human survival!!
If this is a crux for you––if you/Anthropic think a pause is so undesirable/unlikely that it's important for the safety of the human race to publicly disparage the possibility of a pause (as Dario opens many of his essays by doing)––please say so! Otherwise, this lily-livered, disingenuous, "oh no, the incentives! it's a shame incentives can never be changed!" moping will give us all an undignified death.
Quoting myself: The main thing that saddened me about this post isn't Anthropic breaking and weakening its commitments—that was expected to happen. It's that Holden seems to be adopting the same shirking-of-responsibility stance on government regulations that Dario has been taking for a while.
I think using made-up-ish numbers can be better than using no numbers at all - as long as there's no impression of rigor. We don't have to claim rigor!
In my experience, here's how these things go ~100% of the time:
AI 2027 is a great example of this.
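As a toy illustration of what "made-up-ish numbers without an impression of rigor" can look like in practice, here's a hypothetical Fermi-estimate sketch. Every input below is an invented placeholder (none of these quantities come from the original post); the point is that flagging each number as a guess and reporting a wide range keeps the reasoning inspectable without claiming rigor:

```python
# Toy Fermi estimate: every input is an explicit guess, flagged as such.
# The goal is inspectable reasoning, not a claim of rigor.

guesses = {
    "people_reached": 1e5,       # guess: order of magnitude only
    "fraction_persuaded": 0.01,  # guess: could easily be 10x off
    "value_per_person": 50.0,    # guess: arbitrary units
}

estimate = (
    guesses["people_reached"]
    * guesses["fraction_persuaded"]
    * guesses["value_per_person"]
)

# Report a wide range to signal low confidence, not a point estimate.
low, high = estimate / 10, estimate * 10
print(f"rough estimate: {estimate:.0f} (plausible range {low:.0f}-{high:.0f})")
```

Writing the estimate this way makes it easy for a reader to swap in their own guesses and see how much the conclusion moves, which is most of the value of using numbers at all.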
A relevant question I'm not sure about: for people who talk to politicians about AI risk, how useful are benchmarks? I'm not involved in those conversations so I can't really say. My guess is that politicians are more interested in obvious capabilities (e.g. Claude can write good code now) than they are in benchmark performance.
Know when to sound alarm bells
What is the situation where people coordinate to sound alarm bells over an AI benchmark? I basically don't think people pay attention to benchmarks in the way that matters: a benchmark comes out demonstrating some new potential danger, AI safety people raise concerns about it, and the people with the actual power continue to ignore them.
Thinking of some historical examples:
We focus on intentional deaths because they are most revealing of terminal preferences, which in turn are most predictive of future harm.
Why is an ideological desire to kill the outgroup more likely to influence the long-term future than (say) a desire to preserve wild-animal suffering for aesthetic reasons?
Considering that wild animal suffering is currently somewhere around 10^3 to 10^15 times worse than any ideologically-driven tragedy has ever been, by default I'm far more concerned about the former.
A donor wanted to spend their money this way; it would not be fair to the donor for Eliezer to turn around and give the money to someone else. There is a particular theory of change according to which this is the best marginal use of ~$1 million: it gives Eliezer a strong defense against accusations like
If they suddenly said that the risk of human extinction from AGI or superintelligence is extremely low, in all likelihood that money would dry up and Yudkowsky and Soares would be out of a job.
I kinda don't think this was the best use of a million dollars, but I can see the argument for how it might be.
I'm unsure about whether offering free accounts is good or bad (I lean toward bad). It decreases gross profit, but it also increases usage numbers, which makes it easier to raise money, and I think that's more important than gross profit right now.