ai x-risk policy & science comms
261 karma
Joined Apr 2019



I’m good at explaining alignment to people in person, including to policymakers.

I got 250k people to read HPMOR and sent 1.3k copies to winners of math and computer science competitions; have taken the GWWC pledge; created a small startup that donated >$100k to effective nonprofits.

I have a background in ML and strong intuitions about the AI alignment problem. In the past, I studied a bit of international law (with a focus on human rights) and wrote appeals that won cases against the Russian government in Russian courts. I grew up running political campaigns.

I’m interested in chatting with potential collaborators and comms allies.

My website: https://contact.ms

Schedule a call with me: https://contact.ms/ea30


I carefully looked through all of our messages (there weren’t too many) a couple of times, because that was pretty surprising and I considered it to be more likely that I don’t remember something than her saying something directly false. But there’s nothing like that and she’s, unfortunately, straightforwardly saying a false thing.

he usually did not accept my answers when I gave them but continued to argue with me, either straight up or by insisting I didn't really understand his argument or was contradicting myself somehow.

This is also something I couldn’t find any examples of before the protest, no matter how I interpret the messages we have exchanged about her strategy etc.

I’m >99% sure that I’m not missing anything. I included it because it’s not a mathematical truth, and because adding “Are you sure? What am I missing, if you are?” is more polite than just saying “this is clearly false and we both know it and any third party can verify the falsehood; why are you saying that?”. It’s technically possible I’m just blind to something: I can imagine a conceivable universe where I’m wrong about this, maybe she publicly tweeted this and I forgot and now I can’t see it because she blocked me on Twitter or something. But I’m confident she’s saying a straightforwardly false thing. I’d feel good about betting up to $20k at 99:1 odds on this.

Like, "missing something" isn’t really a hypothesis, I’m including this because it is a part of my epistemic situation and seems nicer to include in the message.

Asking to share messages publicly or showing them to a third party seems to unnecessarily up the stakes. I'm not sure why you're suggesting that

When someone is confidently saying false things about the contents of the messages we exchanged, it seems reasonable to suggest publishing them or having a third party look at them. I’m not sure how it’s “upping the stakes”. It’s a natural thing to do.

Huh. I strongly upvoted, as it contributes a lot to the conversation. I also saved this comment to the Web Archive.

I told him that I made my decisions and didn't need any more of his input a few weeks before the 2/12 protest

This is not true.

We last spoke for a significant amount of time in October[1]. You mentioned you'd be down to do an LW dialogue.

After that, in November, you messaged me on Twitter, asking for the details of the OpenAI situation. I then messaged you on Christmas (nothing EA/AI related).

On January 31 (a few weeks before the protest), I shared my concerns about the messaging being potentially misleading. You asked for advice on how you should go about anticipating misunderstandings like that. You said that you won't say anything technically false and asked what solutions I propose. Among more specific things, I mentioned that "It seems generally good to try to be maximally honest and non-deceptive".

At or before this point, you didn't tell me anything about "not needing more of my input". And you didn't tell me anything like "I made my decisions".

On February 4th, I attended your EAG talk, and on February 5th, I messaged you that it was great and that I was glad you gave it.

Then, on February 6, you invited me to join your Twitter space as a speaker. I didn't get the chance to share my thoughts about allying with people promoting locally invalid views, and I shared them publicly on Twitter and privately with you on Messenger, including a quote I was asked to not share publicly until an LW dialogue is up. We chatted a little bit about general goals and allying with people with different views. You told me, "You have a lot of opinions and I’d be happy to see you organize your own protests". I replied, "Sure, I might if I think it’s useful:)" and after a message sharing my confusion regarding one of your previous messages, didn't share any more "opinions". This could be interpreted as indirectly telling me you didn't need my input, but that was a week after I shared my concerns about the messaging being misleading, and you still didn't tell me anything about having made your decisions.

A day before the protest, I shared a picture of a sign I wanted to bring to the protest, and asked "Hey, is it alright if I print and show up to the protest with something like this?", because I wouldn't want to attend the protest if you weren't ok with the sign. You replied that it was ok, shared some thoughts, and told me that you liked the sentiment as it supports the overall impression that you want, so I brought the sign to the protest:

After the protest, I shared the draft of this post with you. After a short conversation, you told me lots of things and blocked me, and I felt like I should expect retaliation if I published the post.

Please tell me if I'm missing something. If I'm not, you're either continuing to be inattentive to the facts, or you're directly lying.

I'd be happy to share all of our message exchanges with a third party, such as CEA Community Health, or share them publicly, if you agree to that.

  1. ^

    Before that, as I can tell from Google Docs edits and comments, you definitely spent more than an hour on the https://moratorium.ai text back in August; I'm thankful for that; I accepted four of your edits and found some of your feedback helpful. We've also had 4 calls, mostly about moratorium.ai and the technical problem (and my impression was that you found these calls helpful), and went on a walk in Berkeley, mostly discussing strategy. Some of what you told me during the walk was concerning.

To many community members, including protest participants, it’s pretty clear the messaging was deceptive.

A protest organiser is saying it was the curse of knowledge, but I sent them messages directly pointing out how people will see the messaging. As I mentioned elsewhere in the comments, I want to have a third party look at the messages exchanged between me and the protest organiser, if they agree.

Also, I expect many people to only skim through the post and look at the protest organiser’s initial engagement with it or a shortform post they made before I published the post; all of these make it seem like I’m saying the organiser intentionally made the mistake they then corrected: “I ran a successful protest at [company] yesterday. Before the night was over, Mikhail Samin, who attended the protest, sent me a document to review that accused me of what sounds like a bait and switch and deceptive practices because I made an error in my original press release (which got copied as a description on other materials) and apparently didn't address it to his satisfaction because I didn't change the theme of the event more radically or cancel it.”

That post and those comments have not been corrected to show that this is not what I’m talking about, even after the protest organiser understood what misleading messaging the post talks about.

Thanks Lucie. There certainly was miscommunication surrounding the draft of this post, but I don’t believe they didn’t understand, back at the end of January, that people could be misled.

No, they already spent at least hundreds of millions (and possibly more than a billion; I saw a $1.5b number somewhere) on the lawyers. The first thing the lawyers did, back in November 2022, was get paid

Do you want to create a market? I’d be happy to bet 1:70.

Not only are they returning all the customer money, they were also able to pay somewhere between hundreds of millions and $1.5b (couldn’t find an accurate number) to the lawyers. My impression is that they found a lot of unaccounted money lying around. It’s also not obvious what was the value of Anthropic

It seems that one of the protest organisers now says that what happened was what I described in the post as “Even the best-intending people are not perfect, often have some inertia, and aren't able to make sure the messaging they put out isn't misleading and fully propagate updates”.

I suggest focusing on what mechanisms can be designed to prevent misleading messaging from emerging in the future.

I wouldn’t care if people knew some number to some approximation and not fully. This is quite different from saying something that’s technically not false but creates a misleading impression you thought was more likely to get people to support your message.

I don’t want to be spending time this way, and would be happy if you found someone we’d both be comfortable with to read our message exchange and figure out how deceptive the protest messaging was.
