Ben Millwood🔸


This doesn't seem right to me, because I think it's popular among those concerned with the longer-term future to expect it to be populated by emulated humans, which is clearly not a continuation of humanity's genetic legacy. So I feel pretty confident that it's something else about humanity that people want to preserve against AI. (I'm not here to defend this particular vision of the future beyond noting that people like Holden Karnofsky have written about it, so it's not exactly niche.)

You say that expecting AI to have worse goals than humans would require studying things like what the empirically observed goals of AI systems turn out to be, and so on – sure, so in the absence of having done those studies, we should delay our replacement until they can be done. And doing those studies is undermined by the fact that our current ability to reliably determine what an AI is thinking is pretty poor, and the problem will only get harder as AIs develop their abilities to strategise and lie. Solving these problems would be a major part of what people are looking for in alignment research, and precisely the kind of thing it seems worth delaying AI progress for.

Another opportunity for me to shill my LessWrong writing posing this question: Should we exclude alignment research from LLM training datasets?

I don't have a lot of time to spend on this, but this post has inspired me to take a little time to figure out whether I can propose or implement some controls (likely: making posts visible to logged-in users only) in ForumMagnum (the software underlying the EA Forum, LessWrong, and the Alignment Forum).

edit: https://github.com/ForumMagnum/ForumMagnum/issues/10345
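For a rough sense of the shape such a control could take, here's a minimal TypeScript sketch. It is not ForumMagnum's actual code: the `Post`, `Viewer`, and `canViewPost` names are hypothetical, and a real implementation would hook into ForumMagnum's existing permissions layer rather than a standalone function.

```typescript
// Hypothetical illustration only: these types and names are made up for the
// sketch and do not correspond to ForumMagnum's real schema or API.
interface Viewer {
  loggedIn: boolean;
}

interface Post {
  title: string;
  // Per-post flag that an author or admin could set to hide the post from
  // logged-out readers (and, by extension, from most scrapers).
  restrictToLoggedInUsers: boolean;
}

// Decide whether a given viewer should be allowed to see a given post.
function canViewPost(post: Post, viewer: Viewer | null): boolean {
  if (!post.restrictToLoggedInUsers) return true;
  return viewer !== null && viewer.loggedIn;
}

// Example: a logged-out visitor is denied access to a restricted post.
const post: Post = { title: "Alignment notes", restrictToLoggedInUsers: true };
console.log(canViewPost(post, null));               // false
console.log(canViewPost(post, { loggedIn: true }));  // true
```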

I agree overall, but I want to add that becoming dependent on non-EA donors could put you under pressure to do more non-EA things and fewer EA things -- either party could pull the other towards its own priorities.

Keep in mind that you're not coercing them to switch their donations, just persuading them. That means you can use the fact that they were persuaded as evidence that you were on the right side of the argument. You being too convinced of your own opinion isn't a problem unless other people are also somehow too convinced of it, and I don't see why they would be.

I think that EA donors are likely to be unusual in this respect -- you're pre-selecting for people who have signed up for a culture of doing what's best even when it isn't what they previously thought was best.

I guess I'd also say that my arguments for animal welfare charities are at their heart EA-style arguments, so I get a big boost to my likelihood of persuading someone from knowing that they're the kind of person who appreciates EA-style arguments.

Similarly, if you think animal welfare charities are 10x as effective as global health charities, then you think these options are equally good:

  • Move 10 EA donors from global health to animal welfare
  • Add 9 new animal welfare donors who previously weren't donating at all

To me, the first of these sounds way easier.
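Spelling out the arithmetic, taking one global health donation as 1 unit of value (so one animal welfare donation is worth 10 units under the 10x assumption):

```latex
% Net gain from moving one donor from global health to animal welfare:
% they produce 10 units instead of the 1 unit they produced before.
\[ 10 - 1 = 9 \]
% Ten switched donors match nine brand-new animal welfare donors:
\[ 10 \times (10 - 1) = 90 = 9 \times 10 \]
```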

Thanks! (I slightly object to "the normal markdown syntax", since based on my quick reading neither John Gruber's original Markdown spec, nor the latest CommonMark spec, nor GitHub Flavoured Markdown has footnotes.)

FWIW the link to your forum post draft tells me "Sorry, you don't have access to this draft"

The onboarding delay is relevant because in the 80k case it happens twice: the 80k person has an onboarding delay, and then the people they cause to get hired have onboarding delays too.
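A toy illustration with made-up numbers (the six-month onboarding figure is purely an assumption for the sketch):

```latex
% Assume onboarding takes d = 6 months before someone is fully productive.
% Direct hire: one onboarding delay before impact.
\[ t_{\mathrm{direct}} = d = 6 \]
% Via 80k: the 80k staffer onboards first, and each hire they later cause
% has their own onboarding delay, so the delays stack.
\[ t_{\mathrm{via\ 80k}} \approx d + d = 12 \]
```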
