Eric Neyman

1296 karma


I'm a theoretical CS grad student at Columbia specializing in mechanism design. I write a blog called Unexpected Values which you can find here: https://ericneyman.wordpress.com/. My academic website can be found here: https://sites.google.com/view/ericneyman/.


(This comment is mostly cross-posted from Nuño's blog.)

In "Unflattering aspects of Effective Altruism", you write:

Third, I feel that EA leadership uses worries about the dangers of maximization to constrain the rank and file in a hypocritical way. If I want to do something cool and risky on my own, I have to beware of the “unilateralist curse” and “build consensus”. But if Open Philanthropy donates $30M to OpenAI, pulls a not-so-well-understood policy advocacy lever that contributed to the US overshooting inflation in 2021, funds Anthropic while Anthropic’s President and the CEO of Open Philanthropy were married, and romantic relationships are common between Open Philanthropy officers and grantees, that is ¿an exercise in good judgment? ¿a good ex-ante bet? ¿assortative mating? ¿presumably none of my business?

The claim that Open Philanthropy is hypocritical re: the unilateralist's curse doesn't quite make sense to me. To explain why, consider the following two scenarios.

Scenario 1: you and 999 other smart, thoughtful people each have a button. You know there are 1000 people with such a button. If anyone presses the button, all mosquitoes will disappear.

Scenario 2: you and you alone have a button. You know that you're the only person with such a button. If you press the button, all mosquitoes will disappear.

The unilateralist's curse applies to Scenario 1 but *not* Scenario 2. That's because, in Scenario 1, your estimate of the counterfactual impact of pressing the button should be your estimate of the expected utility of all mosquitoes disappearing, *conditioned on no one else pressing the button*. In Scenario 2, where no one else has the button, your estimate of the counterfactual impact of pressing the button should be your estimate of the (unconditional) expected utility of all mosquitoes disappearing.

So, at least the way I understand the term, the unilateralist's curse refers to the fact that taking a unilateral action is worse than it naively appears, *if other people also have the option of taking the unilateral action*.
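Here is a rough simulation of the two scenarios, as a sketch under assumptions that are mine and not part of the argument above: the true value of all mosquitoes disappearing is drawn from a standard normal prior, each button-holder sees that value plus independent standard normal noise, and a person wants to press exactly when their own estimate is positive. The function name is hypothetical, and I use 10 other button-holders rather than 999 so that the "no one else pressed" event is common enough to sample.

```python
import random

def mean_value_given_only_i_press(num_others, trials=200_000, seed=0):
    """Average true value over worlds where my noisy estimate is positive
    and none of the other button-holders' estimates is positive."""
    rng = random.Random(seed)
    total = 0.0
    count = 0
    for _ in range(trials):
        value = rng.gauss(0, 1)               # true value (assumed prior)
        my_estimate = value + rng.gauss(0, 1)  # my noisy read of the value
        if my_estimate <= 0:
            continue  # I would not want to press in this world
        # Would any of the other button-holders have pressed?
        someone_else_pressed = any(
            value + rng.gauss(0, 1) > 0 for _ in range(num_others)
        )
        if someone_else_pressed:
            continue  # my press is not counterfactual in this world
        total += value
        count += 1
    return total / count

# Scenario 2: I am the only one with a button.
print("alone:", mean_value_given_only_i_press(num_others=0))
# Scenario 1 (scaled down): 10 others could also have pressed, but none did.
print("with 10 others:", mean_value_given_only_i_press(num_others=10))
```

Under these assumptions, the average should come out positive when I'm alone and negative once I condition on everyone else having held back, which is the sense in which the unilateral action is worse than it naively appears.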


This relates to Open Philanthropy because, at the time of buying the OpenAI board seat, Dustin was one of the only billionaires approaching philanthropy with an EA mindset (maybe the only?). So he was sort of the only one with the "button" of having this option, in the sense of having considered the option and having the money to pay for it. So for him it just made sense to evaluate whether or not this action was net positive in expectation.

Now consider the case of an EA who is considering launching an organization with a potentially large downside, where the EA doesn't have some truly special resource or ability. (E.g., AI advocacy with inflammatory tactics -- think DxE for AI.) Many people could have started this organization, but no one did. And so, when deciding whether this org would be net positive, you have to condition on this observation.

Thanks for asking! The first thing I want to say is that I got lucky in the following respect. The set of possible outcomes isn't the interior of the ellipse I drew; rather, it is a bunch of points that are drawn at random from a distribution, and when you plot that cloud of points, it looks like an ellipse. The way I got lucky is: one of the draws from this distribution happened to be in the top-right corner. That draw is working at ARC theory, which has just about the most intellectually interesting work in the world (for my interests) and is also just about the most impactful place for me to work (given my skills and my models of what sort of work is impactful). I interned there for 4-5 months and I'll be starting there full-time soon!

Now for my report card, as for how well I checked in (in the ways listed in the post):

  • Writing the above post was useful in an interesting way: I formed some amount of identity around "I care about things besides impact" in a way that somewhat decreased value drift. (I endorse this, I think.) This manifested as me thinking a lot over the last year about whether I'm happy. Sometimes the answer was "not really"! But I noticed this and took steps toward fixing it. In particular, I noticed when I was in Berkeley last summer that I had a need for a social group that doesn't talk about maximizing impact all the time. This was super relevant to my criteria for choosing a living situation when I came back to Berkeley in October. I ended up choosing a "chill" group house, and I think that was the right choice.
  • I had the goal of keeping a monthly diary about my values. I updated it four times -- in June, July, October, and March -- and I think that captured most of the value. (I'm not sure that this was a particularly valuable intervention.)
  • Regarding the four specific non-EA things I cared about that I listed above:
    • Family and non-EA friends: I continue to be close with my family and remain similarly close with the non-EA friends I had at the time.
    • Puzzles and puzzle hunts: I continue caring about this. Empirically I haven't done many puzzle hunts over the last year, though that was more for a lack of good opportunities. I recently joined a new puzzle hunt team, so I might have more opportunities ahead!
    • Spending time in nature: yup, I continue to care about this. I went to Alaska for a few weeks last month and it was great.
    • Random statistical analyses: honestly, much less? Which I'm a bit sad about.
      • One interest that I had not listed, because I had mixed feelings about how much I endorsed it, was politics. I indeed care less about politics now (though I still care a decent amount).
  • I also picked up an interest -- I'm part of the Bayesian Choir! I've also been playing some small amount of tennis, for the first time since high school.
  • I didn't do any of the CFAR techniques, like focusing or internal double crux.

I'd say that this looks pretty good.


I do think that there are a couple of yellow flags, though:

  • I currently believe that the Berkeley EA community is unhealthy (I'm not sure whether to add the caveat "for me" or whether I think it's unhealthy, period). The main reason for this, I think, is that there's a status hierarchy. The way I sometimes put this is: if you asked me which of my friends in college are highest status, I would've been like "...what does that even mean, that question doesn't make sense". But unfortunately I think if you asked about people's status in this community, I'd often have thoughts. I have a theory that this comes out of having a large group of people with really similar values and goals. To elaborate on this: in college, everyone was pursuing their own thing and had their own values, which means that different people had very different standards for what it meant for someone to be cool. (There would have been way more status if, say, everyone were trying to be a member of some society; my impression is that this caused status dynamics in parts of my college that I didn't interact with.) In the Berkeley EA community, most people have pretty similar goals (such as furthering AI safety or having interesting conversations). If people agree on what's important then naturally they'll agree more on who's good at the important things (who's good at AI safety research, or who's good at having interesting conversations -- and by the way, there's way more agreement in the Berkeley EA community about what constitutes an interesting conversation than there is in college).
    • This theory would predict that political party organizations (the Democratic and Republican parties) have a strong social status hierarchy, since they mostly share the same goals (get the party into a position of power). If I learn that actually these organizations mostly don't have strong social status hierarchies, I'll retract my diagnosis.
  • I weakly think that something about the Berkeley EA community makes it harder for me to have original thoughts. Maybe it's that there's so much stuff going on that I don't spend very much time alone with my thoughts. Or maybe it's that there's more of a "party line" about the right takes, in a way that discourages free-thinking. Or maybe it's that people in this community really like talking about some things but not other things, and this implicitly discourages thinking about the "other things".

I haven't figured out how to navigate this. These may be genuine trade-offs -- a case where I can't both work at ARC and be immune from these downsides -- or maybe I'll learn to deal with the downsides over time. I do think that the benefits of my decision to work at ARC are worth the costs for me, though.

Thanks -- I should have been a bit more careful with my words when I wrote that "measurement noise likely follows a distribution with fatter tails than a log-normal distribution". The distribution I'm describing is your subjective uncertainty over the standard error of your experimental results. That is, you're (perhaps reasonably) modeling your measurement as being the true quality plus some normally distributed noise. But -- normal with what standard deviation? There's an objectively right answer that you'd know if you were omniscient, but you don't, so instead you have a subjective probability distribution over the standard deviation, and that's what I was modeling as log-normal.

I chose the log-normal distribution because it's a natural choice for the distribution of an always-positive quantity. But something more like a power law might've been reasonable too. (In general I think it's not crazy to guess that the standard error of your measurement is proportional to the size of the effect you're trying to measure -- in which case, if your uncertainty over the size of the effect follows a power law, then so would your uncertainty over the standard error.)

(I think that for something as clean as a well-set-up experiment with independent trials of a representative sample of the real world, you can estimate the standard error well, but I think the real world is sufficiently messy that this is rarely the case.)
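To make the "normal noise, but with an uncertain standard deviation" picture concrete, here is a small sketch. The parameters are illustrative assumptions of mine: the log of the standard deviation is standard normal, and I compare against plain normal noise with the same overall variance. The helper name is hypothetical.

```python
import math
import random

def tail_fraction(samples, threshold):
    """Fraction of samples whose absolute value exceeds the threshold."""
    return sum(abs(x) > threshold for x in samples) / len(samples)

rng = random.Random(0)
n = 200_000

# Noise whose standard deviation is itself uncertain: draw the standard
# deviation from a log-normal (assumed parameters mu=0, sigma=1), then
# draw normal noise with that standard deviation.
mixed_noise = [rng.gauss(0, rng.lognormvariate(0, 1)) for _ in range(n)]

# With these parameters E[sd^2] = e^2, so a normal with standard
# deviation e has the same variance as the mixture above.
matched_sd = math.e
plain_noise = [rng.gauss(0, matched_sd) for _ in range(n)]

threshold = 4 * matched_sd  # "four standard deviations out"
print("plain normal tail fraction:", tail_fraction(plain_noise, threshold))
print("uncertain-sd tail fraction:", tail_fraction(mixed_noise, threshold))
```

The point of the comparison is that, once you average over your uncertainty about the standard deviation, extreme measurement errors are much more common than a single normal distribution with the same variance would suggest.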

Let's take the very first scatter plot. Consider the following alternative way of labeling the x and y axes. The y-axis is now the quality of a health intervention, and it consists of two components: short-term effects and long-term effects. You do a really thorough study that perfectly measures the short-term effects, while the long-term effects remain unknown to you. The x-value is what you measured (the short-term effects); the actual quality of the intervention is the x-value plus some unknown number with mean zero and variance 1.

So whereas previously (i.e. in the setting I actually talk about), we have E[measurement | quality] = quality (I'm calling this the frequentist sense of "unbiased"), now we have E[quality | measurement] = measurement (what I call the Bayesian sense of "unbiased").
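A quick numerical sketch of the two senses, assuming (this is my choice for illustration) that everything is Gaussian with unit variance, so that each conditional expectation is a straight line through the origin and its slope can be read off a least-squares fit. The variable and function names are hypothetical.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

rng = random.Random(0)
n = 100_000

# Setting A (the original post): measurement = quality + noise.
quality_a = [rng.gauss(0, 1) for _ in range(n)]
measurement_a = [q + rng.gauss(0, 1) for q in quality_a]

# Setting B (the relabeled axes): quality = measurement + unknown part.
measurement_b = [rng.gauss(0, 1) for _ in range(n)]
quality_b = [m + rng.gauss(0, 1) for m in measurement_b]

# A slope near 1 means "unbiased" in that direction of conditioning.
print("A: E[measurement | quality] slope:", slope(quality_a, measurement_a))
print("A: E[quality | measurement] slope:", slope(measurement_a, quality_a))
print("B: E[measurement | quality] slope:", slope(quality_b, measurement_b))
print("B: E[quality | measurement] slope:", slope(measurement_b, quality_b))
```

Under these assumptions, setting A should give a slope near 1 for E[measurement | quality] but near 0.5 for E[quality | measurement], so you should shrink the measurement toward the prior mean, while setting B flips the two: the measurement can be taken at face value as an estimate of quality.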

Great question -- you absolutely need to take that into account! You can only bargain with people who you expect to uphold the bargain. This probably means that when you're bargaining, you should weight "you in other worlds" in proportion to how likely they are to uphold the bargain. This seems really hard to think about and probably ties in with a bunch of complicated questions around decision theory.

This is probably my favorite proposal I've seen so far, thanks!

I'm a little skeptical that warnings from the organization you propose would have been heeded (especially by people who had no other sources of funding, so that relying on FTX was their only option), but perhaps if the organization had sufficient clout, this would have put pressure on FTX to engage in less risky business practices.

I think this fails (1), but more confidently, I'm pretty sure it fails (2). How are you going to keep individuals from taking crypto money? See also: https://forum.effectivealtruism.org/posts/Pz7RdMRouZ5N5w5eE/ea-should-taboo-ea-should

I think my crux with this argument is "actions are taken by individuals". This is true, strictly speaking; but when e.g. a member of U.S. Congress votes on a bill, they're taking an action on behalf of their constituents, and affecting the whole U.S. (and often world) population. I like to ground morality in questions of a political philosophy flavor, such as: "What is the algorithm that we would like legislators to use to decide which legislation to support?". And as I see it, there's no way around answering questions like this one, when decisions have significant trade-offs in terms of which people benefit.

And often these trade-offs need to deal with population ethics. Imagine, as a simplified example, that China is about to deploy an AI that has a 50% chance of killing everyone and a 50% chance of creating a flourishing future of many lives like the one many longtermists like to imagine. The U.S. is considering deploying its own "conservative" AI, which we're pretty confident is safe, and which will prevent any other AGIs from being built but won't do much else (so humans might be destined for a future that looks like a moderately improved version of the present). Should the U.S. deploy this AI? It seems like we need to grapple with population ethics to answer this question.

(And so I also disagree with "I can’t imagine a reasonable scenario in which I would ever have the power to choose between such worlds", insofar as you'll have an effect on what we choose, either by voting or more directly than that.)

Maybe you'd dispute that this is a plausible scenario? I think that's a reasonable position, though my example is meant to point at a cluster of scenarios involving AI development. (Abortion policy is a less fanciful example: I think any opinion on the question built on consequentialist grounds needs to either make an empirical claim about counterfactual worlds with different abortion laws, or else wrestle with difficult questions of population ethics.)

I guess I have two reactions. First, which of the categories are you putting me in? My guess is you want to label me as a mop, but "contribute as little as they reasonably can in exchange" seems an inaccurate description of someone who's strongly considering devoting their career to an EA cause; also I really enjoy talking about the weird "new things" that come up (like idk actually trade between universes during the long reflection).

My second thought is that while your story about social gradients is a plausible one, I have a more straightforward story about who EA should accept, which I like more. My story is: EA should accept/reward people in proportion to (or rather, as a monotone increasing function of) how much good they do.* For a group that tries to do the most good, this pretty straightforwardly incentivizes doing good! Sure, there are secondary cultural effects to consider -- but I do think they should be thought of as secondary to doing good.

*You can also reward trying to do good to the best of each person's ability. I think there's a lot of merit to this approach, but it might create some not-great incentives of the form "always looking like you're trying" (regardless of whether you really are trying effectively).
