Will Aldred


Epistemic status: strong opinions, lightly held

I remember a time when an org was criticized, and a board member commented defending the org. But the board member was factually wrong about at least one claim, and the org then needed to walk back wrong information. It would have been clearer and less embarrassing for everyone if they’d all waited a day or two to get on the same page and write a response with the correct facts.

I guess it depends on the specifics of the situation, but, to me, the case described, of a board member making one or two incorrect claims (in a comment that presumably also had a bunch of accurate and helpful content) that they needed to walk back sounds… not that bad? Like, it seems only marginally worse than their comment being fully accurate the first time round, and far better than them never writing a comment at all. (I guess the exception to this is if the incorrect claims had legal ramifications that couldn’t be undone. But I don’t think that’s true of the case you refer to?)

A downside is that if an organization isn’t prioritizing back-and-forth with the community, of course there will be more mystery and more speculations that are inaccurate but go uncorrected. That’s frustrating, but it’s a standard way that many organizations operate, both in EA and in other spaces.

I don’t think the fact that this is a standard way for orgs to act in the wider world says much about whether this should be the way EA orgs act. In the wider world, an org’s purpose is to make money for its shareholders: the org has no ‘teammates’ outside of itself; no-one really expects the org to try hard to communicate what it is doing (outside of communicating well being tied to profit); no-one really expects the org to care about negative externalities. Moreover, withholding information can often give an org a competitive advantage over rivals.

Within the EA community, however, there is a shared sense that we are all on the same team (I hope): there is a reasonable expectation for cooperation; there is a reasonable expectation that orgs will take into account externalities on the community when deciding how to act. For example, if communicating some aspect of EA org X’s strategy would take half a day of staff time, I would hope that the relevant decision-maker at org X takes into account not only the cost and benefit to org X of whether or not to communicate, but also the cost/benefit to the wider community. If half a day of staff time helps others in the community better understand org X’s thinking,[1] such that, in expectation, more than half a day of (quality-adjusted) productive time is saved (through, e.g., community members making better decisions about what to work on), then I would hope that org X chooses to communicate.

When I see public comments about the inner workings of an organization by people who don’t work there, I often also hear other people who know more about the org privately say “That’s not true.” But they have other things to do with their workday than write a correction to a comment on the Forum or LessWrong, get it checked by their org’s communications staff, and then follow whatever discussion comes from it.

I would personally feel a lot better about a community where employees aren’t policed by their org on what they can and cannot say. (This point has been debated before—see saulius and Habryka vs. the Rethink Priorities leadership.) I think such policing leads to chilling effects that make everyone in the community less sane and less able to form accurate models of the world. Going back to your example, if there was no requirement on someone to get their EAF/LW comment checked by their org’s communications staff, then that would significantly lower the time/effort barrier to publishing such comments, and then the whole argument around such comments being too time-consuming to publish becomes much weaker.


All this to say: I think you’re directionally correct with your closing bullet points. I think it’s good to remind people of alternative hypotheses. However, I push back on the notion that we must just accept the current situation (in which at least one major EA org has very little back-and-forth with the community)[2]. I believe that with better norms, we wouldn’t have to put as much weight on bullets 2 and 3, and we’d all be stronger for it.

  1. ^

    Or, rather, what staff at org X are thinking. (I don’t think an org itself can meaningfully have beliefs: people have beliefs.)

  2. ^

    Note: Although I mentioned Rethink Priorities earlier, I’m not thinking about Rethink Priorities here.

the actions he [SBF] was convicted of are nearly universally condemned by the EA community

I don’t think that observing lots of condemnation and little support is all that much evidence for the premise you take as given—that SBF’s actions were near-universally condemned by the EA community—compared to meaningfully different hypotheses like “50% of EAs condemned SBF’s actions.”

There was, and still is, a strong incentive to hide any opinion other than condemnation (e.g., support, genuine uncertainty) over SBF’s fraud-for-good ideology, out of legitimate fear of becoming a witch-hunt victim. By the law of prevalence, I therefore expect the number of EAs who don’t fully condemn SBF’s actions to be far greater than the number who publicly express opinions other than full condemnation.

(Note: I’m focusing on the morality of SBF’s actions, and not on executional incompetence.)

Anecdotally, of the EAs I’ve spoken to about the FTX collapse with whom I’m close—and who therefore have less incentive to hide what they truly believe from me—I’d say that between a third and a half fall into the genuinely uncertain camp (on the moral question of fraud for good causes), while the number in the support camp is small but not zero.[1]

  1. ^

    And of those in my sample in the condemn camp, by far the most commonly-cited reason is timeless decision theory / pre-committing to cooperative actions, which I don’t think is the kind of reason one jumps to when one hears that EAs condemn fraud for good-type thinking.

Importance of the digital minds stuff compared to regular AI safety; how many early-career EAs should be going into this niche? What needs to happen between now and the arrival of digital minds? In other words, what kind of a plan does Carl have in mind for making the arrival go well? Also, since Carl clearly has well-developed takes on moral status, what criteria he thinks could determine whether an AI system deserves moral status, and to what extent.

Additionally—and this one's fueled more by personal curiosity than by impact—Carl's beliefs on consciousness. Like Wei Dai, I find the case for anti-realism as the answer to the problem of consciousness weak, yet this is Carl's position (according to this old Brian Tomasik post, at least), and so I'd be very interested to hear Carl explain his view.

Thank you for engaging. I don’t disagree with what you’ve written; I think you have interpreted me as implying something stronger than what I intended, and so I’ll now attempt to add some colour.

That Emily and other relevant people at OP have not fully adopted Rethink’s moral weights does not puzzle me. As you say, to expect that is to apply an unreasonably high funding bar. I am, however, puzzled that Emily and co. appear to have not updated at all towards Rethink’s numbers. At least, that’s the way I read:

  • We don’t use Rethink’s moral weights.
    • Our current moral weights, based in part on Luke Muehlhauser’s past work, are lower. We may update them in the future; if we do, we’ll consider work from many sources, including the arguments made in this post.

If OP has not updated at all towards Rethink’s numbers, then I see three possible explanations, all of which I find unlikely, hence my puzzlement. First possibility: the relevant people at OP have not yet given the Rethink report a thorough read, and have therefore not updated. Second: the relevant OP people have read the Rethink report, and have updated their internal models, but have not yet gotten around to updating OP’s actual grantmaking allocation. Third: OP believes the Rethink work is low quality or otherwise critically corrupted by one or more errors. I’d be very surprised if the first or second explanation is true, given how moral weight is arguably the most important consideration in neartermist grantmaking allocation. I’d also be surprised if the third is true, given how well Rethink’s moral weight sequence has been received on this forum (see, e.g., comments here and here).[1] OP people may disagree with Rethink’s approach at the independent impression level, but surely, given that Rethink’s moral weights work is the most extensive work done on this topic by anyone(?), the Rethink results should be given substantial weight—or at least non-trivial weight—in their all-things-considered views?

(If OP people believe there are errors in the Rethink work that render the results ~useless, then, considering the topic’s importance, I think some sort of OP write-up would be well worth the time. Both at the object level, so that future moral weight researchers can avoid making similar mistakes, and to allow the community to hold OP’s reasoning to a high standard, and also at the meta level, so that potential donors can update appropriately re. Rethink’s general quality of work.)

Additionally—and this is less important—I’m puzzled, at the meta level, by the way we’ve arrived here. As noted in the top-level post, Open Phil has been less than wholly open about its grantmaking, and it’s taken a pretty not-on-the-default-path sequence of events—Ariel, someone who’s not affiliated with OP and who doesn’t work on animal welfare for their day job, writing this big post; Emily from OP replying to the post and to a couple of the comments; me, a Forum-goer who doesn’t work on animal welfare, spotting an inconsistency in Emily’s replies—to surface the fact that OP does not give Rethink’s moral weights any weight.

  1. ^

    Edited to add: Carl has left a detailed reply below, and it seems that the third explanation is, in fact, what has happened.

Here, you say, “Several of the grants we’ve made to Rethink Priorities funded research related to moral weights.” Yet in your initial response, you said, “We don’t use Rethink’s moral weights.” I respect your tapping out of this discussion, but at the same time I’d like to express my puzzlement as to why Open Phil would fund work on moral weights to inform grantmaking allocation, and then not take that work into account.

The "EA movement", however you define it, doesn't get to control the money and there are good reasons for this.

I disagree, for the same reasons as those given in the critique to the post you cite. Tl;dr: Trades have happened, in EA, where many people have cast aside careers with high earning potential in order to pursue direct work. I think these people should get a say over where EA money goes.

Directionally, I agree with your points. On the last one, I’ll note that counting person-years (or animal-years) falls naturally out of empty individualism as well as open individualism, and so the point goes through under the (substantively) weaker claim of “either open or empty individualism is true”.[1]

(You may be interested in David Pearce’s take on closed, empty, and open individualism.)

  1. ^

    For the casual reader: The three candidate theories of personal identity are empty, open, and closed individualism. Closed is the common-sense view, but most people who have thought seriously about personal identity—e.g., Parfit—have concluded that it must be false (tl;dr: because nothing, not even memory, can “carry” identity in the way that's needed for closed individualism to make sense). Of the remaining two candidates, open appears to be the fringe view—supporters include Kolak, Johnson, Vinding, and Gomez-Emilsson (although Kolak's response to Cornwall makes it unclear to what extent he is indeed a supporter). Proponents of (what we now call) empty individualism include Parfit, Nozick, Shoemaker, and Hume.

There was near-consensus that Open Phil should generously fund promising AI safety community/movement-building projects they come across

Would you be able to say a bit about to what extent members of this working group have engaged with the arguments around AI safety movement-building potentially doing more harm than good? For instance, points 6 through 11 of Oli Habryka's second message in the “Shutting Down the Lightcone Offices” post (link). If they have strong counterpoints to such arguments, then I imagine it would be valuable for these to be written up.

(Probably the strongest response I've seen to such arguments is the post “How MATS addresses ‘mass movement building’ concerns”. But this response is MATS-specific and doesn't cover concerns around other forms of movement building, for example, ML upskilling bootcamps or AI safety courses operating through broad outreach.)

I enjoyed this post, thanks for writing it.

Is there any crucial consideration I’m missing? For instance, are there reasons to think agents/civilizations that care about suffering might – in fact – be selected for and be among the grabbiest?

I think I buy your overall claim in your “Addressing obvious objections” section that there is little chance of agents/civilizations who disvalue suffering (hereafter: non-PUs) winning a colonization race against positive utilitarians (PUs). (At least, not without causing equivalent expected suffering.) However, my next thought is that non-PUs will generally work this out, as you have, and that some fraction of technologically advanced non-PUs—probably mainly those who disvalue suffering the most—might act to change the balance of realized upside- vs. downside-focused values by triggering false vacuum decay (or by doing something else with a similar switching-off-a-light-cone effect).

In this way, it seems possible to me that suffering-focused agents will beat out PUs. (Because there’s nothing a PU agent—or any agent, for that matter—can do to stop a vacuum decay bubble.) This would reverse the post’s conclusion. Suffering-focused agents may in fact be the grabbiest, albeit in a self-sacrificial way.

(It also seems possible to me that suffering-focused agents will mostly act cooperatively, only triggering vacuum decays at a frequency that matches the ratio of upside- vs. downside-focused values in the cosmos, according to their best guess for what the ratio might be.[1] This would neutralize my above paragraph as well as the post's conclusion.)

  1. ^

    My first pass at what this looks like in practice, from the point of view of a technologically advanced, suffering-focused (or perhaps non-PU more broadly) agent/civilization: I consider what fraction of agents/civilizations like me should trigger vacuum decays in order to realize the cosmos-wide values ratio. Then, I use a random number generator to tell me whether I should switch off my light cone.

    Additionally, one wrinkle worth acknowledging is that some universes within the inflationary multiverse, if indeed it exists and allows different physics in different universes, are not metastable. PUs likely cannot be beaten out in these universes, because vacuum decays cannot be triggered. Nonetheless, this can be compensated for through suffering-focused/non-PU agents in metastable universes triggering vacuum decays at a correspondingly higher frequency.
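For concreteness, the frequency-matching rule sketched in the first footnote can be written out as a toy decision procedure. This is purely illustrative: the function name, the weight parameters, and the use of a seeded random number generator are all my own assumptions, not anything from the post.

```python
import random

def should_switch_off_light_cone(upside_weight, downside_weight, rng=None):
    """Toy frequency-matching rule (illustrative assumption, not from the post):
    a suffering-focused civilization triggers vacuum decay with probability
    equal to its best-guess cosmos-wide fraction of downside-focused values,
    so that across many such civilizations the realized outcomes match the
    guessed ratio."""
    rng = rng or random.Random()
    # Fraction of total value-weight that is downside-focused.
    p = downside_weight / (upside_weight + downside_weight)
    # "Use a random number generator to tell me whether I should switch
    # off my light cone": trigger with probability p.
    return rng.random() < p
```

Run many times with a 50/50 values ratio, roughly half of such civilizations would switch off their light cones, matching the cooperative behaviour described above.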

This is a good post; I’m happy it exists. One thing I notice, which I find a little surprising, is that the post doesn’t seem to include what I'd consider the classic example of controlling the past: evidentially cooperating with beings/civilizations that existed in past cycles of the universe.[1]

  1. ^

    This example does rely on a cyclic (e.g., Big Bounce) model of cosmology,^ which has a couple of issues. Firstly, that such a cosmological model is much less likely to be true, in my all-things-considered view, than eternal inflation. Secondly, that within a cyclic model, there isn't a clearly meaningful notion of time across cycles. However, I don't think these issues undercut the example. Controlling faraway events through evidential cooperation is no less possible in an eternally inflating multiverse, it's just that space is doing more of the work now than time (which makes it a less classic example for controlling the past). Also, while to an observer within a cycle, the notion of time outside their cycle may not hold much meaning, I think that from a God's eye view, there is a material sense in which the cycles occur sequentially, with some in the past of others.

    In addition, the example can be adapted, I believe, to fit the simulation hypothesis. Sequential universe cycles become sequential simulation runs,* and the God’s eye view is now the point of view of the beings in the level of reality one above ours, whether that be base reality or another simulation.  *(It seems likely to me that simulation runs would be massively, but not entirely, parallelized. Moreover, even if runs are entirely parallelized, it would be physically impossible—so long as the level-above reality has physical laws that remotely resemble ours—for two or more simulations to happen in the exact same spatial location. Therefore, there would be frames of reference in the base reality from which some simulation runs take place in the past of others.)

    ^ (One type of cyclic model, conformal cyclic cosmology, allows causal as well as evidential influence between universes, though in this model one universe can only causally influence the next one(s) in the sequence (i.e., causally controlling the past is not possible). For more on this, see "What happens after the universe ends?".)
