riceissa

I was indeed simplifying; for example, I probably should have said "global catastrophe" instead of "human extinction" to cover cases like permanent totalitarian regimes. I think some of the scenarios you mention could happen, but I think a bunch of them are pretty unlikely, and I disagree with your conclusion that "The bulk of the probability lies somewhere in the middle". I might be up for discussing more specifics, but I don't get the sense that disagreement here is a crux for either of us, so I'm not sure how much value there would be in continuing down this thread.

Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope

riceissa2y6

I agree with most of the points in this post (AI timelines might be quite short; probability of doom given AGI in a world that looks like our current one is high; there isn't much hope for good outcomes for humanity unless AI progress is slowed down somehow). I will focus on one of the parts where I think I disagree and which feels like a crux for me on whether advocating an AI pause (in its current form) is a good idea.

You write:

"But we can still have all the nice things (including a cure for ageing) without AGI; it might just take a bit longer than hoped. We don’t need to be risking life and limb driving through red lights just to be getting to our dream holiday a few minutes earlier."

I think framings like these do a misleading thing where they use the word "we" to ambiguously refer to both "humanity as a whole" and "us humans who are currently alive". The "we" that decides how much risk to take is the humans currently alive, but the "we" that enjoys the dream holiday might be humans millions of years in the future.

I worry that "AI pause" is not being marketed honestly to the public. If people like Wei Dai are right (and I currently think they are), then AI development may need to be paused for potentially millions of years, and it's unclear how long it will take unaugmented or only mildly augmented humans to reach longevity escape velocity.

So to a first approximation, the choice available to humans currently alive is something like:

Option A: 10% chance utopia within our lifetime (if alignment turns out to be easy) and 90% human extinction
Option B: ~100% chance death but then our descendants probably get to live in a utopia

For philosophy nerds with low time preference and altruistic tendencies (into which I classify many EA people and also myself), Option B may seem obvious. But I think many humans existing today would rather risk it and just try to build AGI now, rather than doing any AI pause. To the extent that they say they prefer a pause, I think they are being deceived by the marketing or acting under the Caplanian Principle of Normality, or else they are somehow better philosophers than I expected they would be.

(Note: if you are so pessimistic about aligning AI without a pause that your probability on that is lower than the probability of unaugmented present-day humans reaching longevity escape velocity, then Option B does seem like a strictly better choice. But the older and more unhealthy you are, the less this applies to you personally.)
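The comparison in this note can be written out as a rough sketch. It looks only at the selfish question of whether a person alive today personally reaches utopia. The function names and the specific probabilities are illustrative assumptions, not estimates from the comment (except the 10% figure in Option A).

```python
def personal_chance_of_utopia(p_alignment_easy, p_lev_during_pause):
    """Chance that a currently living person reaches utopia under each option.

    p_alignment_easy: probability AGI built now goes well (10% in Option A).
    p_lev_during_pause: hypothetical probability that unaugmented humans reach
        longevity escape velocity within this person's lifetime during a pause.
    """
    return {
        "A: build AGI now": p_alignment_easy,
        "B: pause": p_lev_during_pause,
    }


def selfishly_preferred(p_alignment_easy, p_lev_during_pause):
    # Pick the option with the higher personal chance; ties go to Option A.
    chances = personal_chance_of_utopia(p_alignment_easy, p_lev_during_pause)
    return max(chances, key=chances.get)


# Option A at 10%, with the "~100% chance death" of Option B written as a 1% chance of LEV.
print(selfishly_preferred(0.10, 0.01))
# The pessimistic case from the note: alignment without a pause is less likely than LEV.
print(selfishly_preferred(0.005, 0.01))
```

An older or less healthy person would plug in a smaller second argument, which is why the note applies to them less.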

RyanCarey's Quick takes

riceissa2y10

I've wondered about this for independent projects and there's some previous discussion here.

See also the shadows of the future term that Michael Nielsen uses.

Shapley values: Better than counterfactuals

riceissa2y2

"I think a general and theoretically sound approach would be to build a single composite game to represent all of the games together"

Yeah, I did actually have this thought but I guess I turned it around and thought: shouldn't an adequate notion of value be invariant to how I decide to split up my games? The linearity property on Wikipedia even seems to be inviting us to just split games up in whatever manner we want.
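For concreteness, here is a minimal sketch of what linearity says. The two games and the player names are made up for illustration. Computing Shapley values for each game separately and adding them gives the same answer as computing them for the summed game.

```python
from fractions import Fraction
from itertools import permutations


def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all joining orders."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += Fraction(value(grown) - value(coalition))
            coalition = grown
    return {p: t / len(orders) for p, t in totals.items()}


def game_one(coalition):
    # Hypothetical game: worth 1 only when both "a" and "b" take part.
    return 1 if {"a", "b"} <= coalition else 0


def game_two(coalition):
    # Hypothetical game: value grows with the square of the coalition size.
    return len(coalition) ** 2


def game_sum(coalition):
    # The composite game obtained by adding the two characteristic functions.
    return game_one(coalition) + game_two(coalition)


players = ("a", "b", "c")
first = shapley(players, game_one)
second = shapley(players, game_two)
combined = shapley(players, game_sum)

for p in players:
    print(p, first[p] + second[p], combined[p])  # the two columns agree
```

This only shows that splitting one game into additive pieces is harmless. It does not say anything about whether the pieces were the right games to count in the first place.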

And yeah, I agree that in the real world games will overlap and so there will be double counting going on by splitting games up. But if that's all that's saving us from reaching absurd conclusions then I feel like there ought to be some refinement of the Shapley value concept...

Shapley values: Better than counterfactuals

riceissa2y2

I asked my question because the problem with infinities seems unique to Shapley values (e.g. I don't have this same confusion about the concept of "marginal value added"). Even with a small population, the number of cooperative games seems infinite: for example, there are an infinite number of mathematical theorems that could be proven, an infinite number of Wikipedia articles that could be written, an infinite number of films that could be made, etc. If we just use "marginal value added", the total value any single person adds is finite across all such cooperative games because in the actual world, they can only do finitely many things. But the Shapley value doesn't look at just the "actual world", it seems to look at all possible sequences of ways of adding people to the grand coalition and then averages the value, so people get non-zero Shapley value assigned to them even if they didn't do anything in the "actual world".
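A small sketch of the contrast, using a made-up game in which an article is worth one unit and any one of three hypothetical writers could have written it. Along the order that actually happened, only the first writer adds value. Averaged over all orders, everyone gets a share.

```python
from fractions import Fraction
from itertools import permutations


def article_value(coalition):
    # Hypothetical game: the article gets written (value 1) if anyone at all takes part.
    return 1 if coalition else 0


def marginal_along(order, value):
    """Marginal value each player adds when players join in one specific order."""
    added, coalition = {}, frozenset()
    for player in order:
        grown = coalition | {player}
        added[player] = Fraction(value(grown) - value(coalition))
        coalition = grown
    return added


def shapley(players, value):
    """Average of marginal_along over every possible joining order."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        for player, gain in marginal_along(order, value).items():
            totals[player] += gain
    return {p: t / len(orders) for p, t in totals.items()}


writers = ("alice", "bob", "carol")
actual = marginal_along(writers, article_value)  # alice wrote it; bob and carol did nothing
averaged = shapley(writers, article_value)
print(actual)
print(averaged)
```

Here bob and carol have zero marginal value in the actual order but a Shapley value of 1/3 each, which is the behavior I find confusing.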

(There's maybe some sort of "compactness" argument one could make that even if there are infinitely many games, in the real world only finitely many of them get played to completion and so this should restrict the total Shapley value any single person can get, but I'm just trying to go by the official definition for now.)

Shapley values: Better than counterfactuals

riceissa2y2

I don't think the example you give addresses my point. I am supposing that Leibniz could have also invented calculus, so . But Leibniz could have also invented lots of different things (infinitely many things!), and his claim to each invention would be valid (although in the real world he only invents finitely many things). If each invention is worth at least a unit of value, his Shapley value across all inventions would be infinite, even if Leibniz was "maximally unlucky" and in the actual world got scooped every single time and so did not invent anything at all.
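To make the worry concrete, here is a rough sketch under simplifying assumptions of my own: each invention is a separate two-player game between Leibniz and one rival, and the invention is worth exactly one unit as long as at least one of them is present. Leibniz then gets half a unit per invention, and the total grows without bound as more such games are counted.

```python
from fractions import Fraction
from itertools import permutations


def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all joining orders."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += Fraction(value(grown) - value(coalition))
            coalition = grown
    return {p: t / len(orders) for p, t in totals.items()}


def invention_game(coalition):
    # Assumed game: one unit of value if at least one capable inventor is present.
    return 1 if coalition else 0


per_game = shapley(("Leibniz", "Newton"), invention_game)


def total_across_games(n_games):
    # Adding up n separately counted copies of the same game.
    return n_games * per_game["Leibniz"]


for n in (1, 10, 1000):
    print(n, total_across_games(n))
```

Nothing in the calculation depends on who got there first in the actual world, so being scooped every time does not reduce the total.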

I don't understand the part about self-modifications - can you spell it out in more words/maybe give an example?

Shapley values: Better than counterfactuals

riceissa2y2

Disagree-voting a question seems super aggressive and also nonsensical to me. (Yes, my comment did include some statements as well, but they were all scaffolding to present my confusion. I wasn't presenting my question as an opinion, as my final sentence makes clear.) I've been unhappy with the way the EA Forum has been going for a long time now, but I am noting this as a new kind of low.

Shapley values: Better than counterfactuals

riceissa2y2

What numerator and denominator? I am imagining that a single person could be a player in multiple cooperative games. The Shapley value for the person would be finite in each game, but if there are infinitely many games, the sum of all the Shapley values (adding across all games, not adding across all players in a single game) could be infinite.
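Written out, with notation that is my own shorthand: let v_k be the k-th game, let phi_i(v_k) be person i's Shapley value in that game, and assume a fixed lower bound c on their share in every game.

```latex
\phi_i(v_k) \ge c > 0 \ \text{for all } k
\quad\Longrightarrow\quad
\sum_{k=1}^{n} \phi_i(v_k) \;\ge\; n c \;\longrightarrow\; \infty
\quad \text{as } n \to \infty .
```

Each term is finite, but the sum across games is not. If the shares shrank quickly enough from game to game the sum could still converge, so the lower bound c is doing the work here.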

Shapley values: Better than counterfactuals

riceissa2y13

Example 7 seems wild to me. If the applicants who don't get the job also get some of the value, does that mean people are constantly collecting Shapley value from the world, just because they "could" have done a thing (even if they do absolutely nothing)? If there are an infinite number of cooperative games going on in the world and someone can plausibly contribute at least a unit of value to any one of them, then it seems like their total Shapley value across all games is infinite, and at that point it seems like they are as good as one can be, all without having done anything. I can't tell if I'm making some sort of error here or if this is just how the Shapley value works.
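Here is a toy version of what I have in mind. This is my own simplified hiring game, not necessarily the setup of Example 7: one employer, two interchangeable applicants, and one unit of value created if the employer is matched with at least one applicant.

```python
from fractions import Fraction
from itertools import permutations


def shapley(players, value):
    """Exact Shapley values: average marginal contribution over all joining orders."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += Fraction(value(grown) - value(coalition))
            coalition = grown
    return {p: t / len(orders) for p, t in totals.items()}


def hiring_value(coalition):
    # Assumed game: the job produces 1 unit if the employer and any applicant are present.
    has_applicant = any(p.startswith("applicant") for p in coalition)
    return 1 if "employer" in coalition and has_applicant else 0


players = ("employer", "applicant_hired", "applicant_rejected")
shares = shapley(players, hiring_value)
for name, share in shares.items():
    print(name, share)
```

The game itself cannot tell the two applicants apart, so the one who was rejected in the actual world gets the same share as the one who was hired (1/6 each, with 2/3 going to the employer).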

Reminding myself just how awful pain can get (plus, an experiment on myself)

riceissa2y9

"Do you know of any ways I could experimentally expose myself to extreme amounts of pleasure, happiness, tranquility, and truth?"

I'm not aware of any way to expose yourself to extreme amounts of pleasure, happiness, tranquility, and truth that is cheap, legal, time efficient, and safe. That's part of the point I was trying to make in my original comment. If you're willing to forgo some of those requirements, then as Ian/Michael mentioned, for pleasure and tranquility I think certain psychedelics (possibly illegal depending on where you live, possibly unsafe, and depending on your disposition/luck may be a terrible idea) and meditation practices (possibly expensive, takes a long time, possibly unsafe) could be places to look into. For truth, maybe something like "learning all the fields and talking to all the people out there" (expensive, time-consuming, and probably unsafe/distressing), though I realize that's a pretty unhelpful suggestion.

"I'd be willing to expose myself to whatever you suggest, plus extreme suffering, to see if this changes my mind. Or we can work together to design a different experimental setup if you think that would produce better evidence."

I appreciate the offer, and think it's brave/sincere/earnest of you (not trying to be snarky/dismissive/ironic here - I really wish more people had more of this trait that you seem to possess). My current thinking though is that humans need quite a benign environment in order to stay sane and be able to introspect well on their values (see discussion here, where I basically agree with Wei Dai), and that extreme experiences in general tend to make people "insane" in unpredictable ways. (See here for a similar concern I once voiced around psychedelics.) And even a bunch of seemingly non-extreme experiences (like reading the news, going on social media, or being exposed to various social environments like cults and Cultural Revolution-type dynamics) seem to have historically made a bunch of people insane and continue to make people insane. Basically, although flawed, I think we still have a bunch of humans around who are still basically sane or at least have some "grain of sanity" in them, and I think it's incredibly important to preserve that sanity. So I would probably actively discourage people from undertaking such experiments in most cases.

riceissa

Bio

Posts 11

Comments 100

Topic contributions 2
