Sharmake

An underrated success story which I find more plausible than moral circle expansion is that human concern for animal welfare (both wild and farmed animal welfare) remains low, but due to a combo of causal moral trade and acausal trade like Evidential Cooperation in Large Worlds, animals/animal advocates across the multiverse get most of what they want because it's cheap and easy to coordinate.

In general, I notice that trade-based futures aren't shown, when I tend to think they are by far the most likely way most beings get most of what they want almost regardless of values, if we can somehow prevent threats/blackmail from eating into expected value.

The best cause will disappoint you: An intro to the optimisers curse

Sharmake 3mo 4

While I agree that the optimizer's curse is a problem, and one that is relevant for certain sectors of EA, I will also say that given the very high variance in expected impact between causes, this is much less of a problem than other problems in EA epistemics, which is why it hasn't received much attention.

That said, you do note some very interesting things about the optimizer's curse, so the post is valuable beyond restating the problem. I will give credit where it's due: it's a nice incremental improvement.
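As a rough illustration of the point about variance between causes, here is a minimal simulation sketch. It assumes (none of this comes from the original post) that true cause values and estimation noise are both normally distributed; the function name `mean_disappointment` and every parameter value are hypothetical. It measures how far the top-ranked cause's estimate overshoots its true value when the spread of true values is small versus large relative to the noise.

```python
import random
import statistics


def mean_disappointment(true_spread, noise_sd, n_causes=10, trials=5000, seed=0):
    """Average (estimate - true value) for the cause with the highest estimate.

    Assumption: true values ~ Normal(0, true_spread) and
    estimation noise ~ Normal(0, noise_sd). Both are illustrative choices.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    gaps = []
    for _ in range(trials):
        true_values = [rng.gauss(0.0, true_spread) for _ in range(n_causes)]
        estimates = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        # Pick the cause that looks best on paper.
        best = max(range(n_causes), key=lambda i: estimates[i])
        gaps.append(estimates[best] - true_values[best])
    return statistics.mean(gaps)


# Causes that are nearly identical in true value: selection is driven by noise.
low_spread = mean_disappointment(true_spread=0.1, noise_sd=1.0)
# Causes that differ a lot in true value: selection is driven by real differences.
high_spread = mean_disappointment(true_spread=10.0, noise_sd=1.0)

print(f"overshoot, low spread between causes:  {low_spread:.2f}")
print(f"overshoot, high spread between causes: {high_spread:.2f}")
```

Under these assumptions the overshoot should shrink as the spread between causes grows relative to the estimation noise, which is one way to read the claim that high variance between causes makes the curse less pressing.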

Evidence that Recent AI Gains are Mostly from Inference-Scaling

Sharmake 3mo* 3

To a large extent, I agree that RL scaling is basically just inference scaling, but I disagree immensely with the claim quoted below, which gives me different expectations of AI progress over the next 4-6 years (though I agree that in the longer term, absent new paradigms, inference scaling will be more important and AI progress will slow back down to the prior compute trend of 1.55x efficiency per year, rather than getting 3-4x more compute every year):

> In the last year or two, the most important trend in modern AI came to an end. The scaling-up of computational resources used to train ever-larger AI models through next-token prediction (pre-training) stalled out.

Vladimir Nesov explains why here in more detail, but the issue is that the scaling laws were already fairly weak (probably closer to logarithmic returns than linear returns), and the compute increase from GPT-4 to GPT-4.5 was much closer to 10x than 100x. So it's not surprising that people were disappointed in AI progress, since GPT-3 to GPT-4 type progress required 100x compute, which will only come online between 2028 and 2030. We therefore have little evidence that returns have recently gotten worse, especially in a way that suggests that pre-training has stalled.

I think this post is much better viewed as evidence that pre-training isn't dead, it’s just resting, and that RL will in the near-term account for way less AI progress than pre-training, and that the big scale up of RLVR in 2025-2027 is much more of a one-time boost than a second trend that can progress independently of pre-training.
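To make the growth rates mentioned in this comment concrete, here is a small arithmetic sketch. It takes the 1.55x, 3x, and 4x yearly multipliers and the 100x jump from the text, and assumes (as a simplification that the comment does not state) that they can all be treated as comparable yearly multipliers on effective compute. The helper name `years_to_multiply` is hypothetical.

```python
import math


def years_to_multiply(target_multiple, growth_per_year):
    """Years of compounding at `growth_per_year` needed to reach `target_multiple`."""
    return math.log(target_multiple) / math.log(growth_per_year)


# Rates taken from the comment; treating them as interchangeable is an assumption.
for rate in (1.55, 3.0, 4.0):
    years = years_to_multiply(100, rate)
    print(f"{rate}x per year -> about {years:.1f} years for a 100x increase")
```

On this simplified reading, a 100x jump takes roughly a decade at 1.55x per year but only three to four years at 3-4x per year.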

Will we get automated alignment research before an AI Takeoff?

Sharmake 4mo 2

For what it's worth, I think pre-training alone is probably enough to get us to about 1-3 month time horizons based on a 7-month doubling time, but pre-training data will start to run out in the early 2030s, meaning that you no longer (in the absence of other benchmarks) have very good general proxies for capabilities improvements.
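The doubling-time arithmetic behind an estimate like this can be sketched as follows. Only the 7-month doubling time comes from the comment. The starting horizon of 2 hours and the conversion of one working month to about 173 hours are hypothetical placeholders, and `months_to_reach` is an illustrative name.

```python
import math

DOUBLING_TIME_MONTHS = 7.0  # from the comment


def months_to_reach(start_horizon_hours, target_horizon_hours,
                    doubling_time_months=DOUBLING_TIME_MONTHS):
    """Calendar months for the time horizon to grow from start to target."""
    doublings = math.log2(target_horizon_hours / start_horizon_hours)
    return doublings * doubling_time_months


# Hypothetical inputs, not taken from the comment.
START_HOURS = 2.0                     # assumed current time horizon
WORK_MONTH_HOURS = 40.0 * 52.0 / 12.0  # assumed hours in one working month

for target_months in (1, 3):
    needed = months_to_reach(START_HOURS, target_months * WORK_MONTH_HOURS)
    print(f"{target_months}-month horizon: about {needed:.0f} calendar months")
```

The result is sensitive to the assumed starting horizon, so the printed figures are only as good as that placeholder.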

The real issue isn't the difference between hours-long and months-long tasks, but the difference between months-long tasks and century-long tasks, which Steve Newman describes well here.

New 80k problem profile: extreme power concentration

Sharmake 5mo 4

Nice write-up on the issue.

One thing I will say is that I'm maybe unusually optimistic on power concentration compared to a lot of EAs/LWers, and the main divergence I have is that I basically treat this counter-argument as decisive enough to make me think the case for power-concentration risk doesn't go through, even in scenarios where humanity is basically as careless as possible.

This is due to evidence on human utility functions showing that most people's returns on exclusive goods for personal use diminish fast enough that, on stellar/galaxy-wide scales, altruism matters much more than their selfish desires. It is combined with my being a relatively big believer that quite a few risks, like suffering risks, which most humans are apathetic about, are very cheap to solve via moral trade.

More generally, I've become mostly convinced of the idea that a crucial positive consideration on any post-AGI/ASI future is that it's really, really easy to prevent most of the worst things that can happen in those futures under a broad array of values, even if moral objectivism/moral realism is false and there isn't much convergence on values amongst the broad population.

Donation Election Discussion Thread

Sharmake 6mo 8

The main reason I voted for Forethought and MATS is that I believe AI governance/safety is unusually important, with only Farmed/Wild animal welfare being competitive in terms of EV. I also believe that AI has a reasonable chance of being so powerful that it makes other cause areas' assumptions irrelevant, meaning their impact is much, much less predictable without considering AI governance/safety.

Eric Neyman's Quick takes

Sharmake 6mo 4

One of the key issues with "making the future go well" interventions is that we start to run up against the reality that what is a desirable outcome for the future is so variable between different humans that the concept of making the future go well requires buying into ethical assumptions that people won't share, meaning that it's much less valid as any sort of absolute metric to coordinate around:

(A quote from Steven Byrnes here):

> When people make statements that implicitly treat "the value of the future" as being well-defined, e.g. statements like “I define ‘strong utopia’ as: at least 95% of the future’s potential value is realized”, I’m concerned that these statements are less meaningful than they sound.

There is less of this variability when it comes to preventing bad outcomes, especially outcomes in which we don't die (though there is still variability here), because of instrumental convergence. While there are moral views where dying/suffering isn't so bad, these views aren't held by many human beings (in part due to selection effects), so there's less of a chance of conflict with other agents.

The other reason is that humans mostly value the same scarce instrumental goods, but in a world where AI goes well, basically everything but status/identity becomes abundant, and this surfaces the latent moral disagreements far more than our current world does.

How to make the future better (other than by reducing extinction risk)

Sharmake 6mo* 3

I'm commenting late, but I don't think the better futures perspective gets us back to intuitive/normie ethical views, because what counts as a better future varies far more across values than preventing catastrophic outcomes does (I'm making an empirical claim that human values converge more on the things people want to avoid than on the things they want to seek out/see as positive). The other issue is that, to a large extent, AGI/ASI in the medium/long term is very totalizing in its effects, meaning that basically the only thing that matters is getting an ASI that is friendly to you. Promoting peace/democracy thus doesn't matter, while good governance can actually matter (though it'd have to be way more specific than what Will MacAskill defines as good governance).

Yarrow's Quick takes

Sharmake 7mo 5

An example here is this quote, which strays dangerously close to "these people have morality that you find to be offensive, therefore they are wrong on the actual facts of the matter" (otherwise you would make the Nazi source allegations less central to your criticism here):

(To be clear, I don't hold the moral views expressed in the quote.)

> It has never stopped shocking and disgusting me that the EA Forum is a place where someone can write a post arguing that Black Africans need Western-funded programs to edit their genomes to increase their intelligence in order to overcome global poverty and can cite overtly racist and white supremacist sources to support this argument (even a source with significant connections to the 1930s and 1940s Nazi Party in Germany and the American Nazi Party, a neo-Nazi party) and that post can receive a significant amount of approval and defense from people in EA, even after the thin disguise over top of the racism is removed by perceptive readers. That is such a bonkers thing and such a morally repugnant thing, I keep struggling to find words to express my exasperation and disbelief. Effective altruism as a movement probably deserves to fail for that, if it can't correct it.

Yarrow's Quick takes

Sharmake 7mo 14

Another issue, and the reason the comment is getting downvoted heavily (including by me), is that you seem to conflate is and ought in this post, and without that conflation, this post would not exist.

You routinely leap from "a person has moral views that are offensive to you" to "they are wrong about the facts of the matter", and your evidence for this is paper thin at best.

Being able to separate moral views from beliefs on factual claims is one of the things that is expected if you are in EA/LW spaces.

This is not mutually exclusive with the issues CB has found.

Sharmake

Posts 14

Comments 358

Topic contributions 2