
This post experiments with a new format, meant to convey a lot of ideas quickly without going into much detail on each.

Things I wish people in AI Safety would stop talking about

A list of topics that people concerned about x-risk from AI spend, in my opinion, far too much time discussing with those outside the community. It's not that these things aren't real; they just likely won't end up mattering that much.

 

How an AI could persuade you to let it out of the box

WRONG!

Keeping AIs in boxes was never a thing companies were seriously going to do. An AI in a box isn’t useful. These aren’t academics carefully studying a new species. This is an industry, with everyone trying to get ahead, get consumer feedback, and training on parallel cloud computing. 

 

How an AI could become an agent

WRONG!

Agency is the obvious next step that people will try to build into their AIs. An AI "tool" is simply inferior in almost every possible way to an agent. With an agent, you don't need to specify a special prompt, constantly click "Approve Plan", or meet any of the supervised, time-consuming requirements of mere tools. The economic advantage is simply too staggering for people not to do this. Everyone can agree that it's dangerous, a totally bad idea, and still have to do it anyway if they want to (economically) survive.


How an AI could get ahold of, or create, weapons

WRONG!

The military advantages of fully autonomous weapons are just too great for any large-scale government to pass up, especially for democracies where losing troops abroad results in massive political backlash. Humans using a remote controller is just too slow a process, because it still requires humans to make very fast, split-second decisions. Fully autonomous warfare would mean tactical decisions occurring faster than any human could possibly make them. Look at how AlphaZero played millions of games against itself in 70 hours. AIs can make decisions faster, and that is all that will matter.

 

How an AI might Recursively Self-Improve without humans noticing

WRONG!

RSI is an ace in the hole for any company or government. They will try to do this. As AI continues to advance faster and the stakes get higher, the paranoia that someone else will achieve it first will drive players to compete to create RSI. It's the gift that keeps on giving: you don't just get momentarily ahead of your competition, you get to stay ahead, moving so fast that no one else can hope to keep up. Everyone will want to do this, even knowing it's dangerous, because the potential gains are too great.

 

Why a specific AI will want to kill you

That most AI systems, scaled to superintelligence, might want you dead doesn't mean this is a hill to fight hard on. At the end of the day, even if most don't want you dead, it doesn't matter. All you need is just one superintelligence to want you dead, and then you get dead. If someone's idea of a "safe" superintelligence doesn't specify how it deals with all potential future intelligences, then inevitably someone designs an AI that kills everyone. It's an end state to the game. Unless an AI kills everyone or somehow prevents other AIs from developing, the game continues.


 

Comments



Prometheus -- all of these topics might sound hackneyed, threadbare, and over-discussed to people who have spent years within the EA/LessWrong communities.

But to most normal folks (the 'general public') who are encountering discussions of AI risk for the first time, all of these remain highly relevant issues to think through. It's not at all obvious to most people why companies would seek AIs with agency, or AIs capable of recursive self-improvement, or AIs that can make lethal autonomous decisions in military conflicts.

IMHO, we have a serious public duty and moral obligation to communicate as clearly as possible everything we've learned over the last couple of decades in discussing AI X-risk with each other. Most people haven't been privy to those discussions. They deserve to be able to follow our concepts, values, and reasoning on all these topics.

I generally agree, regarding the public at large. I'm speaking mostly from the experience of people in the AIS community talking with people working in AI or some related field, and I've found that many often get stuck debating these concepts. The general public seems to get more hung up on concepts like consciousness, sentience, etc. (in my experience).

This is a great (and very recent!) post from DYNOMIGHT on the last point.
