
Background: I'm an undergraduate CS major. Recently, I've mentioned to my mom that I've been getting involved in the "effective altruism" community, and I've been expressing an increased interest in getting a PhD. The other day, my mom asked me why exactly I wanted a PhD.

Me: Well, I want to help others as much as possible.

Mom: Okay, how are you going to help people with a PhD?

Me: Well, I don't know... maybe try to reduce existential risks...

Mom: Whoa, existential risks?

Me: Uh, I don't know, I mean, maybe it wouldn't be that bad, but it seems likely that AI will be very important in the future. And if AI has good goals that match up with the goals of humans, they could solve lots of the world's problems, so I really want to increase the odds of that happening.

Mom: So what's going to happen if AIs don't have good goals?

Me: Well, I guess... they could kill off humanity?

Mom: Whoa!

Fortunately, we moved on in the conversation at this point, but I don't think I gave her the best first impression of these ideas. Does anyone know of any good articles or videos for a popular audience that present the AI alignment problem in moderate depth, without too much sensationalism? I'm sure there are people who would do a much better job than me at explaining these concepts to my mom. Similarly, content on EA concepts in general would be helpful.

It's most important to me to convince my mom that what I'm doing is worthwhile, but I also want to be able to talk about my career plans with non-EAs without them thinking I've joined a Doomsday cult. For people working in existential risk and other "weird" areas - how do you usually talk about your work when it comes up in conversation?


5 Answers

Explaining AI x-risk directly will excite about 20% of people and freak out the other 80%. That's fine if you want to be a public intellectual or chat with people within EA, but not for interacting with most family and friends, moving about in academia, etc. The standard approach for the latter is to say you're working on researching safe and fair AI, with shorter-term risks and longer-term catastrophes as particular examples.

This is not exactly the answer you're looking for, and I'm not confident about it, but I think it may be good to first refine your reasons for working on AI risk and get clear on what you actually mean. Once you have a good sense of your own position (at least enough to convince a much more skeptical version of yourself), a more easily explainable version of the arguments may come naturally to you.

(Take everything I say here with a huge lump of salt...FWIW I don't know how to explain EA or longtermism or forecasting stuff to my own parents, partially due to the language barrier). 

Brian Christian is incredibly good at tying the short-term concerns everyone already knows about to the long-term concerns. He's done tons of talks and podcasts - not sure which is best, but if 3 hours of heavy content isn't a problem, the 80k one is good.

There's already a completely mainstream x-risk: nuclear weapons (and, popularly, climate change). It could be good to compare AI to these accepted handles. The second species argument can be made pretty intuitive too.

Bonus: here's what I told my mum.

AIs are getting better quite fast, and we will probably eventually get a really powerful one, much faster and better at solving problems than people. It seems really important to make sure that it shares our values; otherwise, it might do crazy things that we won't be able to fix. We don't know exactly how hard it will be to give AIs our actual values, and to make sure they got them right, but it seems very hard. So it's important to start now, even though we don't know when it will happen or how dangerous it will be.

Relatable situation. For a short AI risk introduction for moms, I think I would suggest Robert Miles' YouTube channel.

Not sure how good the Robert Miles channel is for mums specifically (mine might not be particularly interested in it!), but for communicating about AI risk in general, Robert Miles is good, and I second this recommendation.

Perhaps try explaining by analogy, or providing examples of ways we’re already messing up.

Like the YouTube algorithm. It only maximizes the amount of time people spend on the platform, because (charitably) Google thought that'd be a useful proxy for the quality of the content it provides. But instead, it ended up figuring out that if it showed people videos which convinced them of extreme political ideologies, it would then be easier to find videos which would make them angry/happy/sad/feel other addictive emotions that would keep them on the platform.

This particular problem has since been fixed, but it took quite a while to figure out what was going on, and more time to figure out how to fix it. Maybe use analogies of genies who, if you imperfectly specify your wish, will find some way to technically satisfy it, but screw you over in the process.

One thing which stops me from explaining things well to my parents is the fear of looking weird. That usually doesn't stop me (to a fault) when talking with anyone else, but somehow it does with my parents. You can avert this via ye-olde Appeal to Authority. Tell them the idea was popularized, in part, by Professor Stuart Russell, author of the world's foremost textbook on artificial intelligence, in his book Human Compatible; he currently runs the Center for Human-Compatible AI (CHAI) at Berkeley to tackle this very problem.

edit: Also, be sure to note it's not just CHAI working on this problem. There are also MIRI, DeepMind, Anthropic, and other organizations.
