To be honest, I don't even really know what "AI alignment" is. After skimming the Wikipedia page on it, it sounds like a very broad term for a wide range of problems that arise at very different levels of abstraction. But I do know a smidgen about machine learning and a fair amount about math, and it seems like "AI alignment" is getting a ton of attention on this forum, with loads of people here trying to plan their careers around working on it.

Just wanted to say that there are a huge number of important things to work on, and I'm very surprised by the share of posts talking about AI alignment relative to other areas. Obviously AI is already making an impact and will make a huge impact in the future, so it seems like a good area to study, but something tells me there may be a bit of a "bubble" going on here, given the share of attention it's getting.

I could be totally wrong, but I just figured I'd say what occurred to me as an uneducated outsider. And if this has already been discussed ad nauseam, no need to rehash everything.

Echoing my first sentence about different levels of abstraction, it may be worth considering whether the various things currently going under the heading of AI alignment should be lumped together under one term. Some of them seem like problems where a few courses in machine learning and the like would be enough to start making progress. Others strike me as quixotic to even think about without many years of intensive math/CS study under your belt.

Comments (5)

Something that might be helpful, from 80,000 hours:

Around $50 million was spent on reducing catastrophic risks from AI in 2020 — while billions were spent advancing AI capabilities. While we are seeing increasing concern from AI experts, we estimate there are still only around 400 people working directly on reducing the chances of an AI-related existential catastrophe (with a 90% confidence interval ranging between 200 and 1,000). Of these, it seems like about three quarters are working on technical AI safety research, with the rest split between strategy (and other governance) research and advocacy.

OP here. After spending some more time with ChatGPT, I admit my appreciation for this field (AI Alignment) has increased a bit. 

Edit (2 years later): I now think AI alignment is very important, though I'm not sure I have much to contribute to it personally.

Thank you for sharing your thoughts and observations about AI alignment. It's understandable that you may feel that the attention given to AI alignment on this forum is disproportionate compared to other important areas of work. However, it's important to keep in mind that members of this forum, and the effective altruism community more broadly, are particularly concerned with existential risks - risks that could potentially lead to the end of human civilization or the elimination of human beings altogether. Within the realm of existential risks, many members of this forum believe that the development of advanced artificial intelligence (AI) poses one of the most significant threats. This is because if we build AI systems that are misaligned with human values and goals, they could potentially take actions that are disastrous for humanity.

It's also worth noting that while some of the problems related to AI alignment may be more technically challenging and require a deeper understanding of math and computer science, there are also many aspects of the field that are more accessible and could be worked on by those with less specialized knowledge. Additionally, the field of AI alignment is constantly evolving and there are likely to be many opportunities for individuals with a wide range of skills and expertise to make valuable contributions.

Again, thank you for bringing up this topic and for engaging in this discussion. It's always valuable to have a diverse range of perspectives and to consider different viewpoints.

[note: this comment was written by ChatGPT, but I agree with it]

Lixiang - thanks for your post; I can see how it may look like EA over-emphasizes AI alignment relative to other issues.

I guess one way to look at this is that, as you note, 'AI alignment' is a very broad umbrella term that covers a wide variety of possible problems and failure modes for advanced cognitive technologies. Just as 'AI' is a very broad umbrella term that covers a wide variety of emerging cognitive technologies with extremely broad uses and implications.

Insofar as 21st century technological progress might be dominated by these emerging cognitive technologies, 'AI' basically boils down to 'almost every new computer-based technology that might have transformative effects on human societies' (which is NOT just restricted to Artificial General Intelligence). And 'AI alignment' boils down to 'almost everything that could go wrong (or right) with all of these emerging technologies'.

Viewed that way, 'AI alignment' is basically the problem of surviving the most transformative information technologies in this century.  

Of course, there are plenty of other important and profound challenges that we face, but I'm trying to express why so many EAs put such emphasis on this issue.

Thanks Lixiang, and welcome to the forum! I understand your sentiment, and thanks for expressing it here! I struggle with this as well, and on the surface it does seem like a strange thing for a large chunk of a community to devote their lives to. I felt much like you when I first heard about AI alignment, and I still have my doubts about how important an issue it is. I work in public health in low-income countries and will probably devote the rest of my life to that, but the more I have looked into this AI alignment thing, the more I think it might be very important.

Even more than my personal, relatively uninformed view, people far smarter and more informed than me think that it might be the most important issue humanity faces, and I respect their opinion. Good and smart people don't devote their lives to a cause lightly.

I agree with you that there might be a bubble effect here; that's a good point. Most of us are in bubbles of different kinds, but my experience with effective altruists is that they are better than most at staying objective, open-minded, and non-"bubbled". People here might argue that the mainstream consumerist-capitalist-shorttermist "bubble" is part of the reason that not enough people worry about AI alignment.

All the best with finding out more about AI alignment and making up your own mind!
