Epistemic status: Speculative fiction. This is lightly adapted from an earlier shortform.
 

It's difficult to imagine how human epistemics and AI will play out. On one hand, AI could provide much better information and general intellect. On the other hand, AI could help people with incorrect beliefs preserve those false beliefs indefinitely. 

Will advanced AIs attempting to rationalize bad beliefs be able to outmatch AIs providing good ones?

While I think that some AI systems could do fantastic things for human epistemics, I'm also worried about lock-in scenarios where people fall into self-reinforcing cycles overseen by AIs. It's possible that a great deal of lock-in might happen in the next 30 years or so (if you believe AGI/TAI might happen soon), so this could be something to take seriously.

While it might be easy to imagine extremes on either end of this, I expect that the future will feature a mix of positives and negatives, and that future epistemic tensions will mirror previous ones.

Here's one incredibly rough outline of one potential future I could envision, as an example. This example assumes that humanity broadly gets AI alignment right.


It's 2028.

MAGA types typically use DeepReasoning-MAGA, or DR-MAGA. The far left typically uses DR-JUSTICE. People in the middle often use DR-INTELLECT, which has the biases and worldview of a somewhat normal citizen.

Some niche technical academics (the same ones who currently favor Bayesian statistics) and hedge funds use DR-BAYSIAN or DRB for short. DRB is known to have higher accuracy than the other models, but gets a lot of public hate for having controversial viewpoints. It's also fairly slow and expensive, so a poor fit for large-scale use. DRB is known to be fairly off-putting to chat with and doesn't get much promotion.

Bain and McKinsey both have their own offerings, called DR-Bain and DR-McKinsey, respectively. These are a bit like DR-BAYSIAN, but are much punchier and more confident. They're heavily marketed to managers. These tools produce really fancy graphics, and specialize in things like not leaking information, minimizing corporate decision liability, being easy for old people to use, and being customizable to represent the views of specific companies.

For a while now, some evaluations produced by intellectuals have demonstrated that DR-BAYSIAN seems to be the most accurate, but few others really care or notice this. DR-MAGA has figured out particularly great techniques to get users to distrust DR-BAYSIAN.

Betting gets weird. Rather than making specific bets on specific things, users start to make meta-bets. "I'll give money to DR-MAGA to bet on my behalf. It will then make bets with DR-BAYSIAN, which is funded by its believers."

At first, DR-BAYSIAN dominates the bets, and its advocates earn a decent amount of money. But as time passes, this discrepancy diminishes. A few things happen:

  1. All DR agents converge on beliefs over particularly near-term and precise facts.
  2. Non-competitive betting agents develop alternative worldviews in which these bets are invalid or unimportant.
  3. Non-competitive betting agents develop alternative worldviews that are exceedingly difficult to empirically test.

In many areas, items 1-3 push people to believe more in the direction of the truth. Because of (1), many short-term decisions get to be highly optimal and predictable.

But because of (2) and (3), epistemic paths diverge, and non-competitive betting agents get increasingly sophisticated at achieving epistemic lock-in with their users.

Some DR agents correctly identify the game theory dynamics of epistemic lock-in, and this kickstarts a race to gain converts. It seems like avid users of DR-MAGA are firmly locked into their views, and forecasts don't see them ever changing. But there's a decent population that isn't yet highly invested in any cluster. Money spent convincing the not-yet-sure goes much further than money spent convincing the highly dedicated, so the cluster of non-deep-believers gets highly targeted for a while. It's basically a religious race to gain the remaining agnostics.

At some point, most people (especially those with significant resources) are highly locked in to one specific reasoning agent.

After this, the future seems fairly predictable again. TAI comes, and people with resources broadly gain correspondingly more resources. People defer more and more to the AI systems, which are now in highly stable self-reinforcing feedback loops.

Coalitions of people behind each reasoning agent delegate their resources to said agents, then these agents make trades with each other. The broad strokes of what to do with the rest of the lightcone are fairly straightforward. There's a somewhat simple strategy of resource acquisition and intelligence enhancement, followed by a period of exploiting said resources. The specific exploitation strategy depends heavily on the specific reasoning agent cluster each segment of resources belongs to.


Reflecting on this, several questions come to mind.

  1. How much of an advantage will more honest/correct AI systems have in the future, when it comes to convincing people of things, particularly of things critical to epistemic lock-in?
  2. How possible is it for AI systems with strong epistemics to be unpopular? More specifically - what aspects of epistemics should we expect AI labs to optimize, and which should we expect to be overlooked or intentionally done poorly?
  3. Do we expect such an epistemic lock-in to happen around TAI? If so, this would imply that it could be worth a lot of investment to try to improve epistemics quickly.
  4. Where is the line between values and epistemics? I think that "epistemic lock-in" is a bigger deal than "value lock-in" or similar, but that's largely because I expect that epistemics change values more than values change epistemics. There's been previous discussion around effective altruism of "value lock-in," and from what I can tell, very little of "epistemic lock-in." I suspect this disparity is a mistake.
  5. What will happen regarding epistemic clusters and government? What about AI labs? There are probably a few actors here who particularly matter.

Comments (1)

Executive summary: AI-driven epistemic lock-in could lead to self-reinforcing ideological silos where individuals rely on AI systems aligned with their preexisting beliefs, potentially undermining collective rationality and entrenching competing worldviews.

Key points:

  1. AI could both enhance human epistemics and entrench false beliefs by creating tailored reasoning agents that reinforce ideological biases.
  2. Future AI ecosystems may consist of competing epistemic clusters (e.g., DR-MAGA, DR-JUSTICE, DR-BAYSIAN), each optimizing for persuasion over truth.
  3. Competitive betting dynamics may initially favor more accurate AIs but could later give way to entrenched, difficult-to-test worldviews.
  4. Epistemic lock-in may escalate as AI agents engage in a race to convert undecided individuals, making rational discourse increasingly fragmented.
  5. Over time, individuals and resource-rich entities may become permanently locked into their chosen AI reasoning systems, dictating long-term societal trajectories.
  6. Open questions include the relative advantage of honest AI, the impact of epistemic lock-in on governance, and the relationship between epistemic and value lock-in.

 

 

This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.
