This post was inspired by G. Gigerenzer's book Gut Feelings: The Intelligence of the Unconscious. I'm drawing an analogy that I think tells us something about how science works, at least as far as I understand the history of science. It also seems to resonate with the reflections of current alignment researchers; see, for example, Richard's post Intuitions about solving hard problems.

 

Imagine that you ask a professional baseball player, let's call him Tom, how he catches a fly ball. He takes a moment to respond, realizing he's never reflected on this question before, and ends up staring at you, not knowing what exactly there is to explain. He says he's never thought about it. Now imagine that you ask his baseball coach what the best way to catch the ball is. The coach has a whole theory about it. In fact, he insists that Tom and everyone on the team should follow one specific technique that he thinks is optimal. Tom and the rest of the team go ahead and do what the coach says (because they don't want to get yelled at, even though they've been doing fine so far). And lo and behold, the team misses the ball more often than before.

What could have gone wrong?

Richard Dawkins, in The Selfish Gene, gives the following explanation:

"When a man throws a ball high in the air and catches it again, he behaves as if he had solved a set of differential equations in predicting the trajectory of the ball. He may neither know nor care what a differential equation is, but this does not affect his skill with the ball. At some subconscious level, something functionally equivalent to the mathematical calculations is going on."

So here's the analogy I want to argue for: when you do science, by which I mean when you set up your scientific agenda and organize your research, construct your theoretical apparatus, design experiments, interpret results, and so on, you behave as if you had solved a set of differential equations predicting the trajectory of your research. You may neither know nor care what this process is, but this does not affect your skill with the setup of your project. At some subconscious level, something functionally equivalent to the mathematical calculations is going on. (To be clear, I don't mean the calculations that are explicitly part of solving the problem.)

Well, I just paraphrased Dawkins' passage to make a point: what we mean by "gut feelings" or "intuitions" is not magic, but rather a cognitive mechanism. This applies to a variety of tasks, from baseball to thinking about agent foundations. It's a cognitive procedure that might even ultimately be explained in mathematical terms, as part of an account of how the brain works.

Up to this point, I've drawn the analogy with the practice of science in general in mind. It's worth asking what the analogy could suggest for alignment research, granting that the field is still at a pre-paradigmatic stage.

My analogy predicts that one or more insights about solving the alignment problem will come the way the experienced baseball player catches the ball before following his coach's instructions: as a natural movement guided by his gut feelings. And it will be a successful movement. Sure, that requires unpacking, but I think Gigerenzer offers a satisfactory account in his book. [1]

Meanwhile, in the contemporary study of the history of science (especially HPS approaches), we talk about myths in science and then offer arguments that debunk them. One of the most famous myths, widespread in popular culture, is that an apple hit Newton's head, which inspired him to formulate the law of universal gravitation.

While this episode is primarily a cute myth about scientific discovery, robustly incorporated into the historiographical package of the hero-scientist ideal, it reveals a truth about how scientists think, or at least about what we have noticed and prima facie believe about how they think. It encapsulates the sense of not being able to explain exactly how you got your insight as a scientist, just as the baseball player is clueless about what makes him catch the ball so perfectly. So, if this myth is treated as a metaphor, it tells us that the a-ha moment (also known as the "eureka effect") in scientific research might come when you least expect it; a falling apple might just trigger it.

The scientist might try to rationalize the process in hindsight, trace back the steps in his reasoning and find clever ways to explain how he got to his insight. But it's plausible to believe that there's a whole cognitive mechanism of heuristics and gut feelings that made a product of intuitive reasoning look like the most natural next move. 

As far as alignment is concerned, I think some researchers have already ordered and received their apples (okay, perhaps smaller-sized apples). And if this continues and we're still alive, the one apple that's going to change the flow of the most important century will make its appearance.


 

Who ordered the apple?
  1. Gigerenzer talks about this in a chapter called "Winning without thinking". He brings up heuristics, especially the "gaze heuristic", and explains how this works in the baseball example. Just in case there's confusion, I'm not conflating insight and intuition; I think that insight is generated through following one's intuitions, i.e., it's a product of intuitive reasoning. So, in that sense, it's plausible to suppose that heuristics contribute to the generation of insights.
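To make the gaze heuristic a bit more concrete, here is a rough sketch of how I understand it: fix your gaze on the ball and adjust your running so that the angle of gaze stays constant. The code is my own toy illustration, not something from Gigerenzer's book. It assumes a drag-free ball, a fielder who starts tracking at the ball's highest point, and no limit on running speed; the names `ball_position` and `fielder_track` are made up for this example. The point is only geometric: a fielder who keeps the gaze angle fixed ends up where the ball comes down, and the rule itself never asks where that is.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def ball_position(t, vx, vy):
    """Position (x, height) at time t of a drag-free ball launched from the origin."""
    return vx * t, vy * t - 0.5 * G * t * t


def fielder_track(vx, vy, start_x, dt=0.01):
    """Return (time, fielder_x) pairs for a fielder holding the gaze angle constant.

    Assumptions (all illustrative): the fielder stands at start_x, beyond the
    ball's apex, starts tracking when the ball is at its highest point, and can
    run as fast as the rule demands.
    """
    t_apex = vy / G
    t_land = 2 * vy / G

    # Gaze angle at the moment tracking starts, stored as its tangent.
    bx, by = ball_position(t_apex, vx, vy)
    tan_gaze = by / (start_x - bx)

    track = []
    t = t_apex
    while t < t_land:
        bx, by = ball_position(t, vx, vy)
        # Constant gaze angle means height / horizontal distance stays fixed,
        # so the fielder must be at ball_x + height / tan(angle).
        track.append((t, bx + by / tan_gaze))
        t += dt

    # Final sample at the instant the ball reaches the ground.
    bx, by = ball_position(t_land, vx, vy)
    track.append((t_land, bx + by / tan_gaze))
    return track


if __name__ == "__main__":
    track = fielder_track(vx=15.0, vy=20.0, start_x=70.0)
    t_end, x_end = track[-1]
    gaze_deg = math.degrees(math.atan2(20.0 ** 2 / (2 * G), 70.0 - 15.0 * 20.0 / G))
    print(f"gaze angle held at about {gaze_deg:.1f} degrees")
    print(f"fielder is at x = {x_end:.2f} m when the ball lands at t = {t_end:.2f} s")
```

The sketch does cheat in one respect: it computes the fielder's position from the ball's coordinates, whereas a real player would only sense the angle and speed up or slow down accordingly. I take that to be the part Tom can do without being able to explain it.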
