"Develop Anthropomorphic AGI to Save Humanity from Itself" (Future Fund AI Worldview Prize submission)

ketanrama; William_MacAskill; leopold; Nick_Beckstead; ab

Comments (6)


I find it a bit irritating and slightly misleading that this post lists several authors (some of them very famous in EA) who did not actually write the submission. May I suggest listing only one account (e.g. ketanrama) as the author of the post?

Greg_Colbourn ⏸️

Yes, maybe a better option would be to have a separate account, "Future Fund AI Worldview Prize submissions". Or even create an account for the author that they can later claim if they wish (but make it clear in the bio, and at the top of the post, that it is a placeholder account in the meantime).

harfe

I find this submission very low on detail in the places that matter, namely the anthropomorphic AGI itself. It is not clear how this could be built, or why it is more realistic that such an AGI gets built than other AGIs.

"... and educated and reared much like a human child, in a caring and supportive environment."

What would this look like? Why would the AGI respond to this like a well-behaved human child?

"Its value system would be, like that of humans, dynamic, high dimensional, and to some degree ineffable."

Would it have inconsistent values? How do you know there won't be any mesa-optimization?

Steven Byrnes

I discuss this area in general, and one of David Jilk’s papers in particular, in my post Two paths forward: “Controlled AGI” and “Social-instinct AGI”.

In short, it seems to me that if you buy into this post, then the next step should be to figure out how human social instincts work, not just qualitatively but in enough detail to write it into AGI source code.

I claim that this is an open problem, involving things like circuits in the hypothalamus and neuropeptide receptors in the striatum. And it’s the main thing that I’m working on myself.

Additionally, there are several very good reasons to work on the human social instincts problem, even if you don’t buy into other parts of David Jilk’s assertions here.

Moreover, figuring out human social instincts is (I claim) (at least mostly) orthogonal to work that accelerates AGI timelines, and therefore we should all be able to rally around it as a good idea.

Whether we should also try to accelerate anthropomorphic AGI timelines, e.g. by studying the learning algorithms in the neocortex, is bound to be a much more divisive question. I claim that on balance, it’s mostly a very bad idea, with certain exceptions including closed (and not-intended-to-be-published) research projects by safety/alignment-concerned people. [I’m stating this opinion without justifying it.]

Donald Hobson

The problems with "anthropomorphic AI" approaches are:

The human mind is complicated and poorly understood.
Safety degrades fast with respect to errors.

Let's say you are fairly successful. You produce an AI that is really, really close to the human mind in the space of all possible minds: a mind that wouldn't be particularly out of place at a mental institution. It can produce paranoid ravings about the shapeshifting lizard conspiracy millions of times faster than any biological human.

OK, so you make it a bit smarter. The paranoid conspiracies get more complicated and somewhat more plausible. But at some point, it is sane enough to attempt AI research and produce useful results. Its alignment plan is totally insane.

In order to be useful, an anthropomorphic AI approach needs not only to produce AI that thinks similarly to humans. It also needs to be able to target the more rational, smart, and ethical portion of mind space.

Humans can chuck the odd insane person out of the AI labs. Sane people are more common and tend to think faster. A team of humans can stop any one of their number crowning themselves as world king.

In reality, I think your anthropomorphic AI approach produces an AI that is arguably kind of humanlike in some ways and that takes over the world, because it didn't resemble the right parts of the right humans in the right ways closely enough in the places where it matters.

Peter

I have thought a few times that maybe a safer route to AGI would be to learn as much as we can about the most moral and trustworthy humans we can find and try to build on that foundation/architecture. I'm not sure how that would work with existing convenient methods of machine learning.
