EA Forum Bot Site
EA Forum

Oliver Sourbut

Technical staff (Autonomous Systems) @ UK AI Safety Institute (AISI)

393 karmaJoined Sep 2020Working (6-15 years)Pursuing a doctoral degree (e.g. PhD)London, UK

www.oliversourbut.net

Interests:

AI safetyAI alignmentAI forecastingBiosecurityAnimal welfareWild animal welfareArtificial sentience

Bio

Participation
4

Autonomous Systems @ UK AI Safety Institute (AISI)
DPhil AI Safety @ Oxford (Hertford college, CS dept, AIMS CDT)
Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently

Ord - The Precipice
Pearl - The Book of Why
Bostrom - Superintelligence
McCall Smith - The No. 1 Ladies' Detective Agency (and series)
Melville - Moby-Dick
Abelson & Sussman - Structure and Interpretation of Computer Programs
Stross - Accelerando
Graeme - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites

Hanabi (can't recommend enough; try it out!)
Pandemic (ironic at time of writing...)
Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
Overcooked (my partner and I enjoy the foody themes and frantic realtime coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Posts
6

Sorted by New

6

Better than logarithmic returns to reasoning?

· 5mo ago

69

FLF Fellowship on AI for Human Reasoning: $25-50k, 12 weeks

· 7mo ago

4

Cooperation and Alignment in Delegation Games: You Need Both!

· 1y ago

52

Careless talk on US-China AI competition? (and criticism of CAIS coverage)

· 2y ago

15

Un-unpluggability - can't we just unplug it?

· 3y ago

-3

Two tongue-in-cheek EA anthems

· 3y ago · 1m read

Comments
69

Rerunning the Time of Perils

Oliver Sourbut10d1

1

0

On point 1 (space colonization), I think it's hard and slow! So the same issue as with bio risks might apply: AGI doesn't get you this robustness quickly for free. See other comment on this post.

Rerunning the Time of Perils

Oliver Sourbut10d1

1

0

I like your point 2 about chancy vs merely uncertain. I guess a related point is that when the 'runs' of the risks are in some way correlated, having survived once is evidence that survivability is higher. (Up to an including the fully correlated 'merely uncertain' extreme?)

Rerunning the Time of Perils

Oliver Sourbut10d1

0

0

For clarity, you're using 'important' here in something like an importance x tractability x neglectedness factoring? So yes more important (but there might be reasons to think it's less tractable or neglected)?

Rerunning the Time of Perils

Oliver Sourbut10d1

0

0

I've been meaning to write something about 'revisiting the alignment strategy'. The section 5 here ('Won't AGI make post-AGI catastrophes essentially irrelevant?') makes the point very clearly:

On this view, a post-AGI world is nearly binary—utopia or extinction—leaving little room for Sisyphean scenarios.

But I think this is too optimistic about the speed and completeness of the transition to globally deployed, robustly aligned "guardian" systems.

without making much of a case for it. Interested in Will and reviewers' sense of the space and literature here.

Rerunning the Time of Perils

Oliver Sourbut10d5

1

0

Yep, definitely for me 'big civ setbacks are really bad' was already baked in from the POV of setting bad context for pre-AGI-transition(s) (as well as their direct badness). But while I'd already agreed with Will about post-AGI not being an 'end of history' (in the sense that much remains uncertain re safety), I hadn't thought through the implication that setbacks could force a rerun of the most perilous transition(s), which does add some extra concern.

Rerunning the Time of Perils

Oliver Sourbut10d1

0

0

A small aside: some put forth interplanetary civilisation as a partial defence against either of total destruction and 'setback'. But reaching the milestone of having a really robustly interplanetary civ might itself take quite a long time after AGI - especially if (like me) you think digital uploading is nontrivial.

(This abstractly echoes the suggestion in this piece that bio defence might take a long time, which I agree with.)

Better than logarithmic returns to reasoning?

Oliver Sourbut5mo1

0

0

Some gestures which didn't make the cut as they're too woolly or not quite the right shape:

adversarial exponentials might force exponential expense per gain
- e.g. combatting replicators
- e.g. brute forcing passwords
many empirical 'learning curve' effects appear to consume exponential observations per increment
- Wright's Law (which is the more general cousin of Moore's Law) requires exponentially many production iterations per incremental efficiency gain
- Deep learning scaling laws appear to consume exponential inputs per incremental gain
- AlphaCode and AlphaZero appear to make uniform gains per runtime compute doubling
- OpenAI's o-series 'reasoning models' appear to improve accuracy on many benchmarks with logarithmic returns to more 'test time' compute
- (in all of these examples, there's some choice of what scale to represent 'output' on, which affects whether the gains look uniform or not, so the thesis rests on whether the choices made are 'natural' in some way)

Estimating the Substitutability between Compute and Cognitive Labor in AI Research

Oliver Sourbut7mo5

1

0

This is lovely, thank you!

My main concern would be that it takes the same very approximating stance as much other writing in the area, conflating all kinds of algorithmic progress into a single scalar 'quality of the algorithms'.

You do moderately well here, noting that the most direct interpretation of your model regards speed or runtime compute efficiency, yielding 'copies that can be run' as the immediate downstream consequence (and discussing in a footnote the relationship to 'intelligence'^[1] and the distinction between 'inference' and training compute).

I worry that many readers don't track those (important!) distinctions and tend to conflate these concepts. For what it's worth, by distinguishing these concepts, I have come to the (tentative) conclusion that a speed/compute efficiency explosion is plausible (though not guaranteed), but an 'intelligence' explosion in software alone is less likely, except as a downstream effect of running faster (which might be nontrivial if pouring more effective compute into training and runtime yields meaningful gains).

Of course, 'intelligence' is also very many-dimensional! I think the most important factor in discussions like these regarding takeoff is 'sample efficiency', since that's quite generalisable and feeds into most downstream applications of more generic 'intelligence' resources. This is relevant to R&D because sample efficiency affects how quickly you can accrue research taste, which controls the stable level of your exploration quality. Domain-knowledge and taste are obviously less generalisable, and harder to get in silico alone. ↩︎

FLF Fellowship on AI for Human Reasoning: $25-50k, 12 weeks

Oliver Sourbut7mo1

0

0

Glad to hear it! Any particular thoughts or suggestions? (Consider applying, or telling colleagues and friends you think would be a good fit!)

What if we just…didn’t build AGI? An Argument Against Inevitability

Oliver Sourbut7mo1

0

0

On this note, the Future of Life Foundation (headed by Anthony Aguirre, mentioned in this post) is today launching a fellowship on AI for Human Reasoning.

Why? Whether you expect gradual or sudden AI takeoff, and whether you're afraid of gradual or acute catastrophes, it really matters how well-informed, clear-headed, and free from coordination failures we are navigating into and through AI transitions. Just the occasion for human reasoning uplift!

12 weeks, $25-50k stipend, mentorship, and potential pathways to future funding and impact. Applications close June 9th.