Oliver Sourbut

Technical staff (Autonomous Systems) @ UK AI Safety Institute (AISI)
454 karma · Working (6-15 years) · Pursuing a doctoral degree (e.g. PhD) · London, UK
www.oliversourbut.net

Bio


  • Autonomous Systems @ UK AI Safety Institute (AISI)
  • DPhil AI Safety @ Oxford (Hertford college, CS dept, AIMS CDT)
  • Former senior data scientist and software engineer + SERI MATS

I'm particularly interested in sustainable collaboration and the long-term future of value. I'd love to contribute to a safer and more prosperous future with AI! Always interested in discussions about axiology, x-risks, s-risks.

I enjoy meeting new perspectives and growing my understanding of the world and the people in it. I also love to read - let me know your suggestions! In no particular order, here are some I've enjoyed recently:

  • Ord - The Precipice
  • Pearl - The Book of Why
  • Bostrom - Superintelligence
  • McCall Smith - The No. 1 Ladies' Detective Agency (and series)
  • Melville - Moby-Dick
  • Abelson & Sussman - Structure and Interpretation of Computer Programs
  • Stross - Accelerando
  • Simsion - The Rosie Project (and trilogy)

Cooperative gaming is a relatively recent but fruitful interest for me. Here are some of my favourites:

  • Hanabi (can't recommend enough; try it out!)
  • Pandemic (ironic at time of writing...)
  • Dungeons and Dragons (I DM a bit and it keeps me on my creative toes)
  • Overcooked (my partner and I enjoy the foodie themes and frantic real-time coordination playing this)

People who've got to know me only recently are sometimes surprised to learn that I'm a pretty handy trumpeter and hornist.

Comments

Basically +1 here. I guess some relevant considerations are the extent to which a tool can act as an antidote to its own (or related) misuse - and under what conditions of effort, attention, compute, etc. If that can be arranged, then 'simply' making sure that access is somewhat distributed is a help. On the other hand, it's conceivable that compute advantages or structural advantages could make misuse of a given tech harder to block, in which case we'd want to know that (without, perhaps, broadcasting it indiscriminately) and develop responses. Plausibly those dynamics might change nonlinearly with the introduction of epistemic/coordination tech of other kinds at different times.

In theory, it's often cheaper and easier to verify the properties of a proposal ('does it concentrate power?') than to generate one satisfying given properties, which gives an advantage to a defender if proposals and activity are mostly visible. But subtlety and obfuscation and misdirection can mean that knowing what properties to check for is itself a difficult task, tilting the other way.
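To illustrate the general shape of that verify-vs-generate asymmetry with a toy example of my own choosing (subset-sum, nothing to do with the governance case itself): checking a proposed solution is cheap, while generating one by brute force blows up exponentially in the worst case.

```python
import itertools

# Toy illustration of the verify-vs-generate asymmetry, using subset-sum.
# Verifying a proposed subset is O(n); brute-force generation is O(2^n).

def verify(numbers, subset, target):
    """Cheap check: does this proposed subset of indices sum to the target?"""
    return sum(numbers[i] for i in subset) == target

def generate(numbers, target):
    """Expensive search: find *some* subset of indices summing to the target."""
    for r in range(len(numbers) + 1):
        for subset in itertools.combinations(range(len(numbers)), r):
            if verify(numbers, subset, target):
                return subset
    return None

numbers = [3, 34, 4, 12, 5, 2]
print(verify(numbers, (2, 3, 5), 18))  # True - instant to check
print(generate(numbers, 18))           # (2, 3, 5) - found only by exhaustive search
```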

Likewise, narrowly facilitating coordination might produce novel collusion with substantial negative externalities on outsiders. But then ex hypothesi those outsiders have an outsized incentive to block that collusion, if only they can foresee it and coordinate in turn.

It's confusing.

A nit

lifestyle supports the planet, rather than taking from it

appeals to me, and I'm sure to some others, but (I sense) it could come across with a particular political-tribal flavour, which you might want to try neutralising. (Or not, if that'd detract from the net appeal!)

On point 1 (space colonization), I think it's hard and slow! So the same issue as with bio risks might apply: AGI doesn't get you this robustness quickly for free. See my other comment on this post.

I like your point 2 about chancy vs merely uncertain. I guess a related point is that when the 'runs' of the risks are in some way correlated, having survived once is evidence that survivability is higher. (Up to and including the fully correlated 'merely uncertain' extreme?)
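To make that concrete with a toy Bayesian sketch (the two hypothesised survival rates and the 50/50 prior are invented purely for illustration): if the per-period survival probability is fixed but unknown, each observed survival shifts credence towards the high-survivability hypothesis, so the predicted chance of surviving the next period climbs; in the fully 'chancy' case of a single known probability, it never moves.

```python
# Toy Bayesian sketch of 'merely uncertain' vs 'chancy' survival.
# Hypotheses: the per-period survival probability is either 0.6 or 0.99,
# with a 50/50 prior (numbers invented purely for illustration).

beliefs = {0.60: 0.5, 0.99: 0.5}

def update_on_survival(beliefs):
    """Condition the credences on having survived one more period."""
    unnormalised = {p: w * p for p, w in beliefs.items()}
    total = sum(unnormalised.values())
    return {p: w / total for p, w in unnormalised.items()}

def predicted_survival(beliefs):
    """Predictive probability of surviving the next period."""
    return sum(p * w for p, w in beliefs.items())

for period in range(1, 6):
    beliefs = update_on_survival(beliefs)
    print(f"after {period} observed survival(s): "
          f"P(survive next period) = {predicted_survival(beliefs):.3f}")

# The prediction climbs towards 0.99: having survived is evidence that
# survivability is high. With a single known probability ('chancy'), the
# prediction would stay constant no matter how many survivals we observe.
```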

For clarity, you're using 'important' here in something like an importance x tractability x neglectedness factoring? So yes, more important (but there might be reasons to think it's less tractable or neglected)?

I've been meaning to write something about 'revisiting the alignment strategy'. Section 5 here ('Won't AGI make post-AGI catastrophes essentially irrelevant?') makes the point very clearly:

On this view, a post-AGI world is nearly binary—utopia or extinction—leaving little room for Sisyphean scenarios.

But I think this is too optimistic about the speed and completeness of the transition to globally deployed, robustly aligned "guardian" systems.

but without making much of a case for it. I'd be interested in Will's and the reviewers' sense of the space and literature here.

Yep - for me, 'big civ setbacks are really bad' was definitely already baked in, from the POV of their setting bad context for pre-AGI transition(s) (as well as their direct badness). But while I'd already agreed with Will about post-AGI not being an 'end of history' (in the sense that much remains uncertain re safety), I hadn't thought through the implication that setbacks could force a rerun of the most perilous transition(s), which does add some extra concern.

A small aside: some put forth interplanetary civilisation as a partial defence against either total destruction or 'setback'. But reaching the milestone of having a really robustly interplanetary civ might itself take quite a long time after AGI - especially if (like me) you think digital uploading is nontrivial.

(This abstractly echoes the suggestion in this piece that bio defence might take a long time, which I agree with.)

Some gestures which didn't make the cut as they're too woolly or not quite the right shape:

  • adversarial exponentials might force exponential expense per gain
    • e.g. combatting replicators
    • e.g. brute forcing passwords
  • many empirical 'learning curve' effects appear to consume exponential observations per increment
    • Wright's Law (which is the more general cousin of Moore's Law) requires exponentially many production iterations per incremental efficiency gain
    • Deep learning scaling laws appear to consume exponential inputs per incremental gain
    • AlphaCode and AlphaZero appear to make uniform gains per runtime compute doubling
    • OpenAI's o-series 'reasoning models' appear to improve accuracy on many benchmarks with logarithmic returns to more 'test time' compute
    • (in all of these examples, there's some choice of what scale to represent 'output' on, which affects whether the gains look uniform or not, so the thesis rests on whether the choices made are 'natural' in some way; see the toy sketch below)
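A minimal numerical sketch of that 'exponential inputs per incremental gain' shape, with a made-up power-law exponent and constant rather than any published scaling-law fit:

```python
import math

# Toy power-law 'scaling law': loss(C) = a * C**(-b), constants invented.
# Each doubling of compute multiplies the loss by the same factor 2**(-b),
# so gains look uniform per doubling on a log scale - equivalently, a fixed
# additive improvement in log-loss costs exponentially more compute.

a, b = 10.0, 0.05

def loss(compute):
    return a * compute ** (-b)

for doublings in range(6):
    compute = 2 ** doublings
    print(f"compute x{compute:>2}: loss = {loss(compute):.3f}, "
          f"log-loss = {math.log(loss(compute)):.3f}")

# log-loss falls by the same amount (b * ln 2, roughly 0.035) per doubling,
# which is the sense in which the gains can look 'uniform' per compute
# doubling while the underlying input grows exponentially.
```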

This is lovely, thank you!

My main concern would be that it takes the same very approximating stance as much other writing in the area, conflating all kinds of algorithmic progress into a single scalar 'quality of the algorithms'.

You do moderately well here, noting that the most direct interpretation of your model regards speed or runtime compute efficiency, yielding 'copies that can be run' as the immediate downstream consequence (and discussing in a footnote the relationship to 'intelligence'[1] and the distinction between 'inference' and training compute).

I worry that many readers don't track those (important!) distinctions and tend to conflate the concepts. For what it's worth, keeping them distinct has led me to the (tentative) conclusion that a speed/compute-efficiency explosion is plausible (though not guaranteed), while an 'intelligence' explosion in software alone is less likely, except as a downstream effect of running faster (which might be nontrivial if pouring more effective compute into training and runtime yields meaningful gains).


  1. Of course, 'intelligence' is also very many-dimensional! I think the most important factor for takeoff in discussions like these is 'sample efficiency', since it's quite generalisable and feeds into most downstream applications of more generic 'intelligence' resources. This is relevant to R&D because sample efficiency affects how quickly you can accrue research taste, which controls the stable level of your exploration quality. Domain knowledge and taste are obviously less generalisable, and harder to get in silico alone. ↩︎
