Hi, I've been asked to recommend a couple of short introductions/overviews about the key issues in AI safety and AI alignment. This will be for the Philosophy, Politics, & Economics (PPE) major at Oxford University, which trains some of the brightest undergrads in Britain, many of whom go on to influential positions in government and industry.

Ideal readings would be recent (e.g. 2024 onwards), short (e.g. less than 4,000 words), non-technical, vivid & engaging, and reputable (in terms of author(s) and/or outlets). 

Any suggestions would be much appreciated! 

6 Answers

Firstly, I'd note that OAISI or @James Lester (OAISI Prez) might be able to provide better resources or Oxford links, if you've not spoken to them yet!

Here's a fairly long list of (what I think are) good options. Note that they're all more blog post than academic article. I personally think that's better, based on my experience of what I and most people engaged with more at undergrad, but that obviously depends on the Oxford culture.

Unsurprisingly, I'd mainly recommend 80,000 Hours content. Their overview case for AI risks hits the broad points, but is light on detail unless you follow the links. Their profiles on power-seeking AI, gradual disempowerment, AI misuse and power concentration are recent, broadly non-technical, engaging, and somewhat reputable. I think the first one (power-seeking AI) is the best of the four to recommend, but it's a bit longer. I've also heard excellent things about the AI in Context video, but haven't watched it myself.

If you want to max out credibility about AI risk being worth taking seriously, consider pointing to the Superintelligence Statement and FLI Open Letter.

Linch's intro is great, recent, shorter and non-technical, but doesn't come with much credibility (sorry Linch).

For something really hard hitting, Yudkowsky's Time piece has always stuck with me. I'd be careful about this one though: as a first introduction it can easily come across as 'crazy man ranting' and lead to broad dismissal of AI risk.

Finally, AISafety.info has good arguments and you can explore at your own pace, but I'm unsure how suitable it is for a reading list.

Hope this is helpful!

Hi, fellow Oxford neighbour here!

The AI Safety Atlas is an amazing resource, and just the type I think you are looking for (understandable by me, a pianist with zero STEM background beyond high school).

For your purposes I'd recommend Chapter 1.4 + maybe one or two extra chapters specifically around capabilities and 'the bitter lesson', then perhaps this video by Rational Animations about goal misgeneralisation.

Since you're in Oxford, I'd also recommend reaching out to the Oxford AI Safety Initiative, a student-led group doing amazing work to educate around issues of AI Safety. 

I've been doing their Core Fellowship this term and it has been amazing. 

If you're open to textbooks, I second the AI Safety Atlas suggestion (potentially the section on misalignment risks), and you might also consider selected sections of the AI Safety, Ethics and Society textbook (e.g. the brief introduction, or the section on Rogue AIs).

More generally, I'd be very excited to see some representation of AI safety concerns in the PPE curriculum. If there's any way OAISI can be helpful here, please do let me know!

Some videos from Rational Animations and Rob Miles would be cool, I think. 
