(Told this to Jenny in person, but posting for the benefit of others)

AI safety is a young, pre-paradigmatic area of research without a universally accepted mathematical formalism, so if you're after cool math, my suggestion is to learn the basics of one or two well-established fields that are mathematically mature and have a decent chance of being relevant to AI safety.

In particular, I think Learning Theory and Causality are areas with plenty of Aesthetic Math™.

## Learning theory

**Statistical learning theory** is the mathematical study of inductive reasoning—how can we make generalizations from past observations to future observations? It's an entire mathematically rich field devoted to formalizing Occam's razor.
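
To give a taste of what "formalizing Occam's razor" looks like, here is the standard textbook bound for a finite hypothesis class (my own choice of illustration; the notation $\mathcal{H}$, $m$, $\delta$, $R$, $\hat{R}_S$ is introduced here, not taken from anything above). Assume a loss bounded in $[0,1]$ and a sample $S$ of $m$ i.i.d. examples. Then Hoeffding's inequality plus a union bound gives, with probability at least $1-\delta$, simultaneously for every $h \in \mathcal{H}$:

$$
R(h) \;\le\; \hat{R}_S(h) + \sqrt{\frac{\ln|\mathcal{H}| + \ln(1/\delta)}{2m}}
$$

Here $R(h)$ is the true expected loss and $\hat{R}_S(h)$ is the average loss on the sample. If every hypothesis can be written down in $k$ bits, then $\ln|\mathcal{H}| \le k \ln 2$, so hypotheses drawn from a class with shorter descriptions come with a tighter guarantee. That is one (of several) precise senses in which simplicity helps generalization.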

**Computational learning theory** imposes the further restriction that learning algorithms be computationally efficient. It has rich connections to other parts of theoretical computer science. For example, there is a duality between computational learning theory and cryptography: positive results for one translate to negative results for the other! There are also many fun problems with a combinatorial-puzzle flavor.

Most of learning theory assumes that observations are drawn i.i.d. from a distribution. **Online learning** asks what happens if we eliminate this assumption. Incredibly, it can be shown that inductive reasoning can be successful *even when observations are handcrafted by an adversary*. The key is to measure success in relative rather than absolute terms: how did you perform in comparison to the best member of a pre-specified class of predictors? There are beautiful connections to convex analysis.
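
For a concrete feel, here is a minimal sketch of one classic online learning algorithm, Hedge (a.k.a. multiplicative weights), where the "pre-specified class of predictors" is a finite set of experts. The function name `hedge`, the toy loss sequence, and the particular learning rate are my own choices for illustration. With losses in $[0,1]$, $n$ experts, $T$ rounds, and learning rate $\eta = \sqrt{8 \ln n / T}$, the standard analysis bounds the regret (learner's expected loss minus the best expert's loss) by $\sqrt{T \ln n / 2}$, no matter how the loss sequence was chosen.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Run Hedge (multiplicative weights) on a fixed sequence of loss vectors.

    loss_rounds[t][i] is expert i's loss in round t, assumed to lie in [0, 1].
    Returns (learner's expected cumulative loss, best expert's cumulative loss).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    learner_loss = 0.0
    expert_loss = [0.0] * n
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]  # play expert i with probability probs[i]
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            expert_loss[i] += l
            weights[i] *= math.exp(-eta * l)  # penalize experts that did badly
    return learner_loss, min(expert_loss)


# Toy loss sequence (hypothetical data; any sequence in [0, 1] would do).
T, n = 1000, 4
rng = random.Random(0)
loss_rounds = [
    [float(rng.random() < 0.3 + 0.15 * i) for i in range(n)] for _ in range(T)
]

eta = math.sqrt(8 * math.log(n) / T)
learner, best = hedge(loss_rounds, eta)
regret = learner - best
bound = math.sqrt(T * math.log(n) / 2)
print(f"regret = {regret:.2f}, worst-case bound = {bound:.2f}")
```

Note that the bound grows like $\sqrt{T}$, so the *average* regret per round goes to zero: in the long run you do about as well as the best expert in hindsight, without any distributional assumption.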

Readings:

## Causality

I don't know this area as well, but the material I have learned has been mathematically beautiful. In particular, I suggest learning about Judea Pearl's theory of causality, which has been very influential in computer science, statistics, and some of the natural and social sciences. (There are a few competing formalisms for causality, but Pearl's is the most mathematically beautiful as far as I can tell.) Pearl's theory generalizes the classical theory of probability to allow for reasoning about cause and effect, using a framework that involves manipulations of directed acyclic graphs.
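
To illustrate the kind of distinction the theory formalizes, here is a small sketch contrasting *observing* $X=1$ with *intervening* to set $X=1$ (written $do(X=1)$). The three-variable model, its probability tables, and the function names are all made up for illustration: a confounder $Z$ influences both $X$ and $Y$, and $X$ also influences $Y$. Intervening on $X$ corresponds to deleting the arrow $Z \to X$ and replacing $X$'s mechanism with a constant.

```python
from itertools import product

# Hypothetical causal graph over binary variables: Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
P_Y1_GIVEN_XZ = {                           # P(Y = 1 | X = x, Z = z), keyed by (x, z)
    (0, 0): 0.1, (1, 0): 0.3,
    (0, 1): 0.6, (1, 1): 0.8,
}


def joint(z, x, y, do_x=None):
    """Probability of (z, x, y). If do_x is set, X's mechanism is replaced by X := do_x."""
    pz = P_Z[z]
    if do_x is None:
        px = P_X1_GIVEN_Z[z] if x == 1 else 1 - P_X1_GIVEN_Z[z]
    else:
        px = 1.0 if x == do_x else 0.0      # the arrow Z -> X is cut
    py1 = P_Y1_GIVEN_XZ[(x, z)]
    py = py1 if y == 1 else 1 - py1
    return pz * px * py


def prob_y1(x, do=False):
    """P(Y=1 | X=x) if do is False, else P(Y=1 | do(X=x))."""
    do_x = x if do else None
    num = sum(joint(z, x, 1, do_x) for z in (0, 1))
    den = sum(joint(z, x, y, do_x) for z, y in product((0, 1), repeat=2))
    return num / den


observational = prob_y1(1)              # conditioning: P(Y=1 | X=1)
interventional = prob_y1(1, do=True)    # intervening:  P(Y=1 | do(X=1))
print(observational, interventional)
```

In this toy model the two quantities come out to 0.7 and 0.55: seeing $X=1$ is also evidence that $Z=1$, which independently raises $Y$, whereas forcing $X=1$ tells you nothing about $Z$. Ordinary probability theory has no notation for the second quantity; that gap is what the graphical framework fills.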

Reading: *Causality*, by Judea Pearl.

These are the sorts of things I'm looking for! At first glance, they're a lot of solid "maybe"s, where mostly I've been finding "no"s. So that's encouraging. Thank you so much for the suggestions!