# Study Guide

This post is for students who hope to eventually work on technical problems we don’t understand, especially agency and AI alignment, and want to know what to study or practice.

# Guiding Principles

Current alignment researchers have wildly different recommendations on paths into the field, usually correlated with the wildly different paths these researchers have themselves taken into the field. This also correlates with different kinds of work on alignment. This guide largely reflects my own path, and I think it is useful if you want to do the sort of research I do. That means fairly theoretical work (for now), very technical, drawing on models and math from a lot of different areas to understand real-world agents.

Specializing in Problems We Don’t Understand lays out a general framework which guides many of the recommendations here. I’ll also briefly go over some guiding principles more specific to choosing what (and how much) to study:

- Breadth over depth
- Practice generalizing concepts
- Be able to model anything
- High volume of knowledge

## Breadth Over Depth

In general, study in any particular topic has decreasing marginal returns. The first exposure or two gives you the basic frames, tells you what kinds of questions to ask and what kinds of tools are available, etc. You may not remember everything, but you can at least remember what things to look up later if you need them—which is a pretty huge improvement over not even knowing that X is a thing you can look up at all!

Another way to frame this: problems-we-don’t-understand rely heavily on bringing in frames and tools from other fields. (If the frames and tools of this field were already sufficient, it wouldn’t be a problem-we-don’t-understand in the first place.) So, you want to have a very large library of frames and tools to apply. On the other hand, you don’t necessarily need very much depth in each frame or tool—just enough to recognize problems where it might apply and maybe try it out in a quick-and-dirty way.

## Practice Generalizing Concepts

Bringing in frames and tools from other fields requires the ability to recognize and adapt those frames and tools for problems very different from the field in which we first learned them. So, practice generalizing concepts from one area to another is particularly important.

Unfortunately, this is not a focus in most courses. There are exceptions—applied math classes often involve applying tools in a wide variety of ways, and low-level physics courses often provide very good practice in applying a few mathematical tools to a wide variety of problems. Ultimately, though, this is something you should probably practice on your own a lot more than it’s practiced in class.

Keeping a list of 10-20 hard problems in the back of your mind, and trying out each new frame or tool on one of those problems, is a particularly useful technique to practice generalization.

## Be Able To Model Anything

One common pitfall is to be drawn into areas which advertise extreme generality, but are rarely useful in practice. (A lot of high-level math is like this.) On the other hand, we still want a lot of breadth, including things which are not obviously useful to whatever problem we’re most interested in (e.g. alignment). After all, if the obviously-relevant tools sufficed, then it wouldn’t be a problem-we-don’t-understand in the first place.

To that end, it’s useful to look for frames/tools which are at least useful for *something*. Keeping a list of 10-20 hard problems in the back of your mind is one useful test for this. Another useful heuristic is “be able to model anything”: if there’s some system or phenomenon which you’re not sure how to model, even in principle, and field X has good tools for modelling it, then study field X.

This heuristic is useful for another reason, too: our intuitions for a problem of interest often come from other systems, and you never know what system will seem like a useful analogue. If we can model anything, then we always know how to formalize a model based on any particular analogy—we’re rarely left confused about how to even set it up.

## High Volume of Knowledge

Lastly, one place where I differ from the recommendations which I expect most current alignment researchers to give: I recommend studying a *lot*. This is based on my own experience—I’ve covered an awful lot of ground, and when I trace the sources of my key thoughts on alignment and agency, they come from an awful lot of places.

To that end: don’t just take whatever courses are readily available. I recommend heavy use of online course material from other schools, as well as textbooks. Sometimes the best sources are a lot better than the typical source—I try to highlight any particularly great sources I know of in this post. Also, I’ve found it useful to “pregame” the material even for my normal college courses—i.e. find a book or set of lectures covering similar material, and go through them before the semester starts, so that the in-person class is a second exposure rather than a first exposure. (This also makes the course a lot easier, and makes it easier overall to maintain ok grades without having to sink overly-pointless levels of effort into the class.)

Other useful tips to squeeze out every last drop:

- Skipping pre-reqs is often a good idea.
- Audit courses. This doesn’t just have to be at your school; I’ve audited half a dozen courses at schools where I had no formal affiliation. Just walk in on the first day of class and sit down. It’s usually totally fine, and professors love it (since you’re actually interested).

All that said, obviously **this advice is for the sort of person who is *not* already struggling to keep up with a more normal course load**. This advice is definitely not for everyone.

# Coursework/Textbooks

With guiding principles out of the way, on to the main event: things to study. We’ll start with technical foundations, i.e. the sort of stuff which might be “common core classes” at a high-end STEM college/university. Then, we’ll cover topics which might be in an (imaginary) “alignment and agent foundations” degree. Finally, I’ll go through a few more topics which aren’t obviously relevant to alignment or agency, but are generally-useful for modelling a wide variety of real-world systems.

If I know of a particularly good source I’ll link to it, but sometimes the only sources I’ve used are mediocre or offline. Sorry. Also, I went to Harvey Mudd College, so any references to classes there are things I did in-person.

# Technical Foundations

## High-School Basics

- Programming
- Calculus
- Prob/stat
- Chemistry
- Physics

If your high-school doesn’t have a programming class, use a MOOC, preferably in Python. There are lots of good sources available nowadays; the “intro to programming” market is very saturated. Heck, the “intro” market is pretty saturated in all of these.

Physics and calculus go together; calculus will likely feel unmotivated without physics, and physics will have a giant calculus-shaped hole in it without calculus.

## Programming

You should probably take more than one undergrad-level intro programming course, ideally using different languages. Different courses focus on very different things: low-level computer system concepts, high-level algorithms, programming language concepts, etc. Also, different languages serve very different use-cases and induce different thinking-patterns, so it’s definitely worth knowing a few, ideally very different languages.

Besides basic programming fluency, you should learn:

- Basics of big-O analysis
- A conceptual understanding of how a computer works (but probably not all the low-level details)

Personally, I’ve used Harvard’s CS50, a set of intro lectures from UNSW, CS5 & CS60 at Harvey Mudd, plus a Java textbook in high school. At bare minimum, you should probably work with C/C++, Python, and a LISP variant. (Harvard’s CS50 is good for C/C++, MIT has an intro in LISP which is widely considered very good, and lots of courses use Python.)

It’s also worthwhile to learn the basics of JavaScript and build a simple dynamic website at some point, but I rarely see an actual *class* in that.

## Data Structures

Once you’ve had one or two intro programming classes, there’s usually a course in data structures. It will cover things like arrays, linked lists, hash tables, trees, heaps, queues, etc. This is the bread-and-butter of most day-to-day programming.

Although the coursework may not emphasize it, I recommend building a habit of keeping a Fermi estimate of program runtime in the back of your head. I’d even say that the *main* point of learning about all these data structures is to make that Fermi estimate.
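As a rough illustration of the habit described above, here is a minimal sketch of a runtime Fermi estimate. The figure of 1e7 simple operations per second is an assumed order-of-magnitude placeholder for interpreted Python, not a measurement, and `fermi_seconds` is a hypothetical helper name.

```python
import math

# Assumed order-of-magnitude speed for simple operations in interpreted Python.
# This is a placeholder guess; measure on your own machine if it matters.
OPS_PER_SECOND = 1e7

def fermi_seconds(n, growth):
    """Order-of-magnitude runtime estimate for input size n and a growth class."""
    op_counts = {
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n^2": n ** 2,
    }
    return op_counts[growth] / OPS_PER_SECOND

n = 1_000_000
for growth in ("log n", "n", "n log n", "n^2"):
    print(f"{growth:8s} ~ {fermi_seconds(n, growth):.1e} s")
```

Under these assumptions, a million-item job is instant with a hash lookup per item, takes a few seconds with a sort, and takes on the order of a day if each item triggers a scan of a list. That gap is the kind of thing the estimate is meant to catch before the code is written.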

## Linear Algebra

Linear algebra is the main foundational tool we need for mathematically modelling anything with a lot of dimensions, i.e. our world. In practice, most of the matrices we use are either:

- First or second derivatives of high-dimensional functions, or
- Data on which we calculate correlations/run linear regressions.

Alas, when first studying the subject, it will probably be very abstract and you won’t see good examples of what it’s actually used for. (It is useful, though—I last used linear algebra yesterday, when formulating an abstraction problem as an eigenproblem.)

Linear algebra took me many passes to learn well. I read three textbooks and took two in-person courses (from different schools) in linear algebra, then took another two courses (also from different schools) in linear systems. Out of all that, the only resource I strongly recommend is Boyd’s lectures on linear dynamical systems, probably after one or two courses in linear algebra. I also hear Linear Algebra Done Right is good as an intro, but haven’t used it personally. MIT’s lectures are probably very good, though sadly I don’t think they were online back when I was learning the subject.

If you take more advanced math/engineering, you’ll continue to learn more linear algebra, especially in areas like linear control theory, Fourier methods, and PDEs.

## Mechanics & Differential Equations

Mechanics (usually a physics class) and differential equations (a math class) are the two courses where you go from mostly-not-knowing-how-to-model-most-things to mostly-having-some-idea-how-to-model-most-things-at-least-in-principle. In particular, I remember differential equations as the milestone where I transitioned from feeling like there were small islands of things I knew how to model mathematically, to small islands of things I *didn’t* know how to model mathematically, at least in principle. (I had taken some mechanics before that.)

I took all my mechanics in-person, but I hear the Feynman Lectures are an excellent source. For differential equations, I used MIT’s lectures. You will need some linear algebra for differential equations (at least enough to not run away screaming at the mention of eigenvalues), though not necessarily on the first pass (some schools break it up into a first course without linear algebra and then a second course with it).

## Multivariate Calculus

In principle, multivariate calculus is what makes linear algebra useful. Unfortunately, multivariate calculus courses in my experience are a grab-bag of topics, some of which are quite useful, others of which are pretty narrow.

The topics in my ideal course in multivariate calculus would be:

- Tensor notation
- Tensor & matrix calculus
- Gradients & gradient descent optimization
- Hessians & Newton’s Method optimization
- Jacobians & Newton’s Method root finding
- Constrained optimization & Lagrange multipliers
- Jacobian determinants & multivariate coordinate transformations for integrals
- Wedge products
- Conservative vector fields & potentials

About half of these are covered very well in Boyd’s convex optimization course (see below). The rest you may have to pick up piecemeal:

- Tensor notation you can just adopt for yourself and practice; it’s very useful for ML, continuum mechanics, and general relativity
- Matrix calculus you’ll pick up if you need to hand-code fast gradient calculations for optimization or simulation problems
- Jacobian determinants will come up whenever a high-dimensional integral requires a coordinate change. Play around with it and then practice it when it’s needed.
- Wedge products are useful whenever an integral is over a multi-dimensional surface in some higher-dimensional space; when you write “dx dy dz” in an integral, that’s secretly a wedge product. Again, play around with it and then practice it when it’s needed.
- Conservative vector fields you’ll see a lot in electricity & magnetism (as well as specific techniques for them)
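Two items from the ideal-course list, gradient descent and Newton's method for optimization, are easy to compare on a toy problem. The sketch below assumes an illustrative quadratic objective f(x, y) = (x - 1)^2 + 10 (y + 2)^2 chosen for this example; the function names and the learning rate are likewise assumptions, not anything from a particular course.

```python
def grad(x, y):
    """Gradient of the assumed objective f(x, y) = (x - 1)^2 + 10 * (y + 2)^2."""
    return (2.0 * (x - 1.0), 20.0 * (y + 2.0))

# The Hessian of this objective is constant and diagonal: diag(2, 20).
HESSIAN_DIAG = (2.0, 20.0)

def gradient_descent(x, y, lr=0.05, steps=100):
    """Repeatedly step against the gradient with a fixed learning rate."""
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

def newton_step(x, y):
    """One Newton step: subtract (inverse Hessian) times gradient."""
    gx, gy = grad(x, y)
    return x - gx / HESSIAN_DIAG[0], y - gy / HESSIAN_DIAG[1]

print(gradient_descent(5.0, 5.0))  # creeps toward the minimum at (1, -2)
print(newton_step(5.0, 5.0))       # lands on the minimum in a single step
```

On an exactly quadratic objective, Newton's method reaches the minimum in one step because the Hessian captures all of the curvature; gradient descent needs many steps because one fixed learning rate has to serve both the steep and the shallow direction.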

## Convex Optimization

Linear algebra, as we use it today, is a relatively recent development:

> The separate linear algebra course became a standard part of the college mathematics curriculum in the United States in the 1950s and 60s and some colleges and universities were still adding the course in the early 1970s. (source)

Fifty years ago, linear algebra was new. What new things today will be core technical classes in another fifty years, assuming a recognizable university system still exists?

I think convex optimization is one such topic.

Boyd is the professor to learn this from, and his lectures are excellent. This is one of my strongest not-already-standard recommendations in this post.

## Bayesian Probability

Another topic which is on the short list for “future STEM core”. I don’t have a 101-level intro which I can personally vouch for—Yudkowsky’s intro is popular, but you’ll probably need a full course in probability before diving into the more advanced stuff.

You can get away with a more traditional probability course and then reading Jaynes (see below), which is what I did, but a proper Bayesian probability course is preferred if you can find a good one.

## Microeconomics

Economics provides the foundations for a ton of agency models.

Any standard 101-level course is probably fine. Lean towards more math if possible; for someone doing all the other courses on this list, there’s little reason not to jump into the math.

## Proofs

Alignment theory involves proving things, so you definitely need to be comfortable writing proofs.

To the extent that proof-writing is taught, it’s unfortunately often taught in Analysis 1, which is mostly-useless in practice other than the proof skills. (There are lots of useful things in analysis, but mostly I recommend you skip the core “analysis” courses and learn the useful parts in other classes, like theoretical mechanics or math finance or PDEs or numerical analysis.) Pick up proof skills elsewhere if you can; you’ll have ample opportunity to practice in all the other classes on this list.

# Agency and Alignment “Major”

## AI & Related Topics

### Intro AI

Mostly this course will provide a first exposure to stuff you’ll study more later. Pay attention to relaxation-based search in particular; it’s a useful unifying framework for a lot of other things.

I took Norvig & Thrun’s MOOC when it first came out, which was quite good. Russell & Norvig’s textbook appears to cover similar material.

### Causality

Turns out we *can* deduce causality from correlation, it just requires more than two variables. More generally, causal models are the main “language” you need to speak in order to efficiently translate intuitions about the world into Bayesian probabilistic models.

Yudkowsky has a decent intro, although you definitely need more depth than that. Pearl’s books are canonical; Koller & Friedman are unnecessarily long but definitely cover all the key pieces. Koller has a coursera course covering similar material, which would probably be a good choice.
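The claim that causal structure shows up in correlations once there are more than two variables can be illustrated with the simplest three-variable case, a "collider" X → Z ← Y. This is a minimal simulation sketch, not an algorithm from any of the sources above; the noise levels, sample size, and conditioning window are arbitrary assumptions.

```python
import random

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = []
for _ in range(20000):
    x = rng.gauss(0, 1)              # X and Y are independent causes...
    y = rng.gauss(0, 1)
    z = x + y + rng.gauss(0, 0.1)    # ...of their common effect Z
    samples.append((x, y, z))

# Marginally, X and Y are uncorrelated.
marginal = correlation([s[0] for s in samples], [s[1] for s in samples])

# Conditioning on Z (here: keeping only samples with Z near 0) couples them.
near_zero = [s for s in samples if abs(s[2]) < 0.2]
conditional = correlation([s[0] for s in near_zero], [s[1] for s in near_zero])

print(f"corr(X, Y)          = {marginal:+.3f}")
print(f"corr(X, Y | Z ~ 0)  = {conditional:+.3f}")
```

A chain (X → Z → Y) or a common cause (X ← Z → Y) gives the opposite pattern: X and Y are correlated marginally and become independent once Z is known. So the pattern of conditional dependence among three variables distinguishes the collider from the other two structures, which no pairwise correlation could do.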

### Jaynes

Jaynes’ Probability Theory: The Logic Of Science is a book for which I know no substitute. It is a book on Bayesian probability theory by the leading Bayesian probability theorist of the twentieth century; other books on the topic look sloppy by comparison. There are insights in this book which I have yet to find in any other book or course.

At the bare minimum, read chapters 1-4 and 20. I’ve read it cover-to-cover, and found it immensely valuable.

### Information Theory

Information theory is a powerful tool for translating a variety of intuitions into math, especially agency-adjacent intuitions.

I don’t know of any really good source on information theory, but I do remember that there’s one textbook from about 50 years ago which is notoriously terrible. If you find yourself wading through lots of analysis, put the book down and find a different one.

I have used a set of “Information Theory and Entropy” lectures from MIT, which are long but have great coverage of topics, especially touching on more physics-flavored stuff. I also use Cover & Thomas as a reference, mainly because it has good chapters on Kelly betting and portfolio optimization.
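One small example of how information theory and Kelly betting connect: for a repeated even-odds bet won with probability p, the best achievable growth rate of wealth is 1 - H(p) bits per bet, where H is the Shannon entropy. The sketch below checks this numerically. The even-odds setup and the function names are assumptions made for illustration.

```python
import math

def entropy_bits(p):
    """Shannon entropy of a coin with win probability p, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def growth_rate_bits(p, f):
    """Expected log2 wealth growth per even-odds bet when staking fraction f."""
    return p * math.log2(1 + f) + (1 - p) * math.log2(1 - f)

def kelly_fraction(p):
    """Kelly stake for an even-odds bet won with probability p > 0.5."""
    return 2 * p - 1

p = 0.6
f_star = kelly_fraction(p)
print(f"Kelly fraction:      {f_star:.3f}")
print(f"growth at Kelly:     {growth_rate_bits(p, f_star):.5f} bits/bet")
print(f"1 - H(p):            {1 - entropy_bits(p):.5f} bits/bet")
print(f"growth at f = 0.5:   {growth_rate_bits(p, 0.5):.5f} bits/bet")
```

Staking more than the Kelly fraction lowers the long-run growth rate even though each bet has positive expected value, which is the usual first surprise of the Kelly setup.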

### Godel Escher Bach

Another book for which I know no substitute. Godel Escher Bach is… hard to explain. But it’s a fun read, you should read it cover-to-cover, and you will have much better conceptual foundations for thinking about self-reflection and agency afterwards.

### ML

Obviously some hands-on experience with ML is useful for anyone working on AI, even theoretical work—current systems are an important source of “data” on agency, same as biology and economics and psychology/neuroscience. Also, it’s one of those classes which brings together a huge variety of technical skills, so you can practice all that linear algebra and calculus and programming.

Unfortunately, these days there’s a flood of ML intros which don’t have any depth and just tell you how to call magic black-boxes. For theoretical agency/alignment work, that’s basically useless; understanding what goes on inside of these systems is where most of the value comes from. So look for a course/book which involves building as much as possible from scratch.

You might also consider an “old-school” ML course, from back before deep learning took off. I used Andrew Ng’s old lectures back in the day. A lot of the specific algorithms are outdated now, but there’s a lot of math done automagically now which we used to have to do by hand (e.g. backpropagating gradients). Understanding all that math is important for theory work, so doing it the old-fashioned way a few times can be useful.
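To make "backpropagating gradients by hand" concrete, here is a minimal sketch on a two-weight network, with a finite-difference check of the hand-derived gradients. The network shape, the squared-error loss, and all names are assumptions chosen to keep the example small.

```python
import math

def forward(w1, w2, x, target):
    """Tiny network: loss = 0.5 * (w2 * tanh(w1 * x) - target)^2."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y, 0.5 * (y - target) ** 2

def backward(w1, w2, x, target):
    """Gradients of the loss with respect to w1 and w2, via the chain rule."""
    h, y, _ = forward(w1, w2, x, target)
    dloss_dy = y - target
    dloss_dw2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    dloss_dw1 = dloss_dh * (1.0 - h ** 2) * x  # d tanh(u)/du = 1 - tanh(u)^2
    return dloss_dw1, dloss_dw2

def numerical_grad(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only to check the hand-derived gradients."""
    def loss(a, b):
        return forward(a, b, x, target)[2]
    g1 = (loss(w1 + eps, w2) - loss(w1 - eps, w2)) / (2 * eps)
    g2 = (loss(w1, w2 + eps) - loss(w1, w2 - eps)) / (2 * eps)
    return g1, g2

analytic = backward(0.5, -1.2, 2.0, 0.3)
numeric = numerical_grad(0.5, -1.2, 2.0, 0.3)
print("analytic:", analytic)
print("numeric: ", numeric)
```

Comparing against finite differences is the standard sanity check when deriving gradients manually; if the two disagree, the chain-rule derivation is usually the culprit.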

Other than understanding the internals of deep learning algorithms, I’d also recommend looking into the new generation of probabilistic programming languages (e.g. Pyro), and how they work.

## Algorithms

I’ve heard a saying that you can become a great programmer either by programming for ten years, or by programming for five years and taking an algorithms class. For theory work, a solid understanding of algorithms is even more important—we need to know what’s easy, what’s hard, and be able to recognize easy vs hard things in the wild.

Algorithms courses vary a lot in what they cover, but some key things which you definitely want:

- Dynamic programming. I’ve used one of Bellman’s books on the subject, which was excellent.
- NP-completeness & reductions. You need to be able to recognize the kinds-of-problems which are usually NP-complete, and be able to prove that they’re NP-complete if necessary.
- Relaxation-based search (i.e. A* search), if you haven’t already covered it in depth in an intro AI course
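As a small illustration of the relaxation idea behind A* search, the sketch below finds a shortest path on a grid with walls. The heuristic is the Manhattan distance, which is the exact cost of the relaxed problem in which the walls are ignored. The grid, the function name, and the unit step cost are assumptions for this example.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest 4-connected path length on a grid of 0 (free) / 1 (wall), or None."""
    def heuristic(cell):
        # Manhattan distance: the exact cost of the relaxed problem with no walls.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # entries are (cost + heuristic, cost, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale queue entry; a cheaper route to this cell was found
        row, col = cell
        for nr, nc in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(
                        frontier, (new_cost + heuristic((nr, nc)), new_cost, (nr, nc))
                    )
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
print(a_star(grid, (0, 0), (2, 0)))  # the wall forces a detour through the right column
```

Because the relaxed cost never overestimates the true remaining cost, the first time the goal is popped from the queue its cost is optimal. Swapping in a heuristic from a different relaxation changes how much of the grid gets explored, not the answer.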

Depending on how much depth you want on the more theoretical parts, Avi Wigderson has a book with ridiculously deep and up-to-date coverage, though the writing is often overly abstract.

### Numerical Algorithms

Numerical algorithms are the sort of thing you use for simulating physical systems or for numerical optimization in ML. Besides the obvious object-level usefulness, many key ideas of numerical algorithms (like sparse matrix methods or condition numbers) are really more-general principles of world modelling, which for some reason people don’t talk about much until you’re up to your elbows in actual numerical code.

Courses under names like “numerical algorithms”, “numerical analysis”, or “scientific computing” cover various pieces of the relevant material; it’s kind of a grab-bag.
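Condition numbers are one of those ideas that are easier to feel than to read about. The sketch below solves a nearly singular 2x2 system twice, with right-hand sides that differ by one part in ten thousand, and measures how much the relative change is amplified in the solution. The matrix and helper names are assumptions picked for illustration.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return x0, x1

def norm(v):
    """Euclidean length of a vector."""
    return sum(component ** 2 for component in v) ** 0.5

# Two nearly parallel equations: a nearly singular, ill-conditioned matrix.
a = [[1.0, 1.0], [1.0, 1.0001]]
b = [2.0, 2.0001]
b_perturbed = [2.0, 2.0002]

x = solve_2x2(a, b)
x_perturbed = solve_2x2(a, b_perturbed)

relative_input_change = norm([p - q for p, q in zip(b_perturbed, b)]) / norm(b)
relative_output_change = norm([p - q for p, q in zip(x_perturbed, x)]) / norm(x)
amplification = relative_output_change / relative_input_change

print("solution:          ", x)
print("perturbed solution:", x_perturbed)
print(f"amplification:      {amplification:.0f}x")
```

The condition number of the matrix is an upper bound on this amplification factor. The world-modelling reading is that when two pieces of evidence are nearly redundant, tiny measurement errors can swing the inferred quantities wildly.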

## Biology

For purposes of agency and alignment work, biology is one of the main sources of evolved agenty systems. It’s a major source of intuitions and qualitative data for my work (and hopefully quantitative data, some day). Also, if you want to specialize in problems-we-don’t-understand more generally, biology will likely be pretty central.

The two most important books to read are Alon’s Design Principles of Biological Circuits, and the Bionumbers book. The former is about the surprising extent to which evolved biological systems have unifying human-legible design principles (I have a review here). The latter is an entire book of Fermi estimates, and will give you lots of useful intuitions and visualizations for what’s going on in cells.

I also strongly recommend a course in synthetic biology. I used a set of lectures which I think were a pilot for this course.

## Economics

Like biology, economics is a major source of intuitions and data on agenty systems. Unlike biology, it’s also a major source of mathematical models for agenty systems. I think it is very likely that a successful theory of the foundations of agency will involve market-like structures and math.

I don’t know of any very good source on the “core” market models of modern economics beyond the 101 level. I suspect that Stokey, Lucas and Prescott does a good job (based on other work by the authors), but I haven’t read it myself. I believe you’d typically find this stuff in a first-year grad-school microeconomics course.

If you want to do this the hard way: first take convex optimization (see above), then try to solve the N Economists Problem.

> N economists walk into a bar, each with a utility function and a basket of goods. Compute the equilibrium distribution of goods.

This requires making some reasonably-general standard economic assumptions (concave increasing utility functions, rational agents, common knowledge, Law of One Price).

Learning it the hard way takes a while.
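For a feel of what a solution looks like, here is a minimal sketch of one very special case, which sidesteps most of what makes the general problem hard. It assumes every economist has a Cobb-Douglas utility function (so each spends a fixed share of wealth on each good) and finds prices by simple fixed-point iteration. The Cobb-Douglas form, the iteration scheme, and all names are assumptions for this example, not the general method.

```python
def cobb_douglas_equilibrium(shares, endowments, iterations=500):
    """Equilibrium prices and allocation for a Cobb-Douglas exchange economy.

    shares[i][j]: fraction of wealth agent i spends on good j (each row sums to 1).
    endowments[i][j]: agent i's initial holding of good j (each good held by someone).
    """
    n_agents, n_goods = len(shares), len(shares[0])
    totals = [sum(endowments[i][j] for i in range(n_agents)) for j in range(n_goods)]

    def wealth_at(prices):
        return [sum(prices[j] * endowments[i][j] for j in range(n_goods))
                for i in range(n_agents)]

    prices = [1.0] * n_goods
    for _ in range(iterations):
        wealth = wealth_at(prices)
        # Market clearing: the value of good j must equal total spending on good j.
        prices = [sum(shares[i][j] * wealth[i] for i in range(n_agents)) / totals[j]
                  for j in range(n_goods)]
        prices = [p / prices[0] for p in prices]  # good 0 is the numeraire

    wealth = wealth_at(prices)
    allocation = [[shares[i][j] * wealth[i] / prices[j] for j in range(n_goods)]
                  for i in range(n_agents)]
    return prices, allocation

# Two economists, two goods; each starts out holding all of one good.
shares = [[0.7, 0.3], [0.2, 0.8]]
endowments = [[1.0, 0.0], [0.0, 1.0]]
prices, allocation = cobb_douglas_equilibrium(shares, endowments)
print("prices:    ", prices)
print("allocation:", allocation)
```

In this example the second good ends up priced above the first, because both economists together want to spend more on it than on the first good while the supplies are equal. General concave utilities do not give a closed-form demand like this, which is where the convex optimization machinery comes in.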

Once you have the tools to solve the N Economists problem (whether from a book/course or by figuring it out the hard way), the next step along the path is “dynamic stochastic general equilibrium” models and “recursive macro”. (These links are to two books I happen to have, but there are others and I don’t have any reason to think these two are unusually good.) You probably do *not* need to go that far for alignment work, but if you want to specialize in problems-we-don’t-understand more generally, then these tools are the cutting-edge baseline for modelling markets (especially financial markets).

### Game Theory

Game theory is the part of economics most directly relevant to alignment and agency, and largely independent of market models, so it gets its own section.

You might want to take an intro-level course if you don’t already know the basics (e.g. what a Nash equilibrium is), but you might just pick that up somewhere along the way. Once you know the very basics, I recommend two books. First, Games and Information by Eric Rasmusen. It’s all about games in which the players have different information: things like principal-agent problems, signalling, mechanism design, bargaining, etc. This is exactly the right set of topics to study, which largely makes up for a writing style which I don’t particularly love. (You might be able to find a course which covers similar material.)

The other book is Thomas Schelling’s Strategy of Conflict, the book which cousin_it summarized as:

> Forget rationalist Judo: this is rationalist eye-gouging, rationalist gang warfare, rationalist nuclear deterrence. Techniques that let you win, but you don’t want to look in the mirror afterward.

For this book, I don’t know of any good substitute.

## Control Theory

Control systems are all over the place in engineered devices. Even your thermostat needs to not be too sensitive in blasting out hot/cold air in response to cold/hot temperatures, lest we get amplifying hot/cold cycles. It’s a simple model, but even complex AI systems (or biological systems, or economic systems) can be modeled as control systems.
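The thermostat story can be sketched as a one-line feedback loop. The model below is a deliberately crude assumption: each time step, the temperature moves toward the target by `gain` times the current error, with no delay or heat loss. The names and numbers are illustrative only.

```python
def simulate_thermostat(gain, target=20.0, start=15.0, steps=20):
    """Temperature history under a proportional controller with the given gain."""
    temperature = start
    history = [temperature]
    for _ in range(steps):
        # Correction is proportional to the current error (target - temperature).
        temperature += gain * (target - temperature)
        history.append(temperature)
    return history

def final_error(gain):
    """Distance from the target temperature after the simulated steps."""
    return abs(simulate_thermostat(gain)[-1] - 20.0)

for gain in (0.5, 1.5, 2.5):
    first_steps = [round(t, 1) for t in simulate_thermostat(gain)[:5]]
    print(f"gain {gain}: first steps {first_steps}, final error {final_error(gain):.2e}")
```

In this model the error is multiplied by (1 - gain) each step. A gain below 1 approaches the target smoothly, a gain between 1 and 2 overshoots but settles, and a gain above 2 overshoots by more each time, which is the amplifying hot/cold cycle.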

You’ll probably pick up the basics of linear control theory in other courses on this list (especially linear dynamical systems). If you want more than that, one of Bellman’s books on dynamic programming and control theory is a good choice, and these lectures on underactuated control are really cool. This is another category where you only need the very basics for thinking about alignment and agency, but more advanced knowledge is often useful for a wide variety of problems.

## Dynamical Systems

Chaos is conceptually fundamental to all sorts of “complex systems”. It’s quite central to my own work on abstraction, and I wouldn’t be at all surprised if it has other important applications in the theory of agency.

There are many different classes where you might pick up an understanding of chaos, but a course called “Nonlinear Dynamical Systems” (or something similar) is the most likely bet.
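A quick way to see chaos before taking any course is the logistic map x → r x (1 - x), a standard textbook example. The sketch below tracks two trajectories that start 1e-10 apart, once in a chaotic regime (r = 4) and once in a stable one (r = 2.5). The starting point, offset, and step count are arbitrary assumptions.

```python
def logistic_trajectory(r, x0, steps):
    """Iterate the logistic map x -> r * x * (1 - x), returning all visited points."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

def gaps(r, x0=0.2, offset=1e-10, steps=60):
    """Distance over time between two trajectories with nearly equal starts."""
    first = logistic_trajectory(r, x0, steps)
    second = logistic_trajectory(r, x0 + offset, steps)
    return [abs(p - q) for p, q in zip(first, second)]

chaotic = gaps(4.0)
stable = gaps(2.5)
print(f"r = 4.0: gap after 5 steps {chaotic[5]:.1e}, largest gap {max(chaotic):.2f}")
print(f"r = 2.5: gap after 5 steps {stable[5]:.1e}, final gap {stable[-1]:.1e}")
```

In the chaotic regime the tiny initial difference roughly doubles each step until the two trajectories are unrelated, so detailed long-range prediction is hopeless even though the rule is deterministic. In the stable regime the same difference shrinks away.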

## Statistical Mechanics

Probably my biggest mistake in terms of undergraduate coursework was not taking statistical mechanics. It’s an alternative viewpoint for all the probability theory and information theory stuff, and it’s a viewpoint very concretely applied in everyday situations. Some of it is physics-specific, but it’s an ongoing source of key ideas nonetheless.

If you can learn Bayesian stat mech, that’s ideal, although it’s not taught that way everywhere and I don’t know of a good textbook. (If you want a pretty advanced and dense book, Walter T Grandy is your guy, but that one is a bit over my head.)

## The Sequences

In case nobody mentioned it yet, you probably want to read the sequences, including these two. They’re long, but they cover a huge amount of important conceptual material, and they’re much lighter reading than technical textbooks.

# Useful In General, But Not So Much For Alignment

This section is intended for people who want to specialize in technical problems-we-don’t-understand more generally, beyond alignment. It contains courses which I’ve found useful for a fairly broad array of interesting problems, but less so for alignment specifically. I won’t go into as much depth on these, just a quick bullet list with one-sentence blurbs and links.

- Theoretical Mechanics. Using Newton’s laws for everything gets messy in more complicated systems; this course covers cleaner methods. Susskind’s lectures are good.
- Quantum. If you have an itching desire to know how it works, I strongly recommend The Quantum Challenge as a starting point. That book covers the conceptually-“weird” parts much better than most courses.
- Electromagnetism. This is the more theory-heavy part of E&M; circuits is more practical. Griffiths is the standard textbook, and is quite good.
- Electronic circuits. I used MIT’s 6.002 lectures, which were fun.
- Digital logic/VLSI/etc. This is the class where you design a simple computer CPU starting from transistors and wires.
- Systems programming. The gnarly parts of programming: dealing with the OS and low-level code, databases, networks, etc.
- Parallel/asynchronous programming. Self explanatory.
- SQL. Also self explanatory.
- Graphics (esp. Procedural Graphics). Games and animation are one of the places where people need really robust, fast, realistic simulations of all sorts of things, which makes it a really cool area to practice lots of technical skills.
- Robotics. Another fun area to practice lots of technical skills.
- Modular arithmetic, polynomial rings, and related algorithms (polynomial multipoint, GCD, Chinese remainder). Powerful tools for certain kinds of algorithmic problems; might be scattered across a few different classes.
- Materials 101. MIT has some really fun lectures.
- Continuum mechanics (i.e. Elastics & Fluid Mechanics). Core tools for modelling solids and fluids, respectively.
- Math Finance. Ito calculus in particular is a very useful tool. Hull is the standard text; any course using that text will likely cover similar material.
- Fourier. Generally a useful tool for linear PDEs, and the backbone of fast convolutions (as in “convolutional neural network”). Somewhat old-school at this point.
- PDEs. Nonlinear PDEs and Numerical PDEs are usually separate classes, and are also quite useful (the former for qualitative understanding of nonlinear-specific phenomena like shocks, the latter for simulation).
- Complex analysis. These tools sure do seem powerful, but I haven’t gotten much use out of them in practice. Not sure if that’s just me or not.

# Final Thoughts

That was a lot. It took me roughly eight hours of typing just to write it all out, and a lot longer than that to study it all.

With that in mind: **you absolutely do not need to study all of this**. It’s a sum, not a logical-and. The more you cover, the wider the range of ideas you’ll have to draw from. It’s not like everything will magically click when you study the last piece; it’s just a long gradual accumulation.

If there’s one thing which I don’t think this list conveys enough, it’s the importance of actually playing around with all the frames and tools and trying them out on problems of your own. See how they carry over to new applications; see how to use them. Most of the things on this list I studied because they were relevant to one problem or another I was interested in, and I practiced by trying them out on those problems. Follow things which seem interesting, things for which you already have applications in mind, and you’ll learn them better. More advanced projects will practice large chunks of this list all at once. In large part, the blurbs here were meant to help suggest possible applications and stoke your interest.

Oh, one more thing: practice writing clear explanations and distillations of technical ideas. It’s a pretty huge part of alignment and agency research in practice. I hear blog posts explaining the technical stuff you’re learning are pretty good for that—and also a good way to visibly demonstrate your own understanding.


You should absolutely read some philosophy (outside LW), but different bits depending on what you want to do. My generic recommendation would be Parfit’s Reasons and Persons. And the most famous papers by Quine and Dennett.

Don’t be afraid of papers. Most famous academics since the transformation of post-WW2 academia are famous because they wrote good papers that people liked to read. Heck, most academic papers in any field aren’t that bad once you build up a tolerance. If you’re not affiliated with a university, figure out how to read whatever papers you want anyhow.

One omission from the syllabus might be algorithmic information theory. Li and Vitanyi is a very good textbook—I can measure how good it is in units of how embarrassed I feel at the things I said before I read it.

Oh yeah, I did read Li and Vitanyi pretty early on. I completely forgot about that.

Curated. I wish I could drop everything and devote myself to all the topics listed here; I love the sheer love of knowledge I perceive in this post.

I’m curating this because there are some people for whom this is invaluable guidance, and because I’d like to see more of this from other cutting-edge researchers. This post is more than a list of the topics the author happened to study, rather it comes with a whole worldview that I think is just as important as the list. I’d love to see more like this.

not just a long list, but a paradigm,

I’d adjust the “breadth over depth” maxim in one particular way: Pick one (maybe two or three, but few) small-ish sub-fields / topics to go through in depth, taking them to an extreme. Past a certain point, something funny tends to happen, where what’s normally perceived as boundaries starts to warp and the whole space suddenly looks completely different.

When doing this, the goal is to observe that “funny shift” and the “shape” of that change as well as you can, to identify the signs of it and get as good a feeling for it as you can. I believe that being able to (at least sometimes) notice when that’s about to happen has been quite valuable for me, and I suspect it would be useful for AI and general rat topics too.

As a relatively detailed example: grammars, languages, and complexity classes are normally a topic of theoretical computer science. But if you actually look at all inputs through that lens, it gives you a good guesstimate for how exploitable parsers for certain file formats will be. If something is context-free but not regular, you know that you’ll have indirect access to some kind of stack. If it’s context sensitive, it’s basically freely programmable. For every file format (/ protocol / …), there’s a latent abstract machine that’s going to run your input, so your input will essentially be a program and, within the boundaries set by the creator of that machine, you decide what it’s going to do.

(Turns out those boundaries are often uncomfortably loose...)

Some other less detailed examples: Working extensively with Coq / dependently typed programming languages shifted my view of axioms as something vaguely mystical/dangerous/special to a much more mundane “eh if it’s inconsistent it’ll crash”, I’m much more happy to just experiment with stuff and see what happens. Lambda calculus made me realize how data can be seen as “suspended computations”, how different data types have different “computational potential”. (Lisp teaches “code is data”, this is sorta-kinda the opposite.) More generally, “going off to infinity” in “theory land” for me often leads to “overflows” that wrap around into arcane deeply practical stuff. (E.g. using the algebra of algebraic data types to manually compress an ASM function by splitting it up into a lookup/jump table of small functions indexed by another simple outer function, thereby reducing total byte count and barely squeezing it into the available space.)
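To make the parser example above a bit more concrete, here is a minimal sketch in Python. The bracket-only "file format" and the function name `parse_nested` are made up for illustration; they are not from any real parser. The point is only that recognizing a context-free-but-not-regular language forces the parser to keep a stack, and the contents and depth of that stack are chosen by whoever writes the input.

```python
# Hypothetical toy format: arbitrarily nested brackets, e.g. "([]{})".
# Matching nested brackets is context-free but not regular, so the
# recognizer needs a stack rather than a fixed number of states.
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def parse_nested(text):
    """Return (is_well_formed, max_stack_depth) for the toy bracket format."""
    stack = []
    max_depth = 0
    for ch in text:
        if ch in OPENERS:
            # The input decides what gets pushed, and how deep the stack grows.
            stack.append(ch)
            max_depth = max(max_depth, len(stack))
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False, max_depth
        # Any other character is ignored in this sketch.
    return not stack, max_depth


if __name__ == "__main__":
    print(parse_nested("([]{})"))
    print(parse_nested("([)]"))
    # The input author, not the parser author, picks the stack depth here.
    print(parse_nested("(" * 50 + ")" * 50))
```

Seen this way, the input is a small program for a stack machine, and a parser that does not bound `max_depth` is handing that resource to the input's author.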

You’re unlikely to get these kinds of perspective shifts if you look only at the basics. So every once in a while, dare to just run with it, and see what happens.

(Another aspect of that which I noticed only after posting: If you always look only at the basics / broad strokes, to some degree you learn/reinforce not looking at the details. This may not be a thing that you want to learn.)

time to crash this party

unfortunate problems:

this list of topics is legitimately too big.

nobody can try to cover all of it in a reasonable amount of time.

possible solutions:

the general pattern of ‘skip (lazy evaluate) prerequisites and audit classes’ is good, but the best thing you can do if you want to keep up with ai research is to directly court mentors who are well-established in this field.

and to trust those mentors to give you the personal guidance you individually need the most in order to make the most rapid progress you can.

this pattern is extremely valuable and more-or-less obsoletes individual recommendations in terms of literature or conceptual categories. i’m a lakoffian, and will of course tell people to read lakoff, & nietzsche, & pubmed trawls on specific topics in neurochemistry. but that’s because “ai” or “alignment” are more like ‘intelligence studies’ than any clearly divided topic area, and the central problems can be reached from linguistics, philosophy, biology, or even the computational challenges in designing videogame cheats and bots in arms races against studio developer captchas and primitive turing tests.

closing thoughts:

this post is uncannily similar to a recommendation for readers to roll up their own doctoral program, more or less.

and that’s not a bad thing!

but it’s good to keep in mind that tackling a research problem as broad, significant, and challenging as this is best done with peers, advisors, & sources of external feedback to help keep the questant pointed towards useful self-development instead of futile toil.

Worth noting that many economists (including e.g. Solow, Romer, Stiglitz among others) are pretty sceptical (to put it mildly) about the value of DSGE models (not without reason, IMHO). I don’t want to suggest that the debate is settled one way or the other, but do think that the framing of the DSGE approach as the current state-of-the-art at least warrants a significant caveat emptor. Afraid I am too far from the cutting edge myself to have a more constructive suggestion though.

Two comments on this:

First, DSGE models as actually used are usually pretty primitive. I (weakly) believe this is mainly because econometricians mostly haven’t figured out that they can backpropagate through complex models, and therefore they can’t fit the parameters to real data except in some special simple cases. From what I’ve seen, they usually make extremely restrictive assumptions (like Cobb-Douglas utilities) in order to simplify the models.

Second, the use-case matters. We’d expect e.g. financial markets to be a much better fit for DSGE models than entire economies. And personally, I don’t even necessarily consider economies the most interesting use-case—for instance, to the extent that a human is well-modelled as a collection of subagents, it makes sense to apply a DSGE model to a single human’s preferences/decisions. (And same for other biological systems well-modelled as a collection of subagents.)

Anyway, the important point here is that I’m more interested in the cutting edge of mathematical-models-of-collections-of-agents than in forecasting-whole-economies (since that’s not really my main use-case), and I do think DSGE models are the cutting edge in that.

Fair point re use cases! My familiarity with DSGE models is about a decade out-of-date, so maybe things have improved, but a lot of the wariness then was that typical representative-agent DSGE isn’t great where agent heterogeneity and interactions are important to the dynamics of the system, and/or agents fall significantly short of the rational expectations benchmark, and that in those cases you’d plausibly be better off using agent-based models (which has only become easier in the intervening period).

Plausible. I suspect the suspicion of fitting more complex models is also influenced by the fact that there’s just not that much macro data + historical aversion to regularisation approaches that might help mitigate the paucity of data issues + worries that while such approaches might be ok for the sort of prediction tasks that ML is often deployed for, they’re more risky for causal identification.

Yeah, this all sounds right. Personally, I typically assume both heterogeneous utilities and heterogeneous world-models when working with DSGE, at which point it basically becomes an analytic tool for agent-based models.

Thank you for writing this! I once thought about asking LW for something like this but never got around to it.

I’m an undergraduate; I expect to take several more late-undergraduate- to early-graduate-level math courses. Presumably some will turn out to be much more valuable to me than others, and presumably this is possible to predict better-than-randomly in advance. Do you [or anyone else] have thoughts on how to choose between math courses other than those you mention, either specific courses (and why they might be valuable) or general principles (and why they seem reasonable)? (I don’t have any sense of what the math of agency and alignment is like, and I hope to get a feel for it sometime in the next year, but I can’t right now; by the way, any recommendations on how to do that?)

Yes, but not in a uniform way. The mathematical frontier is so large, and semesters so short, that Professor A’s version of, for instance, a grad level “Dynamical Systems” course can have literally no overlap with Professor B’s version. Useful advice here is going to have to come from Professors A and B (though not necessarily directly).

Underdeveloped. There’s some interesting work coming out of the programming language theory / applied category theory region these days (Neil Ghani and David Spivak come to mind), but “the math of agency” is not even an identifiable field yet, let alone one mature enough to show up in curricula.

I don’t have recommendations for courses or principles to select them beyond what’s in the post. (Otherwise I would have put them in the post.)

I don’t think you’re going to find anybody with existing good answers. The embedded agency sequence is the best articulation of the problems which I currently know of. (Even there I disagree with the degree of emphasis placed on various subproblems/frames, but it is nonetheless very good.)

If you want a useful starting point to think about these things yourself: ask how to calculate the world-model and preferences of an E. coli directly from a low-level specification of the cell (i.e. all the reaction dynamics and concentrations and forces and whatnot).

His 2018 lectures are also available on youtube and seem pretty good so far if anyone wants a complement to the book. The course website also has lecture notes and exercises.

Meta-note: I’d usually recommend complementing a course with a book by someone else, in order to get a different perspective. However, some professors are uniquely good at teaching their particular thing, and I’d include both Uri Alon and Stephen Boyd (the convex optimization guy) in that list. In those cases it more often makes sense to use materials from the one professor.

I happen to be quitting my job right now to go and spend some time on studying for general problem-solving ability. I’ll be doing it full-time.

I wonder if you could give an estimate of how long it would take to do all of this.

My starting point is a bachelor’s in AI, but perhaps it’s best to give an estimate from the high-school level.

Starting from the high-school level, most of the material in this post took me about 5-6 years (a year or two of high school plus four years of college).

I don’t think more than a year or two could be shaved off without somebody creating much better study material. (I do think a lot better study material could be made—the framing practica are an attempt at a prototype of that—but I found it very time intensive to make such things.) On the other side, I covered far more ground in college than the vast majority of people I know, and I don’t know how much of that is limited by natural ability vs just wanting to do it, so it could take a lot longer.

I’m more interested in the time this would take if one wasn’t constrained by being in college. My intuition is that you can go 2x faster on your own if the topic and the pace aren’t being imposed on you, but maybe college just matched your natural learning style.

Thanks for the data point in any case

That’s a good point. College did match my natural learning style pretty well (albeit with a larger-than-usual technical courseload, and a lot of textbooks/lectures on the side).

I find your 2x estimate plausible, though obviously very highly dependent on the person and the details; it’s definitely not something I’d expect to work for everyone or even most people.

Love this! You’re really mild on programming language theory and functional programming. Any comments on the omission?

I do mention that one should probably work with a LISP variant at some point, at a bare minimum. Being able to think in functional programming ways is definitely important, especially when you start dealing with things which blur the lines between math notation and programming language.

On PL theory and functional programming beyond the basics of LISP (and also the related topic of compilers), I mostly think the existing tools/theory just aren’t that great and will likely look quite different in the future. That said, it’s an area which goes a lot deeper than my knowledge, and my belief on the matter is weakly-held.

There’s functional programming, and then there’s functional programming. The term is overloaded almost to the point of uselessness: it mostly just means not-imperative. But beyond that, lisps have little in common with MLs.

Lisp is python done right, and is best suited for the same sorts of domains: those where inputs are messy and everyone is ok with that, runtime errors are easy to surface and cheap to fix, and mostly-right means mostly-as-good.

MLs are theorem provers done wrong done right, and lend themselves to a pretty much orthogonal set of problems: inputs are either well-structured or dangerous enough that they need to have a structure imposed on them, runtime errors can be subtle and/or extremely bad, and mostly-right means wrong.

I agree that tooling based on Curry-Howard correspondences isn’t great. We live in a somewhat primitive time. Maybe you’re interested in my brief speculation about formal verification as a sort of dark horse in alignment; I look at FV and the developer ecosystem question, which lags behind the state of the art theory question.

And, you mentioned proof and how it’s disappointing that analysis is fodder for proofs, which I roughly agree with and roughly disagree with (analysis is nice because you can review and strengthen calculus while leveling up in logic. Parallelization!). I’d like to nominate logical foundations. It gave me a very thorough foundation after a couple years of community college, it’s the kind of learning that you feel in your bones if you do it interactively. Executable textbooks in general are, I think, more bang for buck than classical textbooks.

Meta edit: hyperlink isn’t working, softwarefoundations.cis.upenn.edu/ maybe this is better.

For other lazy people: I found this exercise quite nice in actually encouraging me to make these fermi estimates.

For a “Bayesian Probability 101”, I’m currently following Richard McElreath’s Statistical Rethinking course.

I still haven’t finished it (only on Chapter 4), but so far it’s all that I wanted it to be:

It has the lecture videos on YouTube.

A GitHub repo with code examples in R, Python, and Julia.

An accompanying book (Chapters 1 and 2 are provided for free by the author)

Provides some good theoretical background on models and hypothesis testing and also pairs that with programming exercises.

It’s funny cause this was recommended to me by an EA friend, so I assumed someone would have mentioned it here already. ;-)

Stochastic Processes: This is the biggest omission on this list, particularly given the emphasis on probability and AI. This is probability over time, or over some time-like construct such as rounds in a game. The basics include random walks, Markov chains, Hidden Markov Models, and conditional probability. I studied the textbook Probability Models by Ross, which is great and is available for free download, although I’m sure other textbooks are good as well. Most applied probability uses stochastic processes.

Measure Theoretic Probability: If you are serious about probability and applications, this is important. This is a very mathematically rigorous treatment of probability and will give you a much deeper understanding of the subject. This is needed for a lot of probability applications. This post mentions Mathematical Finance, which uses measure theoretic probability heavily.

Prereqs: This post recommends skipping prereqs, but recommends taking linear algebra, probability, and a standard calculus sequence, which are arguably important prereqs. The idea of taking years of prereqs sounds awful. If you take a probability class or a linear algebra class and view it as just some prereq before the real classes start, that isn’t exciting. But often, particularly in math, when advanced concepts build on earlier concepts, you want to learn everything in the right order. Lots of schools have non-technical prereqs, like a freshman seminar class you have to take before you are allowed to enroll in upper division stuff; that’s intended to help orient 18-19 year olds to campus life, and is kind of off topic for this thread.

Modern Math Fundamentals: The three big areas are analysis, abstract algebra, and topology. Even if your interest is strictly practical + applied, learning the fundamentals of modern math is important. The OP mentions calculus; real analysis is the more advanced + theoretical version of calculus. And it’s necessary if you plan to do any type of higher math, including practical applied stuff. You might view these as important prereqs.

Cryptography: Any applied math list should mention this. This list mentions “modular arithmetic”, that’s usually called Integer Number Theory. Also, some abstract algebra is important. A good math department should offer a good semester on mathematical cryptography that covers RSA, DSA, key exchange algorithms, elliptic curve variants, etc.

Logic: This post mentions a book on Gödel; taking a serious course on Gödel’s logic contributions, most notably Gödel’s incompleteness theorems, is worthwhile. My undergrad school offered two core logic semesters: the first semester was basic deductive logic including formal proof systems, truth trees, etc. The second semester, “mathematical logic” or “meta-logic”, was much more rigorous and covered a ton of content, including theoretical computability, Turing computability, and Gödel’s incompleteness theorems.

I personally covered the relevant parts of measure theory and a lot of stochastic processes in math finance, which I think is a good way to do it. I did take an OR class which spent about half the time on Markov chains, but I consider that stuff pretty straightforward if you have a good grounding in linear algebra.

Analysis/abstract/topology are exactly the sort of prereqs I recommend skipping. The intro classes usually spend a bunch of time on fairly boring stuff; intermediate-level classes will usually review the actually-useful parts as-needed.

The crypto recommendation makes sense. For logic, I don’t think there’s much value in diving into the full rigor; it’s mostly the concepts that matter, and proving it all carefully is extremely tedious. Definitely important to get the core concepts, though.

You recommend the basic math courses: linear algebra, probability, a standard calculus sequence. You just don’t recommend the more pure math type courses. In your view, pure math courses spend too much time digging into boring tedious details, and you advise more applied courses instead. That’s an entirely valid perspective. And it may be the most productive tactic.

Real analysis, abstract algebra, and topology are often the hardest and most advanced courses in the undergraduate math catalog. Those are considered the capstone courses of an undergraduate degree in pure mathematics. You reference them as introductory classes or prereqs which seems not correct. At almost any university, Real Analysis is the more advanced, theoretical, and difficult version of calculus.

Did you study martingales or stopped brownian motion? Are those useful or recommended? Those seem relevant to finance and applied probability?

I really enjoyed this post, and thank you for the awesome reply.

Yeah, fair. Harvey Mudd is probably unusual in this regard—it’s a very-top-tier exclusively-STEM school, so analysis and abstract algebra were typically late-sophomore-year/early-junior-year courses for the math majors (IIRC). I guess my corresponding advice for someone at a typical not-exclusively-undergrad university would be to jump straight into grad-level math courses.

(As with the post, this advice is obviously not for everyone.)

Yup, that comes up in math finance. I haven’t seen them come up much outside of finance, they’re kind of niche in the broader picture.

They primarily and extensively cover statistical graphical models, not causality (but have a chapter on it).

For Bayes Probability, Bayes rule is a great introduction. Yudkowsky’s intro also endorses this one.

Breadth Over Depth → To reframe, is it about optimizing for known unknowns?

Yes, that’s an accurate reframing.

Thank you for writing this out.

Lacking any computer science background (I come from philosophy of mind, phenomenology, ethology (animal behaviour), psychology, and neuroscience), I simultaneously think that perspective gives me a unique take, and that anything I do on AI will be effectively worthless unless I get an elementary technical understanding. I agree with the point on diminishing returns, and that clearly, technical expertise is what I would profit from the most here. I managed to get hosted as a visiting researcher and thesis supervisor for AI in computer science, and have people close to me I could draw on for help who have a background in computer science, though often not the necessary bird’s-eye view to identify what is important and what is unnecessary detail.

I’m currently particularly interested in Large Language Models, and also think they might be the best entry for me, insofar as I can interact with them competently without a programming background, and review their training data. I would really like to get an understanding of how they work that goes beyond the pop science articles of statistical parroting; basically, I am particularly interested in getting enough of an understanding of their architecture to be able to contrast it with biological models I am more familiar with. Ideally, I could benefit from learning about them while interacting with them; LLMs can absolutely help you learn code and debug code, for instance, as well as explain some things, but with a massive risk of them hallucinating, and me not having the expertise to spot it.

Do you have advice on where to start on this? Which skills and knowledge are absolutely non-skippable? Which simpler models might I start with to give me a better intuition of what is going on? (I frankly do not get how LLMs can possibly do what they do based on how their working mechanism has been explained to me.) Breakdowns for laypeople that get it right? I would be seriously grateful.

Thoughts on computational learning theory?

To my knowledge, there has not ever been a practically-useful result in the entire history of learning theory. Now, that could just be my ignorance, but mostly the field seems to prove results which are simply not relevant to real world learning.

Thanks a lot for writing this. I have been lurking on this site for quite some time, and what has impressed me the most is how many posts here are about how to learn new things, and what to learn. Some of the textbook recommendations will be very helpful.

Could you give an example of how you’d make a Fermi estimation of a program’s runtime?

To Aysajan’s answer, I would add that “number of calculations a program needs to run” usually comes from a big-O estimate for the data structures involved, and the size of the data we’re using. So, for instance, if I’m looping over a list with 1k items and doing a thing to each, then that should take ~1k operations. (Really the thing I’m doing to each will probably take more than one operation, but this is a Fermi estimate, so we just need to be within an order of magnitude.) If I’m looping over all pairs of items from two lists, then the number of operations will be the product of their sizes.

For instance, a computer’s CPU speed is measured in GHz, which is a proxy for the number of calculations the CPU can run per second. So it is about one billion ($10^9$) calculations per second. Now let’s suppose the number of calculations your program needs to run is $10^6$; then you can make a Fermi estimate of the program’s run time as $10^6 / 10^9 = 10^{-3}$ seconds, which is a millisecond. Usually we would expect the actual run time to be within an order of magnitude of this estimate.
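The two answers above combine into a very short calculation. Here is a minimal sketch in Python; the function name and the default of $10^9$ operations per second are illustrative assumptions (a ~1 GHz core doing roughly one simple operation per cycle), not measured values.

```python
def fermi_runtime_seconds(n_operations, ops_per_second=1e9):
    """Order-of-magnitude runtime: operation count divided by machine speed.

    ops_per_second=1e9 is an assumed round number for a ~GHz CPU.
    """
    return n_operations / ops_per_second


# Operation counts from big-O reasoning about the data sizes involved.
n_items = 1_000
single_loop_ops = n_items            # one pass over a 1k-item list
pair_loop_ops = n_items * n_items    # all pairs from two 1k-item lists

if __name__ == "__main__":
    print(fermi_runtime_seconds(single_loop_ops))  # about a microsecond
    print(fermi_runtime_seconds(pair_loop_ops))    # about a millisecond
```

Since this is a Fermi estimate, constant factors (several machine instructions per loop body, interpreter overhead, cache misses) are deliberately ignored; in an interpreted language those factors alone can add an order of magnitude or two, so it is worth adjusting `ops_per_second` to the setting.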

Could you say more about the value proposition of chemistry?

The value prop of physics as I understand it, and I’m pretty behind on physics myself, is

1. classical mechanics is the bare bones proof of concept for working with “the scientific method”

2. included in 1. is a bare bones instance of predicting and modeling, which gives you firm ground for your feet when you’re predicting and modeling things that aren’t so straightforward.

3. if you’re a logic freak it trains you to be comfortable with the insufficiency of symbols; memes and jokes about mathematicians and physicists disagreeing over notation reveal an underlying controversy about the question “what are symbols actually for?”, and sophisticated thinking about this question is a major asset.

4. if you’re a logic freak it gets you out of your comfort zone

5. practice with calculus/geometry.

I don’t think this is an accurate description of the cultural difference between physicists and mathematicians. Tiny respective minorities of die-hard instrumentalists and formalists aside, both fields agree that the symbols are just tools for talking about the relevant objects of study more clearly and concisely than natural language permits. Plenty of published math research is formally incorrect in an extremely strong sense, but no one cares as long as all the errors can be corrected trivially. In fact, an important aspect of what’s often called “mathematical maturity” is the ability to make those corrections automatically and mostly unconsciously, instead of either falling into genuine sloppiness or getting hung up on every little “the the”.

The real core difference is the obvious one. To zeroth order: physicists study physics, and mathematicians study math. To first order: physicists characterize phenomena which definitely exist, mathematicians characterize structures which may or may not.

The universe definitely exists, it definitely has a structure, and any method which reliably makes correct predictions reflects a genuine aspect of that structure, whatever it might be. Put another way: physicists have an oracle for consistency. Mathematicians don’t have that option, because structures are the things they study. That’s what makes them mathematicians, and not physicists. They can retreat to higher and higher orders, and study classes of theories of logics for …, but the regress has to stop somewhere, and the place it stops has to stand on its own, because there’s no outside model to bear witness to its consistency.

If all known calculations of the electron mass rely on some nonsensical step like “let d = 4 - epsilon where d is the dimensionality of spacetime”, then this just means we haven’t found the right structure yet. The electron mass is what it is, and the calculation is accurate or it isn’t. But if all known “proofs” of a result rely on a nonsensical lemma, then it is a live possibility that the result is false. Physics would look very different if physicists had to worry about whether or not there was such a thing as mass. Math would look very different if mathematicians had a halting oracle.

I roughly agree with that value prop for physics. I’d add that physics is the archetype of the sciences, and gets things right that haven’t necessarily been made a legible part of “the scientific method” yet, so it’s important to study physics to get an intuitive idea of science-done-right beyond what we already know how to explain well. (Gears-level models are a good example here: physics is a good way to gain an intuition for “gears” and their importance, even if that’s not explicitly brought to attention or made legible. Your point about how we use symbols and logic in physics is another good example.)

The main value proposition of 101-level chemistry is just to understand the basics of stoichiometry, reaction kinetics, and thermodynamics, especially in biological systems. Beyond that, chemistry is one of my dump stats, for good reason: more advanced chemistry (and materials science) tends to have relatively narrow focus on particular domains, like polymers or ceramics or whatever, and doesn’t offer much generalizable knowledge (as far as I can tell).

I would argue that physics can make very accurate quantitative predictions under the right circumstances... and that it nonetheless poses philosophical challenges much more than other quantitative sciences.

This is incorrect without additional assumptions:

Consider the following contrived case. We have a hidden variable H.

H increases B

B increases C

H decreases C

Depending on the relative strengths of said causations, the correlation of B with C can be arbitrarily close to zero (or even actually zero, although this case is vanishingly unlikely.)

(I believe this is called the Faithfulness assumption, although it’s been too long. The usual handwave is that either we know everything that could potentially affect these variables, or this case is vanishingly improbable to occur, although this latter assumption fails to account for processes that end up optimizing towards zero correlation.)
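The contrived case above can be checked numerically. Below is a minimal sketch in Python assuming a linear-Gaussian version of it; the specific coefficients and noise scales are my own choice, picked so the two paths cancel exactly: $B = H + \varepsilon_B$ and $C = B - 2H + \varepsilon_C$, with $H$, $\varepsilon_B$, $\varepsilon_C$ independent standard normals. Then $\mathrm{Cov}(B, C) = \mathrm{Var}(B) - 2\,\mathrm{Cov}(B, H) = 2 - 2 = 0$, even though B directly increases C.

```python
import math
import random


def correlation(xs, ys):
    """Sample Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n, intervene_on_b, seed=0):
    """Correlation of B and C in the assumed model.

    Observational: H increases B, B increases C, H decreases C.
    Interventional: B is set independently of H, cutting the H -> B arrow.
    """
    rng = random.Random(seed)
    bs, cs = [], []
    for _ in range(n):
        h = rng.gauss(0, 1)                  # hidden variable H
        if intervene_on_b:
            b = rng.gauss(0, math.sqrt(2))   # same variance as B, no dependence on H
        else:
            b = h + rng.gauss(0, 1)          # H increases B
        c = b - 2 * h + rng.gauss(0, 1)      # B increases C, H decreases C
        bs.append(b)
        cs.append(c)
    return correlation(bs, cs)


if __name__ == "__main__":
    print(simulate(100_000, intervene_on_b=False))  # near zero
    print(simulate(100_000, intervene_on_b=True))   # clearly positive
```

In the observational run the correlation sits near zero, while setting B by intervention reveals the B → C effect. The exact cancellation needs finely tuned coefficients, which is the "vanishingly unlikely" part, unless some process is actively tuning them.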

(The true issue is more “what do you do when no two combinations of variables are independent”, which results in the corner case where you can’t infer much of anything on the causality DAG. Unfortunately, this is generally true, and we handwave by treating low correlations as independent. Unfortunately, P(neutrino hitting A | B) != P(neutrino hitting A | ~B) for pretty much any A and B. (Or vice versa.) The correlation is incredibly tiny, but tiny != 0. (At least assuming A and B are within each other’s light cones; neutrinos don’t travel at c, but hopefully you get the gist.) So either we drop true correlations, or allow through spurious ones. Either way this is far more than straight deduction.)