I (with the help of a few more people) am planning to create an introduction to AI Safety that a smart teenager can understand. What am I missing?

Disclaimer: My English isn’t very good, but please don’t try to dissuade me on that basis; the sequence itself will be translated by a professional translator.

I want to create a sequence that a smart fifteen- or sixteen-year-old school student can read and that will encourage them to go into alignment. Right now I’m running an extracurricular course for several smart school students, and one of my goals is to “overcome long inferential distances so that I will be able to create this sequence”.

I deliberately left the most important modern trends in machine learning out of the topic list. I’m optimizing for the scenario “a person reads my sequence, then goes to university for another four years, and only then becomes a researcher.” So (with the exception of the last part) I avoided topics that are likely to become obsolete by then.

Here is my (draft) list of topics (the order is not final; it will be settled in the course of writing):

  1. Introduction: what AI, AGI, and alignment are. What we are worried about. AI Safety as AI Notkilleveryoneism.

  2. Why AGI is dangerous. Orthogonality Thesis, Goodhart’s Law, Instrumental Convergence. Corrigibility and why it is unnatural.

  3. Forecasting. AGI timelines. Takeoff Speeds. Arguments for slow and fast takeoff.

  4. Why AI boxing is hard or nearly impossible. Humans are not secure systems. Why even an Oracle AGI can be dangerous.

  5. Modern ML in a few words (without math!). Neural networks. Training. Supervised Learning. Reinforcement Learning. Reward is not the goal of an RL agent.

  6. Interpretability. Why it is hard. Basic ideas on how to do it.

  7. Inner and outer alignment. Mesa-optimization. Internal, corrigible and deceptive alignment. Why deceptive alignment seems very likely. What can influence its probability.

  8. Decision theory. Prisoner’s Dilemma, Newcomb’s Problem, Smoking Lesion. CDT, EDT and FDT.

  9. What exactly are optimization and agency? Attempts to define these concepts. Optimization as attractors. Embedded agency problems.

  10. Eliezer Yudkowsky’s point of view. Pivotal actions. Why it can be useful to have an imaginary EY looking over your shoulder even if you disagree with him.

  11. Capability externalities. Avoid them.

  12. Conclusion. What can be done. Important organisations. What are they working on now?

What else should be here? Is there anything that should not be here? Are there reasons why the whole idea might be bad? Any other advice?