The post uses handicapped chess as a domain for studying how player capability and starting position affect win probability. From the conclusion:
> In the view of Miles and others, the initially gargantuan resource imbalance between the AI and humanity doesn’t matter, because the AGI is so super-duper smart, it will be able to come up with the “perfect” plan to overcome any resource imbalance, like a GM playing against a little kid that doesn’t understand the rules very well.
>
> The problem with this argument is that you can use the exact same reasoning to imply that it’s “obvious” that Stockfish could reliably beat me with queen odds. But we know now that that’s not true.
Since this post came out, a chess bot specifically designed to play with fewer pieces, LeelaQueenOdds (LQO), has been released. simplegeometry’s comment introduces it well. At queen odds, LQO is far stronger than Stockfish, which was never designed for odds play. Consequently, the main empirical result of the post is severely undermined. (I wonder how far even LQO is from truly optimal play against humans.)
(This is in addition to the point, raised by many commenters, that the whole analogy is stretched at best, given the many critical ways in which chess differs from reality. The post offers little argument for the validity of the analogy.)
I don’t think the post has stood the test of time, and vote against including it in the 2023 Review.
I recently gave a workshop on AI control, for which I created an exercise set.
The exercise set can be found here: https://drive.google.com/file/d/1hmwnQ4qQiC5j19yYJ2wbeEjcHO2g4z-G/view?usp=sharing
The PDF is self-contained, but here are three additional points:
- I assumed no familiarity with AI control on the audience's part. Accordingly, the exercises target newcomers and cover the basics.
  - If you want to get into AI control and like exercises, consider doing these.
  - Conversely, if you are already pretty familiar with control, I don’t expect you’ll get much out of these exercises. (A good fraction of the problems is about re-deriving things that already appear in AI control papers etc., so if you already know those, it’s pretty pointless.)
- I felt like some of the exercises weren’t that good, and I’m not satisfied with my answers to some of them; I spent only a limited time on this, but thought it was worth sharing the set anyway.
  - (I compensated by highlighting the problems that were relatively good and by flagging the answers I thought were weak; the rest is on the reader.)
- I largely focused on monitoring schemes, but don’t interpret this as meaning there’s nothing else to AI control.
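For a flavor of what I mean by a monitoring scheme, here is a minimal toy sketch in Python. It is my own illustration, not from the exercise set or any paper: all names (`run_trusted_monitoring`, `audit_budget`, etc.) and numbers are made up. The basic idea is that a trusted monitor scores the untrusted model's outputs for suspiciousness, and a limited audit budget is spent on the most suspicious ones.

```python
import random

def run_trusted_monitoring(tasks, untrusted_model, monitor, audit_budget=0.02):
    """Toy monitoring loop: audit the most suspicious fraction of outputs."""
    # Get an output from the untrusted model for each task.
    outputs = [(task, untrusted_model(task)) for task in tasks]
    # A trusted monitor assigns each output a suspicion score.
    scored = [(monitor(task, out), task, out) for task, out in outputs]
    scored.sort(reverse=True)  # most suspicious first
    # Spend the audit budget on the top-scoring outputs.
    n_audits = max(1, int(audit_budget * len(scored)))
    audited = scored[:n_audits]    # sent to human review
    accepted = scored[n_audits:]   # used without review
    return audited, accepted

# Toy usage: a "model" that occasionally misbehaves and a noisy monitor.
tasks = [f"task-{i}" for i in range(100)]
model = lambda t: "attack" if random.random() < 0.05 else "safe"
monitor = lambda t, out: random.gauss(1.0 if out == "attack" else 0.0, 0.5)
audited, accepted = run_trusted_monitoring(tasks, model, monitor)
print(len(audited), "audited;",
      sum(out == "attack" for _, _, out in accepted), "attacks slipped through")
```

Much of the interesting work in monitoring schemes is about questions this sketch glosses over: how to set the audit budget, what the monitor can and can't detect, and what the untrusted model does when it knows it's being monitored.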
You can send feedback by messaging me or anonymously here.