Rafael Harth comments on A guide to Iterated Amplification & Debate

Rafael Harth 16 Nov 2020 22:39 UTC
16 points

Great of you to list other explanations and references, but it would be even better to explain why someone should read your explanation instead of any of those.

My eternal gripe with academia is that no-one explains things the way I want them explained. I personally found every existing explanation of IDA confusing. So, the reason to read this rather than any other resource is that, if you’re anything like me, you’ll have an easier time with this post vs. any other. (I’m not sure for how many people this is true; maybe if you think more like Paul and less like me, the original sequence will make more sense.) Also, there are a bunch of resources on IDA, but not very much on Debate. And the post by Chi Nguyen hadn’t been published yet when I wrote this.

But you’re right that I’m not saying that in the post. Maybe I should. (Edit: I added a brief note in the intro.)

It’s also supposed to be an advertisement for my sequence on Factored Cognition, since the style is going to be quite similar. I.e., I was worried that a sequence of original research could seem intimidating to people who don’t have a background in AI alignment, but I think anyone who found this post easy to understand won’t have trouble with the sequence, except perhaps with the math in the first two posts.

My only caveat about the content is that it doesn't discuss why debate extends the reach of human supervision (the PSPACE, NP, and P part of the Debate paper, or just the intuition of verifying part of an argument, verifying a solution, and finding a solution).

Good point. I agree that should be mentioned explicitly. I’ll add something about it tomorrow.
- David Althaus 27 May 2022 12:37 UTC
  4 points
  For what it’s worth, I read/skimmed all of the listed IDA explanations and found this post to be the best explanation of IDA and Debate (and how they relate to each other). So thanks a lot for writing this!