I agree with this! On the empirical side, we're hoping both to get more human participant experiments happening around debate and to build more datasets that try to probe obfuscated arguments. The dataset aspect is important: I think in the years since the original paper, follow-on scalable oversight experiments (debate or not) have been too underpowered in various ways to detect the problem, which then results in insufficient empirical work getting into the details.
Yep. For empirical work I'm in favor of experiments with more informed + well-trained human judges who engage deeply etc., and of having a high standard for efficacy (e.g. "did it get the correct answer with very high reliability") as opposed to "did it outperform a baseline by a statistically significant margin", where you then end up needing high n and therefore each example has to be cheap / shallow.
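(To make the high-n point concrete, here's a rough back-of-the-envelope power sketch. The numbers are purely illustrative assumptions, not from any actual experiment: a 70% baseline accuracy, a 5-point improvement vs. a 95%-reliability target, alpha = 0.05, 80% power.)

```python
import math

# Rough two-sided two-proportion sample-size calculation (normal approximation).
# All numbers here are illustrative assumptions, not results from any experiment.
Z_ALPHA = 1.96   # z for alpha = 0.05, two-sided
Z_POWER = 0.84   # z for 80% power

def n_per_arm(p_baseline: float, p_treatment: float) -> int:
    """Approximate per-arm sample size to distinguish p_treatment from p_baseline."""
    p_bar = (p_baseline + p_treatment) / 2
    num = (Z_ALPHA * math.sqrt(2 * p_bar * (1 - p_bar))
           + Z_POWER * math.sqrt(p_baseline * (1 - p_baseline)
                                 + p_treatment * (1 - p_treatment))) ** 2
    return math.ceil(num / (p_treatment - p_baseline) ** 2)

# "Beat a 70% baseline by 5 points": on the order of a thousand judgments per arm,
# so each example has to be cheap / shallow.
print(n_per_arm(0.70, 0.75))   # ~1250

# "Get the correct answer with very high reliability" (e.g. 95% vs. the same baseline):
# a few dozen deep, expensive examples already carry the signal.
print(n_per_arm(0.70, 0.95))   # ~36
```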
I would love the two of you (Beth and @Jacob Pfau) to talk about this in detail, if you're up for it! Getting the experimental design right is key if we want to get more human participant experiments going and learn from them. The specific point of "have a high standard for efficacy" was something I was emphasising to Jacob a few weeks ago as having distinguished your experiments from some of the follow-ons.
Yep, happy to chat!