The bad reporter needs to specify the entire human model, how to do inference, and how to extract observations. But the complexity of this task depends only on the complexity of the human’s Bayes net.
If the predictor’s Bayes net is fairly small, then this may be much more complex than specifying the direct translator. But if we make the predictor’s Bayes net very large, then the direct translator can become more complicated — and there is no obvious upper bound on how complicated it could become. Eventually direct translation will be more complex than human imitation, even if we are only trying to answer a single narrow category of questions.
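The crossover claim above can be made concrete with a toy sketch. All numbers and growth rates here are made up for illustration; the only structural assumption is the one in the text, namely that the human simulator's complexity is pinned to the (fixed) human Bayes net while the direct translator's complexity grows with the predictor's net:

```python
# Toy illustration (hypothetical numbers) of the complexity crossover:
# the human simulator's description complexity depends only on the
# human's fixed Bayes net, while the direct translator's complexity is
# assumed to grow with the size of the predictor's Bayes net.

def human_simulator_complexity(human_net_nodes: int) -> float:
    # Depends only on the human's Bayes net, which does not grow.
    return 10.0 * human_net_nodes

def direct_translator_complexity(predictor_net_nodes: int) -> float:
    # Assumed to scale with the predictor's net; the coefficient is
    # arbitrary and chosen only to show the crossover.
    return 0.5 * predictor_net_nodes

HUMAN_NET_NODES = 1_000  # fixed

for predictor_net_nodes in (1_000, 10_000, 100_000):
    hs = human_simulator_complexity(HUMAN_NET_NODES)
    dt = direct_translator_complexity(predictor_net_nodes)
    simpler = "human simulator" if hs < dt else "direct translator"
    print(f"predictor net = {predictor_net_nodes:>7}: simpler reporter = {simpler}")
```

With these made-up numbers the direct translator is simpler for small predictor nets, but once the predictor's net is large enough the human simulator wins the complexity penalty, which is exactly the failure the quoted passage describes.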
This isn’t clear to me, because “human imitation” here refers (I think) to “imitation of a human that has learned as much as possible (on the compute budget we have) from AI helpers.” So as we pour more compute into the predictor, that also increases (right?) the budget for the AI helpers, which I’d think would make the imitator have to become more complex.
In the following section, you say something similar to what I say above about the “computation time” penalty (“If the human simulator had a constant time complexity then this would be enough for a counterexample. But the situation is a little bit more complex, because the human simulator we’ve described is one that tries its best at inference.”) I’m not clear on why this applies to the “computation time” penalty and not the complexity penalty. (I also am not sure whether the comment on the “computation time” penalty is saying the same thing I’m saying; the meaning of “tries its best” is unclear to me.)
Yes, I agree that something similar applies to complexity as well as computation time. There are two big reasons I talk more about computation time:
It seems plausible we could generate a scalable source of computational difficulty, but it’s less clear that there exists a scalable source of description complexity (rather than having some fixed upper bound on the complexity of “the best thing a human can figure out by doing science”).
I often imagine the assistants all sharing parameters with the predictor, or at least having a single set of parameters. If you have lots of assistant parameters that aren’t shared with the predictor, then it looks like it will generally increase the training time a lot. But without doing that, it seems like there’s not necessarily that much complexity the predictor doesn’t already know about. (In contrast, we can afford to spend a ton of compute for each example at training time since we don’t need that many high-quality reporter datapoints to rule out the bad reporters. So we can really have giant ratios between our compute and the compute of the model.)
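The parenthetical point about compute ratios amounts to simple arithmetic; here is a back-of-envelope sketch in which every number is hypothetical, chosen only to show how a small high-quality dataset lets the per-example labeling budget dwarf the model's own per-example compute:

```python
# Back-of-envelope sketch (all numbers hypothetical) of the compute-ratio
# point: because few high-quality reporter datapoints are needed to rule
# out bad reporters, the compute spent labeling each example can vastly
# exceed the model's compute on that example.

TOTAL_LABELING_COMPUTE = 1e18   # FLOPs budgeted for producing labels
N_REPORTER_DATAPOINTS = 1e4     # only a modest number of datapoints needed
MODEL_FORWARD_PASS = 1e12       # FLOPs for one forward pass of the predictor

per_example_budget = TOTAL_LABELING_COMPUTE / N_REPORTER_DATAPOINTS
ratio = per_example_budget / MODEL_FORWARD_PASS
print(f"labeling compute per example / model forward pass = {ratio:.0f}x")
```

Under these made-up numbers the human-plus-assistants labeling process gets 100x the predictor's compute on every training example, which is the "giant ratio" the comment refers to.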
But I don’t think these are differences in kind and I don’t have super strong views on this.