In the section “New counterexample: better inference in the human Bayes net”, what is meant by the claim that the reporter does perfect inference in the human Bayes net? I am also unclear on how the modified counterexample is different.
My current understanding:
The reporter does inference using v1 and the action sequence, and does not use v2 as evidence (v2 is itself inferred). The reporter has an exact copy of the human Bayes net and fixes the nodes for v1 and the action sequence. Then it infers the probability of every possible combination of values the remaining nodes can take (including v2), i.e. the joint probability distribution.
I am not sure here. Is the reporter not using v2? The graphic in that section shows a red arrow from v2 in the predictor to v2 in the human Bayes net model that the reporter uses, but that arrow could already be depicting the better counterexample.
Now we assume that the model knows how to map a natural-language question onto nodes in the Bayes net, and that it can translate values of nodes into answers. The model can then use the joint probability distribution and the law of total probability to calculate the probabilities of the relevant nodes/events, which in turn lets it answer questions.
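To make that concrete, here is a minimal sketch of the procedure on a toy three-node net. Everything here is hypothetical (the report never specifies a concrete net, node names, or probabilities): clamp the evidence nodes, enumerate the joint, and marginalize via the law of total probability.

```python
import itertools

# Toy stand-in for the human Bayes net (all names and numbers are made
# up for illustration). Nodes are binary; "v1" stands in for the first
# video plus the action sequence, "latent" for the human's latent
# variables, "v2" for the second video.
def p_v1(v1):        return 0.5                      # prior on v1
def p_latent(l, v1): return 0.8 if l == v1 else 0.2  # P(latent | v1)
def p_v2(v2, l):     return 0.9 if v2 == l else 0.1  # P(v2 | latent)

def joint(v1, l, v2):
    return p_v1(v1) * p_latent(l, v1) * p_v2(v2, l)

def posterior(query_var, evidence):
    """Exact inference by enumeration: clamp the evidence nodes, sum the
    joint over all assignments of the remaining nodes (law of total
    probability), then renormalize."""
    names = ["v1", "latent", "v2"]
    scores = {0: 0.0, 1: 0.0}
    for values in itertools.product([0, 1], repeat=len(names)):
        a = dict(zip(names, values))
        if all(a[k] == v for k, v in evidence.items()):
            scores[a[query_var]] += joint(*values)
    z = sum(scores.values())
    return {val: s / z for val, s in scores.items()}

# Original human simulator: condition only on v1 (and the actions) and
# infer everything else, including v2.
print(posterior("latent", {"v1": 1}))
print(posterior("v2", {"v1": 1}))
```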
The only difference in the better counterexample is that we now also fix the value of v2 to whatever the predictor part of the model said would happen. And since we do not assume that the predictor works perfectly, the reporter can give wrong answers because of that.
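In the toy sketch above, that modification is just one extra entry in the evidence dictionary:

```python
# Better counterexample: also clamp v2 to whatever the predictor output.
# If the predictor got v2 wrong, the reporter is now conditioning on a
# false observation and can answer incorrectly because of it.
print(posterior("latent", {"v1": 1, "v2": 1}))
```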
And now that we have v2, does calculating the joint probability distribution become computationally feasible? Are we still assuming that the reporter does perfect inference in the human Bayes net, given that the predictor predicted v2 correctly?
In all of the counterexamples the reporter starts from the v1, actions, and v2 predicted by the predictor. In order to answer questions it needs to infer the latent variables in the human’s model.
Originally we described a counterexample where the reporter copied the human's inference process.
The improved counterexample is to instead use lots of computation to do the best inference it can, rather than copying the human’s mediocre inference. To make the counterexample fully precise we’d need to specify an inference algorithm and other details.
We still can’t do perfect inference though—there are some inference problems that just aren’t computationally feasible.
(That means there’s hope for creating data where the new human simulator does badly because of inference mistakes. And maybe if you are careful it will also be the case that the direct translator does better, because it effectively reuses the inference work done in the predictor? To get a proposal along these lines we’d need to describe a way to produce data that involves arbitrarily hard inference problems.)
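For concreteness, here is what “the best inference it can” might look like on the toy net from the earlier sketch, using likelihood weighting as a stand-in for whatever approximate-inference algorithm a fully precise counterexample would specify:

```python
import random

def likelihood_weighting(query_var, evidence, n_samples=10_000):
    """Approximate inference: sample non-evidence nodes from their
    conditionals in topological order, clamp evidence nodes and weight
    each sample by the evidence likelihood, then average the weights."""
    scores = {0: 0.0, 1: 0.0}
    for _ in range(n_samples):
        w = 1.0
        if "v1" in evidence:
            v1 = evidence["v1"]
            w *= p_v1(v1)
        else:
            v1 = int(random.random() < p_v1(1))
        if "latent" in evidence:
            l = evidence["latent"]
            w *= p_latent(l, v1)
        else:
            l = int(random.random() < p_latent(1, v1))
        if "v2" in evidence:
            v2 = evidence["v2"]
            w *= p_v2(v2, l)
        else:
            v2 = int(random.random() < p_v2(1, l))
        sample = {"v1": v1, "latent": l, "v2": v2}
        scores[sample[query_var]] += w
    z = sum(scores.values())
    return {val: s / z for val, s in scores.items()}

# More samples give a better approximation, but for hard enough nets no
# feasible sample budget recovers the exact posterior.
print(likelihood_weighting("latent", {"v1": 1, "v2": 1}))
```

On this tiny net, enumeration is trivially feasible and the approximate answer just converges to the exact one; the point only bites once the net is large enough that enumeration (and every other exact method) is out of reach.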
Ah ok, thank you. Now I get it. I was confused by (i) “Imagine the reporter could do perfect inference” and (ii) “the reporter could simply do the best inference it can in the human Bayes net (given its predicted video)”.
(i) I thought this meant that the reporter alone can do it, but what is actually meant is that it can do it with the help of the predictor model.
(ii) Somehow I thought that “given its predicted video” was the important modification here, when in fact the only change is going from the reporter being able to do perfect inference to it doing the best inference it can.