Vaniver comments on $1000 bounty for OpenAI to show whether GPT3 was “deliberately” pretending to be stupider than it is

Vaniver 27 Jul 2020 19:06 UTC
LW: 10 AF: 6
I feel like there’s some underlying worldview here that GPT-3 either has a theory of mind or it doesn’t, or that GPT-3 is either “doing the theory of mind computations” or it isn’t, and so behavior consistent with theory of mind is compelling evidence for or against theory of mind in general.
Do you also feel this way about various linguistic tasks? Like, does it make sense to say that something that scores well on Winograd schemas is “doing anaphora computations”? [This is, of course, a binarization of something that’s actually continuous, and so the continuous interpretation makes more sense.]
Like, I think there’s a thing where one might come into ML with the confused thought that convnets are “recognizing the platonic ideal of cat-ness” and then later acquire a mechanistic model of how pixels lead to classifications. What I am trying to do here is figure out what the mechanistic model that replaces the ‘platonic ideal’ looks like when it comes to theory of mind. (I predict a similar thing is going on for Eliezer.)
- Rohin Shah 28 Jul 2020 7:24 UTC
  LW: 8 AF: 5
  I agree the mechanistic thing would be interesting; that does make more sense as an underlying cause of this bounty / thread.