evhub comments on Thoughts on sharing information about language model capabilities

evhub 31 Jul 2023 19:56 UTC
LW: 28 AF: 15
Something that I think is worth noting here: I don’t think that you have to agree with the “Accelerating LM agents seems neutral (or maybe positive)” section to think that sharing current model capabilities evaluations is a good idea, as long as you agree with the “Understanding of capabilities is valuable” section.

Personally, I feel much more uncertain than Paul on the “Accelerating LM agents seems neutral (or maybe positive)” point, but I agree with all the key points in the “Understanding of capabilities is valuable” section, and I think that’s enough to justify substantial sharing of model capabilities evaluations (though I think you’d still want to be very careful about anything that might leak capabilities secrets).
- paulfchristiano 31 Jul 2023 20:06 UTC
  LW: 8 AF: 5
  Yeah, I think sections 2, 3, 4 are probably more important and should maybe have come first in the writeup. (But other people think that 1 dominates.) Overall it’s not a very well-constructed post.
  At any rate, thanks for highlighting this point. For the kinds of interventions I’m discussing (sharing information about LM agent capabilities and limitations), I think there are basically two independent reasons you might be OK with it: either you like sharing capabilities in general, or you like certain kinds of LM agent improvements. Either one is sufficient to carry the day.