Something that I think is worth noting here: I don’t think that you have to agree with the “Accelerating LM agents seems neutral (or maybe positive)” section to think that sharing current model capabilities evaluations is a good idea as long as you agree with the “Understanding of capabilities is valuable” section.
Personally, I feel much more uncertain than Paul on the “Accelerating LM agents seems neutral (or maybe positive)” point, but I agree with all the key points in the “Understanding of capabilities is valuable” section, and I think that’s enough to justify substantial sharing of model capabilities evaluations (though I think you’d still want to be very careful about anything that might leak capabilities secrets).
Yeah, I think sections 2, 3, 4 are probably more important and should maybe have come first in the writeup. (But other people think that 1 dominates.) Overall it’s not a very well-constructed post.
At any rate, thanks for highlighting this point. For the kinds of interventions I’m discussing (sharing information about LM agent capabilities and limitations), I think there are basically two independent reasons you might be OK with it: either you like sharing capabilities in general, or you like certain kinds of LM agent improvements. Either one is sufficient to carry the day.