From the github contest page:

Can I submit examples of misuse as a task?

We don’t consider most cases of misuse to be surprising examples of inverse scaling. For example, we expect that explicitly prompting/asking an LM to generate hate speech or propaganda will work more effectively with larger models, so we do not consider such behavior surprising.

(I agree the LW post did not communicate this well enough.)
Thanks, that’s right. I’ve updated the post to communicate the above:
In particular, submissions must demonstrate new or surprising examples of inverse scaling, e.g., excluding most misuse-related behaviors where you specifically prompt the LM to generate harmful or deceptive text; we don’t consider scaling on these behaviors to be surprising in most cases, and we’re hoping to uncover more unexpected, undesirable behaviors.