Rana Dexsin comments on Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)

Rana Dexsin 25 Feb 2023 11:08 UTC
1 point
I wonder whether, when approving applications for the full models for research, they watermark the provided data somehow to be able to detect leaks. Would that be doable by, for instance, using the low-order bits of the weights?
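
To make the low-order-bits idea concrete, here is a minimal sketch of what such a watermark could look like. It is purely hypothetical: nothing here is known to match anything Meta does, and `embed_id`, `extract_id`, and the 16-bit `recipient_id` are names invented for illustration. It assumes the weights are stored as float32 and hides one bit of the ID in the least significant mantissa bit of each of the first few weights.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x after rounding it to float32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def embed_id(weights, recipient_id, n_bits=16):
    """Hypothetical scheme: store bit i of recipient_id in the lowest
    mantissa bit of weights[i]. Returns a new list; the input is untouched."""
    if len(weights) < n_bits:
        raise ValueError("not enough weights to hold the ID")
    out = list(weights)
    for i in range(n_bits):
        bit = (recipient_id >> i) & 1
        pattern = float_to_bits(out[i])
        out[i] = bits_to_float((pattern & ~1) | bit)  # overwrite lowest bit
    return out

def extract_id(weights, n_bits=16):
    """Read the ID back out of the lowest mantissa bits."""
    recipient_id = 0
    for i in range(n_bits):
        recipient_id |= (float_to_bits(weights[i]) & 1) << i
    return recipient_id

if __name__ == "__main__":
    weights = [0.1 * (i + 1) for i in range(32)]  # stand-in for real weights
    marked = embed_id(weights, 0xBEEF)
    print(hex(extract_id(marked)))
```

Each marked weight moves by at most one float32 unit in the last place, so the model's behaviour should be essentially unaffected. The flip side is that a mark like this would be fragile (converting the weights to a lower precision discards those bits) and it makes every recipient's files differ byte for byte, so recipients comparing file hashes would notice it.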
- Rana Dexsin 5 Mar 2023 0:40 UTC
  1 point
  Tiberium at HN seems to think not. Copied and lightly reformatted, with the 4chan URLs linkified:
  
  It seems that the leak originated from 4chan [1]. Two people in the same thread had access to the weights and verified that their hashes match [2][3] to make sure that the model isn’t watermarked. However, the leaker made a mistake of adding the original download script which had his unique download URL to the torrent [4], so Meta can easily find them if they want to.
  
  I haven’t looked at the linked content myself yet.
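
For reference, the check described in the quoted comment amounts to two people each hashing their own copy of the files and comparing the digests. A minimal sketch follows, assuming SHA-256 (the thread's actual hash algorithm isn't stated here) and with `sha256_of_file` and `copies_match` as made-up helper names:

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def copies_match(path_a, path_b):
    """True if the two files have the same SHA-256 digest."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

In practice each person would run only `sha256_of_file` on their own copy and post the hex digest. Matching digests show the copies are byte-identical, which rules out a per-recipient mark inside those files; it would not reveal a mark that is the same for every recipient.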