The scenario seems unrealistic because the thieves would likely be able to steal important parts of the codebase as well. But assuming you literally just get a big list of named tensors, my guess is that it could be anywhere between many person-months and millions of person-years (mostly spent re-inventing the algorithms), depending on how many architectural novelties went into the model. For context, it took a decently sized effort to actually get Gemma working properly, given a paper describing how it works and a faulty implementation.
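For concreteness, here is roughly what that raw artifact amounts to — a minimal sketch assuming the stolen weights were dumped in safetensors format (the file name and tensor names below are invented for illustration, not from any real model):

```python
# Inspect a checkpoint as nothing more than a list of named tensors.
# Assumes a hypothetical file "stolen_checkpoint.safetensors" and that
# torch is installed (framework="pt").
from safetensors import safe_open

with safe_open("stolen_checkpoint.safetensors", framework="pt") as f:
    for name in f.keys():
        tensor = f.get_tensor(name)
        print(name, tuple(tensor.shape), tensor.dtype)

# Output might look like:
#   layers.0.attention.wq.weight (4096, 4096) torch.bfloat16
#   layers.0.mlp.up_proj.weight (11008, 4096) torch.bfloat16
# The names and shapes hint at the architecture, but the code that
# actually wires these tensors together is exactly what's missing.
```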
Thanks for this. So I guess when knowledgeable people talk about stealing a model’s weights as being equivalent to stealing the model itself, “steal the weights” is shorthand that implies also stealing other elements you’d need to replicate the model. [Edit: changed “the minimal” to “other”]