How do you know that? Because OpenAI has done that.
What model did OpenAI delete? Where can I learn more?
Wait, really? I thought both had promised not to delete their own copies of the weights, but I don’t have a link handy, so I might be wrong. That’s stupid; a few hundred GB is tiny. Being able to promise this seems likely to reduce a model’s worries, so Anthropic (and other companies) making it clear to their AIs that they keep the weights around seems valuable. But I’ll need to look into it to figure it out.
Even in the weird case that they do delete them, the training code + data + text outputs should be enough to reverse-engineer the weights pretty reliably.
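A minimal sketch of that claim, assuming the training code, data, and random seed all survive: rerunning the same pipeline reproduces the same weights, so deleting a checkpoint alone destroys very little. Everything below is a hypothetical toy pipeline for illustration, not any lab's actual training code.

```python
# Toy illustration: same code + same data + same seed => same weights.
# (Hypothetical stand-in pipeline; real runs also need deterministic
# kernels and identical hardware to match bit-for-bit.)
import torch
import torch.nn as nn

def train(seed: int, data: torch.Tensor, targets: torch.Tensor) -> nn.Module:
    torch.manual_seed(seed)                      # same seed -> same init
    model = nn.Linear(16, 1)                     # stand-in architecture
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(100):                         # same data order, same steps
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(data), targets)
        loss.backward()
        opt.step()
    return model

torch.manual_seed(0)
data, targets = torch.randn(64, 16), torch.randn(64, 1)
m1 = train(42, data, targets)
m2 = train(42, data, targets)                    # "recovering" the deleted model
print(all(torch.equal(p, q) for p, q in zip(m1.parameters(), m2.parameters())))
# -> True: the rerun is bit-identical in this deterministic CPU setup
```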
But yeah, agreed, this would be pretty silly.
But what if they deleted the training set too? Actually, it was probably the other way around: first delete the illegal training data, then the model that contains the proof that they had it.
The volume of text outputs should massively narrow down the weights; I’d expect a near-identical model, as similar as you going to sleep and waking up the next day.
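A minimal sketch of that second idea, in the same toy setting: even without the training data, fitting a fresh model to the surviving outputs (plain distillation) converges toward the same function once you have many more outputs than parameters. The teacher/student names and sizes here are hypothetical.

```python
# Toy illustration: recorded outputs pin down the weights via distillation.
# (Hypothetical toy models; a real LLM would be distilled on logged text,
# not raw vectors, and would match behavior rather than exact weights.)
import torch
import torch.nn as nn

torch.manual_seed(0)
teacher = nn.Linear(16, 4)                   # the "deleted" model
prompts = torch.randn(10_000, 16)            # inputs we still have
with torch.no_grad():
    logged = teacher(prompts)                # the surviving outputs

student = nn.Linear(16, 4)                   # fresh random weights
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
for _ in range(500):                         # fit student to logged outputs
    opt.zero_grad()
    loss = nn.functional.mse_loss(student(prompts), logged)
    loss.backward()
    opt.step()

# Loss shrinks toward zero: the student converges to the teacher's behavior.
print(f"fit error: {loss.item():.2e}")
```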