ZY comments on Rauno’s Shortform

ZY 18 Jul 2025 23:15 UTC
In my experience, model release papers usually include a section on safety, at least historically in the earlier papers:
- PaLM: https://arxiv.org/pdf/2204.02311
- Llama 2: https://arxiv.org/pdf/2307.09288
- Llama 3: https://arxiv.org/pdf/2407.21783
- GPT-4: https://cdn.openai.com/papers/gpt-4-system-card.pdf
Labs also release more general papers, for example:
- https://arxiv.org/pdf/2202.07646, https://openreview.net/pdf?id=vjel3nWP2a, etc. (Nicholas Carlini has many papers on memorization and extractability)
- https://arxiv.org/pdf/2311.18140
- https://arxiv.org/pdf/2507.02735
- https://arxiv.org/pdf/2404.10989v1
If we see less content in these sections, one possibility is that increased legal regulation makes publication tricky. Imagine an extreme case: a company sincerely intends to report some numbers that do not yet indicate harm, but these preliminary or signal numbers could be "abused" in a legal dispute, perhaps for profit-driven or "lawyers need to win cases" reasons. It then comes down to the time cost of writing the paper plus the time cost of removing sensitive information. Interestingly, the political landscape could also steer companies away from being more safety focused. I do hope there is a better way to resolve this, one that gives companies more incentive to report and share mitigations and measurements.
As far as I know, safety tests are usually used for internal decision-making, at least for decisions such as releases.