(I tried making inline “typo” reactions but it wasn’t working for some reason)
Plan B is the same as plan A, but without tracking the data that goes into training. In particular, AI companies:
Prevent anyone “swapping out” a widely deployed model for one with different weights or a different prompt.
Run alignment audits on the trained model.
Track all the data that goes into training a model in a tamper-proof way.
Typo: last bullet should be removed?
Plan C is the same as Plan B, but the AI company doesn’t reliably prevent widely deployed models from being swapped out. In particular, AI companies:
Prevent anyone “swapping out” a widely deployed model for one with different weights or a different prompt.
Run alignment audits on the trained model.
Track all the data that goes into training a model in a tamper-proof way.
Typo: first and last bullet should be removed?
These same typos also appear in the text on the Forethought website, newsletter, and EAF post.