If the NN's outputs deviate from a target value, its weights are modified. If the weights are (sufficiently) modified, future inference will be different. Its behavior will be different.
This trains the NN away from some behaviors and toward others.
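As a toy illustration of the point (a hypothetical, pure-Python sketch, not any actual LLM training loop): a single weight is nudged whenever the output deviates from a target, and future inference changes as a result.

```python
# Hypothetical minimal sketch: one "network" with one weight.
# A deviation from the target modifies the weight; behavior changes.

def predict(w, x):
    return w * x  # the whole "network": one weight, one input

def train_step(w, x, target, lr=0.1):
    error = predict(w, x) - target   # deviation from the target value
    grad = 2 * error * x             # gradient of the squared error
    return w - lr * grad             # the (slightly) modified weight

w = 0.0
for _ in range(50):
    w = train_step(w, x=1.0, target=3.0)
# after training, predict(w, 1.0) is close to 3.0
```

Nothing in this loop looks like reward or punishment from the inside; it is just repeated small adjustments toward lower error.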
> Why on earth would you relate this to torture though, rather than to (say) the everyday experience of looking at a thing and realizing that it’s different from what you expected? The ordinary activity of learning?
>
> Out of all the billions of possible kinds of experience that could happen to a mind, and change that mind, you chose “torture” as an analogy for LLM training.
>
> And I’m saying, no, it’s less like torture than it is like ten thousand everyday things.
Compare with evolution: make copies (reproduction), mutate, select the best performers, repeat. This merely allocates more resources to the most promising branches.
Or with Solomonoff-style induction: just try to find the best data-compressor among all...
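A rough analogy in runnable form (actual Solomonoff induction is uncomputable; this just picks the best of a few off-the-shelf compressors, which is the selection dynamic being pointed at):

```python
import bz2
import lzma
import zlib

# Rough analogy only: among available compressors, keep whichever
# gives the shortest description of the data seen so far.

data = b"abcd" * 256  # highly regular data, 1024 bytes

candidates = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}
sizes = {name: fn(data) for name, fn in candidates.items()}
sizes = {name: len(blob) for name, blob in sizes.items()}
best = min(sizes, key=sizes.get)  # the best data-compressor among these
```

"Training" in this frame is just re-running the comparison as data accumulates; the winner changes, and no suffering is anywhere in the description.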
> the everyday experience of looking at a thing and realizing that it’s different from what you expected
This sounds like being surprised. Surprise adds emotional weight to outliers; it's more like managing the training dataset.
Why torture? Why only negative feedback?