I think the idea of manipulating internal activations is interesting. It might need some refinement: the activations of a transformer (note GPT-3 is decoder-only, not encoder-decoder) are a function of the input, and they change with every generated token. At first the input is just your prompt; then it's your prompt plus the tokens generated so far. So the protocol/task for GPT-3 would be something like: generate 5 tokens such that, at the last generation step, this particular logit is maximized? It also depends on the decoding hyperparameters (e.g. beam search settings), which are controlled by a human.
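To make the moving-target point concrete, here is a minimal toy sketch of that protocol, assuming greedy decoding and a hypothetical stand-in model (not GPT-3): logits are a function of the whole sequence so far, so they shift each time a generated token is appended, and we only check the target logit at the final step.

```python
# Toy sketch of the proposed protocol: generate N tokens autoregressively,
# then check whether a chosen logit is maximized at the final step.
# `toy_logits` is a hypothetical stand-in model, not GPT-3.

VOCAB_SIZE = 8
TARGET_TOKEN = 3  # the logit we want maximized after the last generation step

def toy_logits(tokens):
    """Hypothetical model: logits depend on the *whole* input so far,
    so they change every time a generated token is appended."""
    logits = [0.0] * VOCAB_SIZE
    for pos, tok in enumerate(tokens):
        logits[(tok + pos) % VOCAB_SIZE] += 1.0
    return logits

def generate(prompt, n_steps=5):
    tokens = list(prompt)
    final_logits = None
    for _ in range(n_steps):
        final_logits = toy_logits(tokens)  # input = prompt + generated so far
        # greedy decoding; beam search would add its own hyperparameters here
        tokens.append(max(range(VOCAB_SIZE), key=final_logits.__getitem__))
    return tokens, final_logits

tokens, final_logits = generate(prompt=[1, 2], n_steps=5)
success = max(range(VOCAB_SIZE), key=final_logits.__getitem__) == TARGET_TOKEN
print(tokens, success)
```

With greedy decoding the rollout is deterministic, but switching to beam search (beam width, length penalty, etc.) changes which sequences are explored and therefore whether the target logit ends up maximized, which is the human-controlled dependence mentioned above.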