I had an idea similar to your “badness” algorithm: it would be interesting to add a truth discriminator to GPT: another neural net that predicts the truth value of GPT’s statements relative to the real world, trained on a database of true statements (several such databases exist). The whole thing would then be trained GAN-style, with GPT learning to produce statements with the highest truth score.
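A minimal toy sketch of the idea, with everything (the statement database, the bag-of-words discriminator, and the reweighting rule for the "generator") made up purely for illustration: a logistic-regression discriminator is trained on labeled statements, and a distribution over candidate statements is then shifted toward the ones the discriminator scores as true. A real version would of course use a neural discriminator and fine-tune GPT itself, e.g. via policy gradients.

```python
# Illustrative toy only: all data and model choices here are hypothetical
# stand-ins for "a truth discriminator trained on a database of true
# statements" plus a GAN-style generator update.
import math
import random

random.seed(0)

# Tiny labeled "database" of statements (1 = true, 0 = false).
DATA = [
    ("paris is in france", 1),
    ("water is wet", 1),
    ("fire is hot", 1),
    ("paris is in spain", 0),
    ("water is dry", 0),
    ("fire is cold", 0),
]

vocab = sorted({w for s, _ in DATA for w in s.split()})

def features(text):
    # Bag-of-words indicator features over the toy vocabulary.
    words = set(text.split())
    return [1.0 if w in words else 0.0 for w in vocab]

# Discriminator: logistic regression trained by gradient descent on log-loss.
w = [0.0] * len(vocab)
b = 0.0

def score(text):
    z = b + sum(wi * xi for wi, xi in zip(w, features(text)))
    return 1.0 / (1.0 + math.exp(-z))  # predicted probability of "true"

for _ in range(500):
    for text, label in DATA:
        p = score(text)
        g = p - label  # gradient of log-loss w.r.t. the logit
        x = features(text)
        for i in range(len(w)):
            w[i] -= 0.1 * g * x[i]
        b -= 0.1 * g

# "Generator": a distribution over candidate statements, nudged toward
# those the discriminator rates as more likely true (the GAN-style step).
candidates = [s for s, _ in DATA]
logits = [0.0] * len(candidates)
for _ in range(100):
    for i, s in enumerate(candidates):
        logits[i] += 0.5 * (score(s) - 0.5)  # reward = truth score

best = max(range(len(candidates)), key=lambda i: logits[i])
print(candidates[best])  # the generator converges on a true statement
```

The generator update here is just a reward-weighted shift over a fixed candidate set; the analogue for an actual GPT would be reinforcement learning against the discriminator's truth score.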
Actually, I think your comment about this a while ago was what got me started on all this. I tried looking for it when I wrote this post but couldn’t find it easily. If you give me the link I’d be happy to credit you in the OP.
:) Don’t remember where I wrote about it.