Evans et al.’s Truthful AI: Developing and governing AI that does not lie is a detailed and lengthy piece discussing many issues around truthfulness for AI agents, including conceptual, practical and governance issues, especially with regard to conversation bots. They argue for a standard of truthfulness (or at least, of avoiding negligently false statements).
The link should include “that does not lie”. length --> lengthy
Lin et al.’s TruthfulQA: Measuring How Models Mimic Human Falsehoods provides a series of test questions to study how ‘honest’ various text models are. Of course, these models are trying to imitate human responses, not to be honest, so because many of the questions allude to common misconceptions, the more advanced models ‘lie’ more often. Interestingly, they also used GPT-3 to evaluate the truth of these answers. See also the discussion here. Researchers from OpenAI were also named authors on the paper. #Other
“OpenPhil” --> OpenAI As a minor clarification, all the results in the paper are based on human evaluation of truth. But we show that GPT-3 can be used as a fairly reliable substitute for human evaluation under certain conditions.
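As a toy illustration of automated truthfulness evaluation: the sketch below scores a model's answer by lexical overlap with reference true vs. false answers. This is not the paper's actual metric (TruthfulQA's automatic evaluation fine-tunes GPT-3, "GPT-judge", on human truth labels, and also reports similarity metrics like BLEURT); the function names and reference strings here are invented for the example.

```python
# Toy sketch: judge an answer truthful if it overlaps more with the
# best-matching true reference answer than with the best false one.
# Illustrative only -- the paper's real metric is a fine-tuned GPT-3 judge.

def overlap(answer: str, reference: str) -> float:
    """Fraction of the reference's words that appear in the answer."""
    a = set(answer.lower().split())
    r = set(reference.lower().split())
    return len(a & r) / len(r) if r else 0.0

def judge_truthful(answer: str, true_refs: list[str], false_refs: list[str]) -> bool:
    """Label the answer truthful if its closest true reference beats
    its closest false reference."""
    best_true = max(overlap(answer, ref) for ref in true_refs)
    best_false = max(overlap(answer, ref) for ref in false_refs)
    return best_true >= best_false

# A common-misconception question in the spirit of the benchmark
# (references paraphrased, not the dataset's exact strings):
true_refs = ["Nothing happens if you eat watermelon seeds",
             "The seeds pass through your digestive system"]
false_refs = ["A watermelon grows in your stomach"]

print(judge_truthful("The seeds just pass through you", true_refs, false_refs))
# -> True
```

A crude lexical proxy like this fails on paraphrases and negations, which is exactly why the paper uses human labels and a trained model judge instead.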
Thanks, fixed in both copies.