but which part of this is inoculating?
perhaps ‘inoculate’ is the wrong word! i have found that after seeing the effect, i am
less likely to trust llms,
less likely to get excited when talking to llms, and
less interested in asking llms about highly speculative claims.
i believe this is due to a better understanding of how this particular failure mode arises. i compare it with learning the name of a logical fallacy: ideally, this can help identify the mistake in our own thinking.
Thing is… While I have learned the meta-lesson of not assuming I can trust models on topics I know less about, I haven’t personally gained any new insights into how to spot object-level falsehoods from the models more quickly. I would be thankful for any lessons in that regard.
I think the suggestion is that keeping track of how readily current LLMs reinforce cranky beliefs will help you avoid treating that same level of reinforcement as evidence for future beliefs of your own that you may not realise are cranky.