Here is a sample output:
a sample of a run between two LLM claims:
Claim A: “Red meat causes cancer” — T=0.8, S=0.7 → W=0.56
Claim B: “Red meat doesn’t cause cancer” — T=0.9, S=0.8 → W=0.72
This would trigger a flag both cannot be true at the same time. They did not do their homework.
Here is a sample output:
a sample of a run between two LLM claims:
Claim A: “Red meat causes cancer” — T=0.8, S=0.7 → W=0.56
Claim B: “Red meat doesn’t cause cancer” — T=0.9, S=0.8 → W=0.72
This would trigger a flag both cannot be true at the same time. They did not do their homework.