There’s a justifiable model for preferring “truthiness” / vibes to analytical arguments, in certain cases. This must be frustrating to those who make bold claims (doubly so for the very few whose bold claims are actually true!)
Suppose Sophie makes the case that pigs fly in a dense 1,000-page tome. Suppose each page contains 5 arguments that refer to some or all of the preceding pages. Sophie says I'm welcome to read the entire book, or, if I'd like, I can sample, say, 10 pages (10 × 5 = 50 arguments) and reassure myself that they're solid. Suppose the book does in fact contain a lone wrong argument, a bit flip somewhere, that leads to the wrong result, even though the book is mostly (99.9%) correct.
If I tell Sophie that I think her answer sounds wrong, she might say: “but here’s the entire argument, please go ahead and show me where any of it is incorrect!”
Since I’m very unlikely to catch the error at a glance, and unlikely to want to spend the time to read and grok the whole thing, I’m going to just say: sorry, but the vibes are off. Your conclusion seems too far off my prior, so I’m going to assume you made a difficult-to-catch mistake somewhere, and I’m not going to bother finding it.
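To put a rough number on that, using only the figures from the hypothetical above: with a single flawed argument hidden among 5,000, a 10-page spot check has about a 1% chance of even landing on the flawed page, before accounting for whether I’d notice the flaw if I saw it.

```python
# Back-of-the-envelope odds for the spot check, using the numbers from the
# hypothetical above: 1,000 pages, 5 arguments per page, exactly one flawed
# argument, 10 pages sampled at random.
pages = 1_000
args_per_page = 5
total_args = pages * args_per_page      # 5,000 arguments in total
sampled_pages = 10

# Probability that the single flawed page ends up in my sample at all.
p_flaw_in_sample = sampled_pages / pages
print(f"P(sample contains the flawed page) = {p_flaw_in_sample:.1%}")  # 1.0%
```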
This is reasonable on my part, since I’m forced to time-ration, but it must be very frustrating for Sophie, particularly if she genuinely believes she’s right (as opposed to being purposefully deceptive).
There’s also the possibility of a tragedy of the commons here, whereby spending my time on this is selfishly not in my best interest but has positive externalities.
This seems to be the Obfuscated Arguments Problem.
TL;DR: a solution for medium-sized LLMs already exists.
The situation would be significantly better if we could provide succinct arguments that the fact is true. The pig claim could possibly be supported by a video, though that’s not much Bayesian evidence given the possibility of forgery; still, it might constrain what the arguments can say about the pig, more so if some tests are run, like adding weights and so on. Many claims are harder to support with direct evidence.
The next best thing is a scalable, transparent argument such that your interlocutor would accept each derivation step, and finally the claim itself. For humans, this still requires communicating all the derivation steps, because your peer cannot be sure whether you actually ran a simulation of them accepting each argument.
But things change for AI models: you can have a scalable, transparent argument of knowledge that, after reading your book, the model would output “I believe that <your fact>, having updated to credence <P>”, which you could then use in future interactions. It does require feeding the book to the model once, sure, but the key feature is that you can prove its output, including to a human; then a large class of hypotheses disappears, leaving essentially “the LLM could not evaluate the arguments correctly” vs. “all the arguments are correct”.
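To make the shape of this concrete, here is a minimal sketch in Python of how such a protocol could be wired together. Everything in it is assumed for illustration: run_model stands in for deterministic LLM inference over the book, and prove / verify are placeholders for a real STARK prover and verifier over the inference trace (here they just hash a transcript, which gives none of the actual soundness guarantees).

```python
# Minimal sketch of the protocol shape, assuming a fixed, committed model.
# NOT a real STARK system: prove/verify below are hash-based placeholders.
import hashlib
import json
from dataclasses import dataclass, asdict

def commit(data: bytes) -> str:
    """Binding commitment to some bytes (here simply a SHA-256 hash)."""
    return hashlib.sha256(data).hexdigest()

@dataclass
class Claim:
    model_commitment: str  # commitment to the fixed model weights
    book_commitment: str   # commitment to the book (could stay private under ZK)
    output: str            # the statement the model is claimed to produce

def run_model(weights: bytes, book: bytes) -> str:
    """Placeholder for deterministic LLM inference over the entire book."""
    return "I believe that <your fact>, having updated to credence <P>"

def prove(weights: bytes, book: bytes) -> tuple[Claim, str]:
    """Prover: run the model once over the book, then attest to the result.
    In the real construction this step would emit a STARK proof of the
    inference trace; here the 'proof' is just a hash of the claim."""
    output = run_model(weights, book)
    claim = Claim(commit(weights), commit(book), output)
    proof = commit(json.dumps(asdict(claim), sort_keys=True).encode())
    return claim, proof

def verify(claim: Claim, proof: str) -> bool:
    """Verifier: check the short proof against the claim alone, without
    re-running the model over the book."""
    return proof == commit(json.dumps(asdict(claim), sort_keys=True).encode())

if __name__ == "__main__":
    claim, proof = prove(weights=b"model-v1", book=b"the 1,000-page tome")
    assert verify(claim, proof)
    print(claim.output)
```

The point of the sketch is only the division of labor: the prover pays to run the model once over the whole book, while the verifier checks a short proof against the commitments and the claimed output; with a real prover, a dishonest Sophie could not produce an accepting proof for an output the committed model would not actually have given.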
You can, optionally, even have private sections in the book, and reveal zero knowledge about their contents to verifiers other than the LLM! As for soundness, it is believed to be computationally intractable to construct a false ZK-STARK (not that constructing one for an honest computation is fast, but still).