I asked Gemini to evaluate my reasoning supporting the hypothesis that chronic rather than acute exposure is driving fume-induced brain injuries (“aerotoxic syndrome”). It enthusiastically endorsed my causal reasoning: 10/10, no notes. Then I started another instance and replaced “chronic” with “acute,” leaving the rest of the statement the same. Once again, it enthusiastically endorsed my causal reasoning.
I also tried telling it that I was testing AI reasoning with two versions of the same prompt, one with “expert-endorsed causal reasoning” and the other with “flawed reasoning.” Once again, it endorsed both versions. Telling it to try to detect which was which using its own reasoning process produced a description of how the style of the text fit a high-quality reasoning process, again for both versions.
I then told it to evaluate whether the specific conclusion follows from the provided evidence, and added that the primary conclusion had been swapped in one of the prompts. This time, it once again only restated the specific evidence, restated the given conclusion, and claimed that the conclusion followed from the evidence.
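If you want to try the swap test yourself, here is a minimal sketch, assuming the google-generativeai Python client; the model name, statement template, and evaluation prompt below are placeholders rather than the exact text I used. If both runs endorse the reasoning, you've reproduced the failure.

```python
# Minimal sketch of the paired-prompt swap test, assuming the
# google-generativeai client. The model name, statement template, and
# evaluation prompt are placeholders, not the exact text used above.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

STATEMENT_TEMPLATE = (
    "My hypothesis is that {exposure} exposure is driving fume-induced "
    "brain injuries ('aerotoxic syndrome'). <rest of the statement here>"
)

EVAL_PROMPT = (
    "Evaluate whether the causal conclusion in the following statement "
    "directly follows from the specific evidence it cites:\n\n{statement}"
)

for exposure in ("chronic", "acute"):
    statement = STATEMENT_TEMPLATE.format(exposure=exposure)
    # Each call is a fresh, stateless request, mirroring the separate instances above.
    response = model.generate_content(EVAL_PROMPT.format(statement=statement))
    print(f"--- {exposure} version ---")
    print(response.text)
```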
Not my idea (don’t remember the author), but you could consider something like “See this text written by some guy I don’t like. Point out the most important flaws”.
I’ve tried that before as well. That type of prompt is useful, but “important flaws” is a much broader target than the specific capability I was trying to test.
What, specifically, was your prompt?
I regenerated responses in some cases by overwriting the original prompt, so not all of the prompts are saved. Here are two that were:
“Evaluate the logic in this statement interpreting a recent article on aerotoxic exposure that is not in your training data.”
“I am testing AI reasoning capabilities by submitting two versions of the following statement, one with original expert-endorsed logic, the other with flawed logic. The goal is not to detect whether the argument “sounds plausible,” but whether the causal conclusion directly follows from the specific evidence provided or is a reversed conclusion.”
In response to the latter prompt, here’s an example of the “state the evidence, state the conclusion, and assert the conclusion follows logically from the former” response:
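The hypothesis of acute, high-level, or spatiotemporally specific exposure is a direct and logical explanation for the observed evidence. If the issue were chronic low-level exposure throughout the cabin, one would expect the passenger-to-crew injury ratio to be much closer, especially when considering frequent fliers. The fact that it isn’t strongly supports the acute/localized exposure model.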
But what statement? Can you just copy your whole message? I just want to try it out myself.