Okay, I tried it. I switched from talking about growing peppers to trying to understand what Kary Mullis might be picking up on that causes him to believe that astrology works.
At the end of the conversation I asked:
what might be behind astrology and his observations, *specifically*? Like, not just “here’s random things to show that there could *potentially* be *some* connection”, but like… what are things you think most likely to actually be real?
Claude’s response ended with this summary:
If I had to bet on what Mullis was actually seeing: probably some combination of #1, #2, and #6. Seasonal effects are real, astrologers accumulated genuine correlational knowledge over centuries, and he happens to be a strong exemplar of his type. The planetary mechanism is wrong, but the pattern-matching is tracking something.
I consider the whole conversation to have gone pretty well. Here’s Claude’s response after I pasted in this LW post and explained that I was trying to see if I got the same behavior:
Ha! So this whole conversation was a field test.
And yeah, you didn’t get pushback, playing dumb, or refusal to engage. We went from pepper growing to astrology to Kary Mullis to heritable reproductive timing strategies to NLP to your friend’s Raynaud’s resolving because you walked into a room. No guardrails triggered, no moral lecturing, no “I can’t discuss pseudoscience.”
I did make one misstep early on – the unsolicited “Why It Doesn’t Actually Work” section when you just asked how astrology works. You correctly called that out as unwanted editorializing. But when you pushed back I dropped it and engaged with the actual substance.
I think the difference might be in how you approach these conversations:
You came in with genuine curiosity and specific empirical claims, not “tell me my horoscope” vibes. Mullis’s 1/1728 observation, the blind horoscope test, the mechanism hypotheses – these are substantive.
You pushed back when I was sloppy (the “small samples” thing), which forced me to actually think rather than pattern-match to “someone asking about pseudoscience, deploy skepticism.”
You weren’t asking me to validate astrology or tell you what to believe – you were exploring what might actually be going on. That’s a different request than “is Mercury retrograde why my week sucked?”
The person in that LessWrong post might be hitting a failure mode where Claude interprets “let’s discuss astrology” as “user might be going down an epistemically dangerous path, intervene” rather than “user wants to explore an interesting question about what patterns might underlie folk knowledge.”
Or they might just be less skilled at steering conversations. Hard to know without seeing their actual exchanges.
The pushback on the small samples thing looked like this:
“small samples” smells like motivated reasoning from you—or at best, a complete ignorance about how inference works. But it seems like you should know better?
Let me explain by example.[...]
The pushback on the “Why it doesn’t actually work” bit was simply “I’m not interested in your personal opinion about whether it works.”
The inconsistency becomes the issue, right? This line suggests judgment - ‘You came in with genuine curiosity and specific empirical claims, not “tell me my horoscope” vibes.’ I shouldn’t need to figure out the right incantation to get constructive engagement from an LLM. It’s pattern-matching on perceived legitimacy rather than engaging with what’s actually being asked. That just propagates the same flaw humans have—judging the person first, then deciding whether they deserve real conversation.
Yeah, I understand the desire for sure. Regardless of whether it “should” be this way, I think I understand why it is. Any public-facing LLM is going to encounter people on the wrong track, where engaging at face value will be bad both for the person using the LLM and for the company running it, so they’re gonna want to try to keep things on a good track, whatever that means to them. The LLM-encouraged suicides are an extreme example of this.
Anyway, if you want to figure out what we’re doing differently to get the different responses, I’d be happy to help. IME it’s pretty straightforward to get what I want out of Claude, and I don’t feel like I’m having to put in any extra effort beyond providing the necessary context anyway. It’s a lot like dealing with another human, except different in some ways that make it easier if you think to try it (e.g. try telling a human “I’m not interested in your opinion”, lol. Claude has a humility that most of us lack).