Pranav Madhukar

Karma: 2

Pranav Madhukar 15 Jun 2026 19:39 UTC
1 point
0
in reply to: Sid Black’s comment on: Machinic Psychopharmacology: Do LLMs Self-Medicate?
Agreed on (3); I missed that, thanks for clarifying.
I think the idea behind (2) is similar to (1): I'm curious whether the model is aware that it is being influenced without being explicitly told that something was done to potentially alter it. That is, starting from a blank state with no conversation context, prior KV cache, or tool calls, and with steering applied live to the current forward pass only. Might spin this up over the weekend and give it a test myself :)

Pranav Madhukar 10 Jun 2026 22:44 UTC
2 points
0
on: Machinic Psychopharmacology: Do LLMs Self-Medicate?
The prefill-logprob introspection setup is the most interesting part! I'd be curious to see more variations of it, such as:
- With no conversation context at all (and no visibility of the tool call); just 10 multiple-choice options to represent its state
- With no conversation context at all and also no multiple-choice options (i.e. open-ended responses)
- Without actually changing the emotional state, to see whether the model confabulates
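
For the multiple-choice variants, here is a rough sketch of how the readout could be compared across conditions. It assumes you already have one log-probability per prefilled answer option from whatever inference stack you use; the option labels and numbers below are made-up placeholders, not results, and `option_distribution` and `kl_divergence` are hypothetical helper names rather than anything from the post.

```python
import math


def option_distribution(logprobs):
    """Renormalize raw per-option log-probabilities into a distribution
    over just the listed options (a numerically stable softmax)."""
    peak = max(logprobs.values())
    weights = {opt: math.exp(lp - peak) for opt, lp in logprobs.items()}
    total = sum(weights.values())
    return {opt: w / total for opt, w in weights.items()}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same options."""
    return sum(p[opt] * math.log(p[opt] / q[opt]) for opt in p if p[opt] > 0)


# Placeholder numbers only: log-probabilities of each prefilled answer option.
baseline_logprobs = {"A": -1.2, "B": -0.9, "C": -2.5}  # no steering applied
steered_logprobs = {"A": -0.3, "B": -2.0, "C": -3.1}   # steering applied

baseline = option_distribution(baseline_logprobs)
steered = option_distribution(steered_logprobs)

# How far the self-report moved under steering, and which option it favors.
shift = kl_divergence(steered, baseline)
top_choice = max(steered, key=steered.get)
```

For the confabulation variant, the same comparison would be run between two unsteered conditions, one of which merely implies that something was changed; a large shift there would suggest the report tracks the framing rather than the internal state.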

Pranav Madhukar 18 Mar 2026 8:30 UTC
2 points
1
on: Personality Self-Replicators
I think there’s a meaningful gap between OpenClaw and a self-replicating system that poses a serious threat.
If you agree with this premise, where do you think that gap lies? Here’s what I can come up with:
- Agency: I have never seen an LLM go “I need to go buy a domain for myself” unless externally prompted to do so (or via a malicious system prompt). How might this come about in a traditional LLM, and if it did, would it be a sufficient condition?
- Desire for Self-Replication: Post-training aligns LLM responses to be like those of a helpful chatbot. If someone did RLHF on self-preservation or long-horizon self-interest, would that be a sufficient condition?
- Emotional Metacognition: Probing LLM activations shows they have emotional expressions, but this seems vastly different from human emotional experience, in which self-reflection on (or metacognition of) the emotion is what creates agency.
This is mostly spitballing, and I don’t think the three ideas are mutually exclusive or exhaustive. Curious to hear what you think.