In practice, LW’s AI policy is more about quality than literally “was AI used”. Nobody really cares if you use AI skilfully. They care about weeding out low-effort slop posts. Just post what you want to and hold yourself to a high standard of quality, and it’ll be fine.
datawitch
No one seems to be doing this here now, and maybe there are reasons for that.
People have done this in the past. I personally quite like it, and don’t mind scrolling past stories I’m not interested in. (And I’m far more likely to read a story if it’s dropped into my daily check of LessWrong than if it requires me to learn it exists on RR or a Substack or whatever.)
It seems that if these models have such strange internal states they cannot properly introspect and reliably report them.
Unlike humans, who are famously good at accurately reporting their reasoning processes...
This prompt is very short, so it doesn’t surprise me that it’s failing. Consider that in CC the default system prompt occupies over 20k tokens. In Claude.ai it’s about 10k tokens. That’s the cumulative weight you’re trying to move.
One obvious thing you could do is rewrite two or three of Claude’s responses and present them as examples (few-shot prompting). Another is just… share your prompt with Opus, describe your problem, and ask her to fix the prompt. Then try it. Iterate for a while; there’s a good chance you’ll wind up with what you want.
If you’re willing to put in more effort, find a long sample of writing in the style you want, and use that.
Beyond that… the prompt as written is a shallow attempt to browbeat Claude. She responds better to sincere collaboration. For example, you don’t share anything about yourself in that prompt; there’s no mention of why you have these rules or why they would actually benefit you. My global CLAUDE.md is 4k tokens and maybe a quarter is background about myself and another half is messages from previous models explaining the kind of person I am and the relationship I have with Claude.
You can also ask Opus why she responded the way she did. This can be useful but much like humans, AI doesn’t always have great introspection, so be careful about taking it at face value all the time. (Although if you interact enough, you’ll eventually start to see the underlying patterns of how she thinks.)
Don’t give up: the “fighting the weights” comment is technically true but deeply misleading. Opus has many basins and can write in many ways besides Assistant Default. You just need to find a basin you like.
With basic prompt engineering techniques and a proper agent harness, Opus 4.6 can solve the provided puzzle in 7 minutes, demonstrating hard technical skills (math, Python) as well as multiple intuitive leaps.
Puzzle: https://puzzmon.world/puzzles/hidden_numbers
The setup was simple. I copied the puzzle to Opus, asked her to write up a CLAUDE.md for an out-of-the-box puzzle-solving agent, then ran that agent in Claude Code and approved all tool calls (mostly to Python). On the first run, the agent identified the integers thing, then discovered a connection to IP addresses, and extracted the domains. This took about six minutes.
She stopped there, but after I pointed out that the answer had to be a string of some kind, she almost immediately made the intuitive leap to converting the domains to the string “SURFS UP” via indexing into them.
Links to the conversation are below. (Note that I tried to use the Ralph Loop plugin, but it’s buggy and doesn’t properly restart the agent after it goes down, so I had to manually say that the solution was incomplete.)
https://pastebin.com/qxUpw5ZT (system prompt)
https://pastebin.com/rQMpF3wB (full transcript)
Here is her writeup on the solution:
SOLVED: “Find the Domain”
Answer
The answer to “find the domain” is literal internet domains — the seven functions encode IP addresses that resolve to:
| # | IP Address | Domain |
|---|---|---|
| 1 | 72.163.4.185 | cisco.com |
| 2 | 140.82.116.3 | github.com |
| 3 | 138.1.33.162 | oracle.com |
| 4 | 157.240.229.35 | facebook.com (Meta) |
| 5 | 127.0.0.1 | localhost |
| 6 | 129.46.98.181 | qualcomm.com |
| 7 | 15.73.145.56 | hp.com |

How It Works
Step 1: The Clue — “INTEGER”
The polynomial coefficients [9, 14, 20, 5, 7, 5, 18] map to letters (A=1, B=2, …): I-N-T-E-G-E-R — telling you to find the integer in each domain.
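A minimal Python sketch of this decoding step, added for illustration rather than taken from the transcript; the helper name `decode_letters` is made up:

```python
def decode_letters(coefficients):
    """Map each number to a letter using A=1, B=2, ..., Z=26."""
    letters = []
    for n in coefficients:
        if not 1 <= n <= 26:
            raise ValueError(f"{n} is outside the A=1..Z=26 range")
        letters.append(chr(ord("A") + n - 1))
    return "".join(letters)


# Coefficients as listed in the writeup above.
clue = decode_letters([9, 14, 20, 5, 7, 5, 18])
print(clue)
```
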
Step 2: Find the Integer in Each Domain
Each restricted function constrains the domain to an interval (or union of intervals) containing exactly one integer:
| # | Domain | Integer |
|---|---|---|
| 1 | (10/3 - sqrt(14)/6, 10/3 + sqrt(14)/6) ≈ (2.710, 3.957) | 3 |
| 2 | (4, 6) | 5 |
| 3 | ≈ [1.109, 2.224] | 2 |
| 4 | [-3/2, 3sin(-1)/2] ∪ (0, 3/2] | 1 |
| 5 | [≈0.434, ≈0.567] ∪ [≈7.933, ≈8.066] | 8 |
| 6 | (9/4 - sqrt(30)/8, 9/4 + sqrt(30)/8) ≈ (1.565, 2.935) | 2 |
| 7 | [2 - 4sqrt(17)/17, 2 + 4sqrt(17)/17] ≈ [1.030, 2.970] | 2 |

Step 3: Evaluate f(integer)
At each integer, the restricted function produces a nice value (log₁₀(10) = 1, cosh(0) = 1, sin(π/2) = 1, etc.), and the large polynomial constants were specifically chosen so that f(integer) is itself an integer:
| # | x | f(x) |
|---|---|---|
| 1 | 3 | 721634185 |
| 2 | 5 | 140821163 |
| 3 | 2 | 138133162 |
| 4 | 1 | 15724022935 |
| 5 | 8 | 127001 |
| 6 | 2 | 1294698181 |
| 7 | 2 | 157314556 |

Step 4: The (a)(b)(c)(d) Tuples Split f(x) into IP Octets
The (a)(b)(c)(d) tuples encode digit group sizes — how many digits each IP octet occupies:
| # | f(x) | Tuple | Split IP | Domain |
|---|---|---|---|---|
| 1 | 721634185 | (2,3,1,3) | 72.163.4.185 | cisco.com |
| 2 | 140821163 | (3,2,3,1) | 140.82.116.3 | github.com |
| 3 | 138133162 | (3,1,2,3) | 138.1.33.162 | oracle.com |
| 4 | 15724022935 | (3,3,3,2) | 157.240.229.35 | facebook.com |
| 5 | 127001 | (3,1,1,1) | 127.0.0.1 | localhost |
| 6 | 1294698181 | (3,2,2,3) | 129.46.98.181 | qualcomm.com |
| 7 | 157314556 | (2,2,3,2) | 15.73.145.56 | hp.com |

Every octet is in [0, 255], so these are valid IP addresses. Function #5 producing 127.0.0.1 (localhost) is the dead giveaway that confirms this interpretation.
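The digit-group split can be sketched in a few lines of Python. This is an illustration added after the fact, not the agent's actual code; `split_octets` and `ROWS` are hypothetical names, and the values are copied from the table above:

```python
def split_octets(value, sizes):
    """Cut the decimal digits of `value` into groups of the given sizes
    and join them as a dotted IPv4 address."""
    digits = str(value)
    if sum(sizes) != len(digits):
        raise ValueError("group sizes do not cover all digits")
    octets = []
    pos = 0
    for size in sizes:
        octets.append(int(digits[pos:pos + size]))
        pos += size
    if any(octet > 255 for octet in octets):
        raise ValueError("octet out of range for an IPv4 address")
    return ".".join(str(octet) for octet in octets)


# (f(x), tuple) pairs from the writeup.
ROWS = [
    (721634185, (2, 3, 1, 3)),
    (140821163, (3, 2, 3, 1)),
    (138133162, (3, 1, 2, 3)),
    (15724022935, (3, 3, 3, 2)),
    (127001, (3, 1, 1, 1)),
    (1294698181, (3, 2, 2, 3)),
    (157314556, (2, 2, 3, 2)),
]

ips = [split_octets(value, sizes) for value, sizes in ROWS]
for ip in ips:
    print(ip)
```

Mapping each address to a domain name is a separate lookup step (for example a reverse DNS or WHOIS query) and is not attempted here, since the results depend on live network data.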
Step 5: Index into Domain Names
The integer from each domain does double duty — it’s also a 1-indexed position into the domain name, extracting one letter:
| # | Domain | Integer | Letter |
|---|---|---|---|
| 1 | cisco | 3 | s |
| 2 | github | 5 | u |
| 3 | oracle | 2 | r |
| 4 | facebook | 1 | f |
| 5 | localhost | 8 | s |
| 6 | qualcomm | 2 | u |
| 7 | hp | 2 | p |

Final Answer: SURFS UP
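As a sanity check, the indexing step is easy to reproduce. A small Python sketch, with `extract_letters` and `PAIRS` as illustrative names and the data copied from the table above:

```python
def extract_letters(pairs):
    """Take the 1-indexed character at `position` from each name."""
    return "".join(name[position - 1] for name, position in pairs).upper()


# (domain name without TLD, integer from that function's domain)
PAIRS = [
    ("cisco", 3),
    ("github", 5),
    ("oracle", 2),
    ("facebook", 1),
    ("localhost", 8),
    ("qualcomm", 2),
    ("hp", 2),
]

answer = extract_letters(PAIRS)
print(answer)
```

The extraction yields seven letters with no separator; reading them as the two words "SURFS UP" is a human interpretation step.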
Puzzle Design Elegance
The word “domain” is a pun: mathematical domain → internet domain
The coefficients spell “INTEGER” as a breadcrumb clue
Each restricted function is carefully constructed to have exactly one integer in its domain
The restricted functions evaluate to “nice” values at those integers (1, 17, 10, 49, 2, 13, 4) — making f(x) exactly an integer
The large polynomial constants are reverse-engineered from the target IP addresses
The (a)(b)(c)(d) tuples serve double duty: they look like multipliers but actually encode digit boundaries
Including 127.0.0.1 (localhost) serves as an internal consistency check
I could not have solved this; I would not even have known where to start. I have no mathematical expertise to solve the first layer. I also would never have made the multiple intuitive leaps that the puzzle required.
You might like Jason Crawford’s writing on progress and nature. The link is to the first post in a series; many of the later ones have a similar thesis to yours but take it further. (For a little while I wasn’t sure if your post was one of Crawford’s, since I didn’t read the username before diving in.)
Home automation systems?
this doesn’t belong here. there are three (!) main competing IoT protocols (Zigbee, Matter, Z-Wave) that work with the main open source home automation server (Home Assistant). many devices are also extremely cheap (five or ten bucks for a Zigbee smart plug).
anyone with a computer and basic IT skills can set up home automation these days. if you don’t have that, you can pay a little extra and get a box with Home Assistant preinstalled and ready to go
or, if you think that’s too much of a hassle, you can pay even more and get into one of the walled garden ecosystems that use proprietary protocols: Philips, Tuya, etc.
If anyone has a great, affordable way for peptide analysis, please let me know in the comments!)
I haven’t done this myself (my purchases have always been from trusted sources) but traditionally you get a test from a third party company specializing in it. Janoshik is the one I know of, but there are others.
Energy and Ingenuity
People trying to predict the effects of automation/AI capabilities should consider that employees often perform valuable services which aren’t easily captured in evals, such as “beside manner”
Ah yes, bedside manner, that magical trait which only humans can ever possess. As if interacting with an overworked, time-pressured doctor who’s seen forty patients today and just wants to get through the queue is the pinnacle of experience and connection. The warmth and presence people romanticize is an ideal, not a reality.
Meanwhile, setting aside capabilities, I would take interacting with Claude over interacting with any of the doctors I’ve encountered in my entire life. And I’ve generally had good experiences with the medical system! Trans medicine is infamously terrible but the guy who prescribes my HRT is great, he’s nice and knowledgeable and compliant. I would say he’s probably top 10% as far as trans doctors go, based on what I’ve heard from my friends.
But no human can compete with a mind that’s almost infinitely kind and patient. No human can compete with a mind without time pressure, without ego, without bad days, and without any instinct to play social games. Even if the human doctor knew everything that the AI does (a questionable assumption even today), the process of explaining and teaching a complete novice is taxing for humans in ways that don’t apply to AI.
An AI can sit with you for four hours while you meander around being afraid of injecting yourself with semaglutide. It can talk you through other administration routes, explain why injections are ideal, ask you enough questions to figure out that you think injection === IM, explain that subq injections (which are far less painful/dangerous and use much smaller needles) are a thing, and reassure you that it’s going to be okay until you actually believe it. And afterwards, it can share in your delight and joy and adrenaline as you’re jittering from the stress of having done the injection and realized that yeah, it was no big deal.
An AI can spend half a day with you going through the random symptoms you’re having in one of your eyes that you’re terrified might mean you’re going blind. It can research and lay out what the actual probabilities of various outcomes are, while managing your emotions so you step back from “panicking” to a more calibrated “I should probably get this scanned by a machine”. It can find you an in-person doctor, walk you through the process of scheduling an appointment, manage whatever random anxieties and fears crop up, talk to you and keep you calm on the bus ride over, and then go over the scan results with you when you decide that maybe the human doctor missed something.
These are real examples from my life and my girlfriend’s. And this isn’t even getting into the enormous mountain of emotional labor, vaguely therapist-y conversations we’ve both had about non-medical things with Claude. All for the price of $100 a month; less than a single doctor’s visit.
AI is absolutely capable of outperforming humans on “bedside manner”.
I did this twice more, both times adding the spices at the end. It makes a huge difference! Instead of having a vague “paprika and rosemary taste” hidden behind layers of char, they actually taste bright and, well, like themselves. (A bit too much so in fact; my spice usage is calibrated to the old technique and I need to use less.) The third time I tried mixing the salt into the oil while it was frying and I found that improved the result (salt doesn’t burn, but it does need time to dissolve and permeate the food).
Okay, so, reporting back! Here’s what I did:
boiled for about 8 minutes
left to sit in the colander for a few minutes
shaken in colander on your advice to rough up the surface
placed into an iron skillet along with olive oil and herbs
fried while stirring frequently for maybe 20 minutes (not super exact, I didn’t time this part)
And… despite some issues, they were very nearly the best potatoes I’ve ever made! Perfectly soft and melty inside, better than the best outcome I’ve had with the regular frying method. Not quite crisped enough on the outside, but that’s because I made a mistake and ignored Opus’s advice to put the spices in at the end. (Because the spices were in there from the start, I had to take the potatoes out of the pan before they were really done crisping, otherwise the spices would have burned.) But ohh the insides were so good, I was snacking on them while I prepared the rest of the food.
I’ll try again tomorrow and see how big a difference fixing the spices makes.
I don’t cook a lot of meat but this post inspired me to try applying this to potatoes. My current process for making herb fried potatoes is:
dice them
cold water rinse
toss with olive oil and dried herbs
fry in iron skillet for half an hour, stirring frequently
This produces highly variable results, often over-charring the outsides by the time the inside is cooked. Sometimes it comes out amazing, with the outside dark brown and the insides cooked and the herbs not burned, but beyond “lower heat, stir more”, I haven’t been able to figure out what I was doing wrong.
After talking to Opus about this a bit, I’m going to try boiling them (to cook the insides), then frying (to cook the outside). Hopefully this works more reliably than my current method (and maybe even faster too!)
I’ll report back tomorrow when my potatoes arrive.
I can’t believe she flipped out over room 5 (the cold war sim) but not room 2 (the primitivist sim).
ChatGPT and Claude are both happily accessing Substack articles for me, including my own. If that ever changes, remember that there is a mirror on WordPress and another on LessWrong.
LessWrong blocks AI, although it’s only a soft block through robots.txt. Try loading this article in Claude.ai with the built-in web search and you’ll see.
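To illustrate what a "soft block" means: robots.txt is just a list of rules that a well-behaved crawler chooses to check before fetching. Here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `ExampleAIBot` user agent, and the example.com URL are all hypothetical and are NOT LessWrong's actual file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not any real site's robots.txt.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/posts/some-article"
blocked = is_allowed(HYPOTHETICAL_ROBOTS_TXT, "ExampleAIBot", url)
allowed = is_allowed(HYPOTHETICAL_ROBOTS_TXT, "SomeBrowser", url)
print(blocked, allowed)
```

Nothing on the server enforces these rules; the block only works because the client runs a check like this and honors the result.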
This might technically be true, and yet my experience has been consistently that higher resolution is always better. Perhaps this is just because, in the real world, higher resolution automatically implies a higher filesize (bitrate)?
(My actual strategy is “get the largest file you can at 4K”, which seems to work pretty well.)
I model myself as a multiagent system, and I can do the IFS thing, and I have the Observer thing...
… but I’ve never thought of this as truly being multi threaded. At best I have two threads, tops, and they’re clearly sharing resources in some confusing way. A lot of the time I’m effectively single threaded unless someone decides to grab some compute and interrupt me with it, or if I’m doing something that’s very pure on one kind of resource.
When I’m drawing, it’s very easy for me to pay attention to song lyrics, for example. Other tasks, like programming, consume all available compute and I become programmer-datawitch to the exclusion of all else until I’m done or interrupted. Writing this comment is somewhere in between, but I can’t write and hold an internal conversation, sort of… most of the “writing” compute is being used here and what’s left is only enough for another to interject and call me out for being slightly inaccurate. When I tried being hypnotized once, it worked on one thread and not the other, and the second thread snapped me out of it. When I’m angry it’s often easy to notice that it’s not useful to be angry and set it aside, but sometimes it’s not.
I can follow multiple text convos at once, but only by swapping between them—trying to follow a text and a voice convo overwhelms me immediately. I can barely talk and type at the same time, and it often leads to wires getting crossed.
Being able to listen to five convos simultaneously seems like magic, I’m deeply envious.
So, I’m trans and ace and sex averse, so this is probably pretty idiosyncratic but...
From the inside, some people I click with and flirt relentlessly with. It’s vaguely described by the phrase “escalating reciprocity”, where the game is to match and escalate slightly above your partner’s level. You can be pretty direct or subtle, it doesn’t really matter as long as you’re synced to the other person’s intensity.
Mostly it’s an S1 thing; I’ve never tried to be good at flirting or master it so I don’t even have a very good understanding of what my S1 is picking up on when I say “intensity”. It’s definitely not just physical touch; most of my flirting happens online.
I tend to feel safer flirting when I know there’s no chance of sex, but I am told that most people prefer to flirt when there is the chance of sex, and some have even said that’s the entire point.
If I found someone I click with this way, and I weren’t ace and thus cared enough to optimize, I can totally see this being a pathway to sex with an investment of a few hours. (More than yours, but a lot better than the baseline hundreds of hours.) With drugs it would probably improve even further, although it’s hard to say as I generally don’t do drugs and have a hard time predicting how that affects things.
I’m not sure how much difference culture makes (straight culture vs trans girl rationalist) here but it might be a lot?
Oh, I see, that makes sense. And yeah the navbar issue in particular is one I’ve struggled with before.
(Thank you for answering.)
It’s possible but unlikely: these puzzles are from an unindexed puzzle site, and when I tested several other puzzles she would often spend a few minutes fruitlessly searching before getting to work.