It costs me $20 / month, because I am unsophisticated and use the web UI. If I were to use the API, it would cost me something on the order of $0.20 / message for Sonnet and $0.90 / message for Opus without prompt caching, and about a quarter of that once prompt caching is accounted for.
I mean the baseline system prompt is already 1,700 words long and the tool definitions are an additional 16,000 words. My extra 25,000 words of user instructions add some startup time, but the time to first token is only 5 seconds or thereabouts.
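For concreteness, here’s a back-of-envelope sketch of those per-message figures. The tokens-per-word ratio, the assumed reply length, and the per-million-token prices are all my assumptions, not quoted from Anthropic:

```python
# Back-of-envelope API cost per message. All constants are assumptions:
# ~1.3 tokens per English word, ~1,000 output tokens per reply, and
# illustrative per-million-token prices (check current pricing).
WORDS = 1_700 + 16_000 + 25_000      # system prompt + tool defs + my instructions
PROMPT_TOKENS = int(WORDS * 1.3)     # ~55k tokens of context per message
OUTPUT_TOKENS = 1_000                # assumed typical reply length

PRICES = {  # $ per million tokens: (input, output)
    "sonnet": (3.00, 15.00),
    "opus": (15.00, 75.00),
}

for model, (p_in, p_out) in PRICES.items():
    cost = PROMPT_TOKENS / 1e6 * p_in + OUTPUT_TOKENS / 1e6 * p_out
    print(f"{model}: ~${cost:.2f} per message")
# sonnet: ~$0.18, opus: ~$0.91 -- close to the $0.20 / $0.90 above.
```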
A fairly representative piece of Claude output that does not contain anything sensitive is this one, in which I was doing what I will refer to as “vibe research”: come up with a fuzzy definition of something I want to quantify, have Claude produce a numerical estimate of that something for each year, and graph the results.
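To make the graphing step concrete, here’s a minimal sketch; the quantity, years, and values are hypothetical stand-ins for whatever Claude actually produces:

```python
import matplotlib.pyplot as plt

# Hypothetical output of a "vibe research" run: one numerical
# estimate per year for some fuzzily defined quantity.
estimates = {
    2018: 12.0,
    2019: 15.5,
    2020: 9.0,
    2021: 14.0,
    2022: 21.5,
    2023: 30.0,
}

years = sorted(estimates)
plt.plot(years, [estimates[y] for y in years], marker="o")
plt.xlabel("Year")
plt.ylabel("Estimated quantity (fuzzy units)")
plt.title("Vibe research: one estimate per year")
plt.show()
```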
You will note that unlike eigen I don’t particularly try to tell Claude to lay off the sycophancy. This is because I haven’t found a way to do that which does not also cause the model to adopt an “edgy” persona, and my perception is that “edgy” Claudes write worse code and are less well calibrated on probability questions. Still, if anyone has a prompt snippet that reduces sycophancy without increasing edginess I’d be very interested in that.
You aren’t directly paying more money for it pro rata as you would if you were using the API, but you’re getting fewer queries, because longer conversations hit the rate limits faster: LLM inference cost is O(n²) in context length.
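One simplified way to see the scaling: with vanilla attention, each turn pays a cost quadratic in the context so far, so the cumulative cost of a conversation grows much faster than its length (the turn size below is made up for illustration):

```python
# Why long conversations burn through rate limits: each turn's attention
# cost is proportional to (context so far)^2, so cumulative cost over a
# conversation grows roughly cubically in the number of turns.
TOKENS_PER_TURN = 500  # assumed average turn length

def total_attention_cost(turns: int) -> int:
    """Sum of quadratic attention cost over `turns` conversation turns."""
    cost, context = 0, 0
    for _ in range(turns):
        context += TOKENS_PER_TURN
        cost += context ** 2  # quadratic in current context length
    return cost

print(total_attention_cost(10) / total_attention_cost(5))
# 7.0 -- doubling the conversation length far more than doubles the cost.
```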
This “vibe research” prompt is super useful. I tested it against default Opus 4 with no system instructions: this is the output it gave (I fed it the exact same prompts), and your result (relinked here) looks a lot stronger! I think the details and direction on how and when to focus on quantitative analysis made the biggest difference between the two results.