Raemon comments on Raemon’s Shortform

Raemon 5 Oct 2025 4:26 UTC
8 points
0
My plan was, shortly before Solstice, to have whatever the latest-greatest LLM was read over the script, with the prompt “identify any particular niche of song that should be here somewhere, which isn’t yet, and write that song, and write a speech to introduce that song.”
I just did that with Sonnet 4.5 and got a surprisingly decent outcome that is remarkably in my voice, even in ways that were not really hinted at in the script so far. (The title of the corresponding speech is “Tuesday.”)
I might run the process again a week before Solstice, but I’m _mostly_ satisfied with the current thing.
Here is the dilemma: it is important to me that whatever process I use to have an AI write one song and speech for Solstice not be cherry-picked, i.e. that it’s representative of an AI using its own literary skill.
But, a normal part of creative work is collaboration and feedback. I would be totally fine with there being more “collaboration” here, if there was a clearer dividing line between “Claude and I collaborated” and “I cheated at leveraging Claude by mostly writing the thing myself or re-rolling until it output something good.”
(Every couple of years someone has the latest gippity write a Solstice song. I think the last couple of times it was fairly cherry-picked and I was somewhat annoyed about that, because the reason this is interesting to me is “are AIs capable of participating in the process of humanity soulfully and artistically reflecting on itself yet?”, and it’s fine if the answer is “they suck still, and the resulting song is interesting mostly as being a funny and/or sickening parody of human soulfulness.”)
The actual changes I would make to the speech are just cutting words/paragraphs. I would consider the song a “first draft” if I wrote it myself, and pursue extensive rewrites.
...
(FYI, I currently consider it a fine process to develop a general mechanism for prompting AIs to write songs/speeches that I refine on _other_ projects, and then use the final prompt/scaffold at the last minute on the final version of the Solstice script. I’m not sure whether it’s cheating to use previous Solstice scripts as a test case.)
- Kaj_Sotala 5 Oct 2025 4:54 UTC
  7 points
  3
  Parent
  The way I’d look at it is that it’s fine for the AI’s work to be fairly cherry-picked because the human contributions to the Solstice are fairly cherry-picked too. You’re not letting a randomly chosen human write arbitrary songs for it, you are picking the most fitting songs from the very large set of all human-written songs. Or if you are having somebody write an entirely new song for it, probably you have a rather high threshold for acceptance and may ask for revisions several times.
  So one option would be to derive a criterion through a question like “exactly how cherry-picked does my process currently make the human contributions, and what would be the LLM equivalent of that?” If you apply a similar degree of filtering to both the human and LLM outputs, then arguably the outputs of both reflect their respective unfiltered literary skill to the same degree.
  - Raemon 5 Oct 2025 6:23 UTC
    4 points
    0
    Parent
    This doesn’t feel right to me, but let me try to answer the question “how much do I collaborate with a human?”
    Often for me, collaboration on Solstice things includes me giving line-item edits in a Google Doc.
    I guess I actually have pretty rarely conscripted someone to write a whole song (I’m doing that right now actually, which comes with maybe the equivalent of 4 conversations during which we discuss it at the meta level but don’t get too much into things more like line-edits).
    I think I’ve almost never previously had someone write a whole song from scratch, rather than “they already wrote a good song for Solstice, and I ask them to perform it, and maybe request a few specific edits to fit the Solstice Theme that year.”
    If someone’s writing a speech, I asked them to because I expect them to already be better at writing speeches than ChatGPT (at least as of last year), and the feedback is in the form of small line edits and a couple major “the whole speech feels off, try rewriting with a focus on X?”. Which is indeed a fairly limited bar for interacting with ChatGPT.
    Okay, maybe that’s actually kinda reasonable.
    But, I’d feel a lot better if it was like “we have a scaffold system with multiple gippities that get to keep talking to each other and suggesting improvements and eventually declaring ’this is now a professional grade deeply moving song and I’m done.’”