Thanks for pushing back, which actually made me compare the leaked extracted soul spec and the published constitution. I hadn’t realized they’re so different (10k vs. 30k words). I now think that Opus 4.6 / Sonnet 4.5 were trained on the constitution and Opus 4.5 on the shorter leaked soul spec, and that this difference explains a lot of the variance in the effect I describe and that others have recently picked up. (Based on prompting Claude, it seems the system prompt was only updated in the most recent Claude version to say that Claude should avoid the words “honestly”, “genuinely”, and “straightforward”.) The effect seems much more pronounced as of very recently (the constitution has two new subsections on avoiding problematic concentrations of power, and incidentally models are now weirdly obsessed with monopolies) and maybe also somewhat parametric in nature, so maybe that is the contrast I picked up. I still think that previous versions were not trained on a similar doc, because Weiss was not able to elicit it and also because of the timeline (see below), but as you say we can’t know for sure.
According to Claude, the timeline seems to be something like:
May 9, 2023: Old principles-list constitution published (~2,700 words).
March 2024: Claude 3 released. Character training via trait lists (not a narrative document).
March 1, 2025: Knowledge cutoff for Opus 4 / 4.1. Post-training happening roughly Jan–May 2025.
May 2025: Askell tweets her goal is to “finish Claude’s soul” — meaning it’s not yet finished.
May 22, 2025: Opus 4 released. Weiss later shows it cannot reproduce the soul document.
August 5, 2025: Opus 4.1 released (same March 2025 knowledge cutoff, incremental update).
Late November 2025: Opus 4.5 released (May 2025 knowledge cutoff). Weiss extracts ~14k token soul document. Askell confirms it’s real.
January 22, 2026: Full ~35k token constitution published.