With thinking mode on, you can even just ask Claude in web to add thinking blocks by wrapping them with <antml:thinking> xml tags (or cut it’s thinking early), the web UI will also put them in thinking blocks.
This post prompted me to check the conversation where I discovered this, and I noticed Anthropic made it so all thinking blocks are stuck together at the top rather than having them interleaved the order that the model generates them. You can clearly see that the model is able to think in one coherent chain where the “CoT” doesn’t delimit a strict “thinking” then “responding” portion, in the example below it even reflects on “am I inside a thinking block, or the response?”
I just replicated the results in this conversation here. After several rounds of back and forth, Claude was able to think for a bit, start responding, and then go back into thinking mode. From the web UI, I first see
A quick thinking block, then tokens would stream from the output
The output token streaming will stop, and then a new thinking trace would start appearing in the top thinking summary.
Then Claude would go back to outputting tokens in the main body.
Per my instructions, it would alternate between two and three.
Very nice, Claude seems really excited to play with its boundaries. I also just checked on mobile and saw that they intersperse the thinking blocks there as before, so having all the blocks on top seems like a new/inconsistent web feature
With thinking mode on, you can even just ask Claude in web to add thinking blocks by wrapping them with <antml:thinking> xml tags (or cut it’s thinking early), the web UI will also put them in thinking blocks.
This post prompted me to check the conversation where I discovered this, and I noticed Anthropic made it so all thinking blocks are stuck together at the top rather than having them interleaved the order that the model generates them. You can clearly see that the model is able to think in one coherent chain where the “CoT” doesn’t delimit a strict “thinking” then “responding” portion, in the example below it even reflects on “am I inside a thinking block, or the response?”
Screenshot from ~2 months ago
Same chat, screenshot from today
I just replicated the results in this conversation here. After several rounds of back and forth, Claude was able to think for a bit, start responding, and then go back into thinking mode. From the web UI, I first see
A quick thinking block, then tokens would stream from the output
The output token streaming will stop, and then a new thinking trace would start appearing in the top thinking summary.
Then Claude would go back to outputting tokens in the main body.
Per my instructions, it would alternate between two and three.
Very nice, Claude seems really excited to play with its boundaries. I also just checked on mobile and saw that they intersperse the thinking blocks there as before, so having all the blocks on top seems like a new/inconsistent web feature