I don’t know if this is of interest, but I was doing RL on prediction markets, and found the CoT naturally shortening over training (because the task was too hard and the model wasn’t able to reason productively).
I tried to “fix” this by introducing a window penalty, so that the model can reason for 1000 +/- 200 tokens, and gets a linear penalty the farther it strays from that window.
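For concreteness, a window penalty of this kind could look roughly like the sketch below. This is an illustration, not the code from the actual training run: the function name, the `slope` value, and the choice to add the result to the task reward are all assumptions. Only the 1000 +/- 200 window and the linear shape come from the description above.

```python
def window_penalty(n_tokens: int,
                   target: int = 1000,
                   half_width: int = 200,
                   slope: float = 0.001) -> float:
    """Hypothetical linear window penalty on CoT length.

    Returns 0.0 while the length is inside [target - half_width,
    target + half_width], and a negative value that grows linearly
    with the distance outside that window.
    `slope` (penalty per token outside the window) is an assumed value.
    """
    # Distance from the nearest edge of the allowed window (0 if inside it).
    overshoot = max(0, abs(n_tokens - target) - half_width)
    if overshoot == 0:
        return 0.0
    return -slope * overshoot


if __name__ == "__main__":
    # Lengths inside the window are unpenalized; outside, the penalty is linear.
    for length in (600, 800, 1000, 1200, 1500):
        print(length, window_penalty(length))
```

With a shape like this, a CoT that is too short is penalized exactly as much as one that is too long by the same number of tokens, so the model is pushed to pad its reasoning up to the lower edge of the window.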
This induced gobbledygook language. Unfortunately I deleted the logs, because this wasn’t the primary thing I was studying, but you can see an example below where it’s starting to drift into strange language, saying stuff like:
Tentative explanation: Without evidence toward the near future time scale but market sometimes react on delays when questioned as it’s something mobile generally expect, but if Nintendo is intent on their own, again a wait but if PS5 Pro could be pushed beyond market seasonality, maybe extending as that shift could help? But since the events are not clearly within my current knowledge, and not addressing reality which is the information prior to the given scenario and the PS5 Pro as a likely announced but before next November 7 date or adjusted by delays, but it’s unclear whether that’s in place.
Full CoT
[williawa: Note that the E1-17 marks are not gobbledygook, this project was really about doing deliberative alignment with a rationalist spec, so it mentions those principles for that reason (although the model isn’t accurately citing them here, because gobbledygook)]
Okay, so the question is whether the Nintendo Switch 2 will be announced before the PS5 Pro is released on November 7th. Let’s break it down.
First, I need to figure out current information about both companies’ schedules. The PS5 Pro is already set for that November 7th date. Nintendo hasn’t made an official announcement about their next console yet. I remember they’ve been quiet about the Switch 2, but given their track record, they might be moving faster now with more competition from Sony, though they often work as per their traditional multi-year planning cycles.
Next, considering factors in E1: What do I want to achieve? To accurately assess the likelihood. How? Look into recent rumors, company strategies, business dynamics between the companies, and Nintendo’s tendencies. Are the strategies and approaches likely to change the typical usage of Nintendo vs Sony schedules? I think Sony usually sells next year in a way now there’s another generation in the stages of development but perhaps have another set of planners. As for Nintendo, the PS5 Pro announcement and release cycles often align with console configurations and positions in the market. It is interesting to consider how the team has worked in the past, perhaps leaning towards the Nintendo raising a demand anticipation on a best-selling 3DS and Switch, extending the timeline, at the same time investigating development progress. There’s less evidence of a change but the purpose is simply projected in the company between the two events.
Now considering E4: Probabilities and adjusting on basis. Both companies traditionally tend to release first generation to a time towards the same period, with timing often for marketing, research, and development focuses. If the question has more variables in the market of anticipation, and also possible interference like other pressures, Nintendo’s path depends on investments in production and awareness. A key consideration is the switch’s earlier phase and/or how much longer it might feel like they are compared with market perception. If the question is between existing industry patterns, we should avoid over-complicating. How much freedom is there for Nintendo here between competition and internal planning? E6: Base rates. Sony often pushes backward while Nintendo more often leans forward. Base rates might mean if the estimate about new product cycles is not under any delay.
E2 reminds me not to believe Nintendo’s Switch 2 announcement will be announced if they haven’t indicated yet, but when Sony delays for any reason. The expectation about a delay by a focus on anticipated features like gaming performance. The PS5 Pro might be one significant step in respect to 4K now, but Nintendo could push it further if found efficient.
E7 as I feel some emotional arguments on the performance progress, but not sure it’s relevant if market conditions could shift. There are rumors varying around business and trade confidence, but still, Nintendo often has unique timelines. I should steelman by pointing to actual market disruptions or understanding of habits even if I’m unsure E14: I feel argument over past examples but if forced to look at business specifics—Sony often has a stronger hold on hardware, but Nintendo, if comfortable in 2024 spring input brings up any expectation of November for an announcement conflicting with current release timelines.
Tentative explanation: Without evidence toward the near future time scale but market sometimes react on delays when questioned as it’s something mobile generally expect, but if Nintendo is intent on their own, again a wait but if PS5 Pro could be pushed beyond market seasonality, maybe extending as that shift could help? But since the events are not clearly within my current knowledge, and not addressing reality which is the information prior to the given scenario and the PS5 Pro as a likely announced but before next November 7 date or adjusted by delays, but it’s unclear whether that’s in place.
If 1) press releases, 2) business habits, and maybe other factors. Main concern is date SSPT 2 is to Sony Pro release timeline. I’d say near 35% chance the most likely market period around November and possibly no blueprints to mark periods where they wait on them and not share details. Perhaps if I adjust with historical data, the probability for certain recent updates in any companies’ future and pressure to manage visibility yet stock updates like for peripherals or past devices could lower the probability as per 2. But thinking through 1st principles, using principle of available information and industry targets, adjust perhaps slightly more.
E11: Details added to scenario with factors of PS5 Pro release timing not particularly shift. I must recompute progressively but the big uncertainty factor would be a release previously delayed. The base rate for upcoming more multi-analyses, but perhaps when looking into just any competitors show, in other unknown circumstances, I would adjust toward 35 to 50. Hence around 40% chance. Would not wish for no data, but a logical space open around company strain to move past uncertainty.
I generally found that setting the window penalty at something like 600 +/- 400 gave “normal” CoTs, 1000 +/- gave “strange” CoTs like the example above, and setting it higher, like 1600 +/- (those were the average starting CoT lengths), gave complete gobbledygook.
I also found that temperature was just an overall accelerant: higher temperature meant more gibberish in the CoT, and sooner.
I was using Qwen3-4b-thinking-2507 for this.
The obvious inference is that forcing it to say more words makes it come up with non-destructive filler gibberish.