I’m very new to alignment research: I’m a college professor with a philosophy background trying to write a realistic near-future case study for an undergraduate business ethics class about A(G)I, alignment, and safety. I think I want to focus the case study on a (mid-2027?) decision at a particular company about whether or not to implement chain-of-continuous-thought reasoning while developing a new, powerful LLM.
My main questions for this community are:
(1) Am I correct in thinking that chain-of-continuous-thought has not yet been widely implemented in major LLMs? I.e., are the big players still using language tokens for chain-of-thought reasoning rather than vectors?
(2) Is it accurate that the choice to use chain-of-continuous-thought is quite likely to sacrifice interpretability in return for a more efficient ability to “remember context”?
Thank you on behalf of myself and my students!
This is a good read: How AI Is Learning to Think in Secret
If you’re short on time you can start reading from “The good news: Neuralese hasn’t won yet.”
(Searching “neuralese” seems to yield many more results than “continuous chain of thought”.)
The exact architecture of frontier models is a secret, but it seems unlikely that anyone is doing neuralese chain of thought, given how hard it is to train and the questionable benefits.
Using tokens is much easier because models can learn to copy human reasoning in pretraining and then learn to use it more often in post-training. Learning to reason from scratch in a latent space is incredibly difficult, and (luckily, in my opinion) no one seems to have succeeded in doing it.
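To make the difference concrete, here is a minimal toy sketch (PyTorch, with a GRU standing in for a transformer stack; all names and dimensions are illustrative, not any lab’s actual architecture). In token-based chain of thought, every intermediate step is discretized into a vocabulary token, which is what makes the trace human-readable; in a continuous loop (the “neuralese” idea, as in Coconut-style proposals), the raw hidden vector is fed straight back in and never passes through the vocabulary.

```python
# Toy contrast between token-based and continuous chain of thought.
# Purely illustrative: untrained toy model, not any real lab's API.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 32
embed = nn.Embedding(vocab_size, d_model)
decoder = nn.GRU(d_model, d_model, batch_first=True)  # stand-in for a transformer stack
unembed = nn.Linear(d_model, vocab_size)

# (a) Token-based chain of thought: each step is forced through the
# vocabulary, so the intermediate reasoning is a readable token sequence.
x, h, tokens = embed(torch.tensor([[1]])), None, []
for _ in range(5):
    out, h = decoder(x, h)
    tok = unembed(out[:, -1]).argmax(-1)  # discretize: pick a token
    tokens.append(tok.item())
    x = embed(tok).unsqueeze(1)           # re-embed the chosen token
print("legible reasoning trace:", tokens)

# (b) Continuous chain of thought ("neuralese"): the hidden state is fed
# straight back in, never discretized, so there is no token trace for a
# human (or an automated monitor) to read.
x, h = embed(torch.tensor([[1]])), None
for _ in range(5):
    out, h = decoder(x, h)
    x = out[:, -1:, :]  # feed the raw vector back as the next "thought"
print("latent thought vector (opaque):", x.shape)
```

The discretization step in (a) is exactly where the interpretability lives: a monitor can read the sampled tokens. Variant (b) skips it, keeping the full vector instead of collapsing it to one token per step, which is the efficiency-for-legibility trade the original question is asking about.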