Hello! I’m just here to drop a somewhat vague, early-stage idea for an AI model and see if there are any existing frameworks that could be used to build it.
The general idea is to treat agent action and perception as parts of the same discrete data stream, and to model intelligence as compression of sub-segments of this stream into independent “mechanisms” (patterns of action-perception). These mechanisms can be used for prediction and action, and potentially recombined into more general structures as the agent learns.
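To make that a bit less abstract, here is a toy sketch of the kind of stream/mechanism structure I have in mind; all of the names (Event, Mechanism, etc.) and the naive window-counting “compression” are just made up for illustration:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Event:
    kind: Literal["percept", "action"]  # both channels share one discrete stream
    symbol: str                         # discretized observation or motor token

@dataclass(frozen=True)
class Mechanism:
    pattern: tuple[Event, ...]          # a compressed, reusable sub-segment of the stream

    def matches(self, window) -> bool:
        # naive exact match; a real system would need a similarity measure
        return tuple(window) == self.pattern

# Toy stream: the agent presses a button (action) and a light turns on (percept).
stream = [
    Event("action", "press"),
    Event("percept", "light_on"),
    Event("action", "press"),
    Event("percept", "light_on"),
]

# "Compression" here is just counting repeated length-2 windows; any window
# seen more than once becomes a candidate mechanism.
windows = Counter(tuple(stream[i:i + 2]) for i in range(len(stream) - 1))
mechanisms = [Mechanism(w) for w, count in windows.items() if count > 1]

for m in mechanisms:
    print(m.pattern)  # the recurring press -> light_on sub-segment
```

The part I’m actually asking about is what replaces that exact-match counting: something that discovers initially unrelated patterns in the stream and can later recombine them.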
More precisely, I’m looking for:
A method of pattern representation
An algorithm for inferring initially orthogonal/unrelated patterns from the same data stream
Some form of meta-learning for recombining mechanisms
One promising suggestion I received elsewhere was to use reservoir computing/liquid state machines for the time series pattern recognition.
(For a conceptually similar model look at Friston’s “Active Inference”.)
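To make the reservoir-computing suggestion concrete, here is a bare-bones echo state network for one-step-ahead prediction in NumPy; the reservoir size, spectral radius, and ridge penalty are arbitrary illustrative values, not recommendations:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 200

# Fixed random input and recurrent reservoir weights; only the readout is trained.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # keep the spectral radius below 1

def run_reservoir(u):
    """Drive the reservoir with the input sequence u and collect its states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ np.atleast_1d(u_t) + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: one-step-ahead prediction of a sine wave.
t = np.arange(0.0, 30.0, 0.1)
u = np.sin(t)
X = run_reservoir(u[:-1])  # reservoir states for each input step
y = u[1:]                  # next-step targets

# Train the linear readout with ridge regression; the reservoir itself stays fixed.
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)

pred = X @ W_out
print("train MSE:", np.mean((pred - y) ** 2))
```

The appeal for this use case is that the reservoir provides a rich temporal feature space essentially for free, and different “mechanisms” could in principle each get their own cheap linear readout on top of the same shared reservoir.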
It seems to me people are still anthropomorphizing (or maybe “phrenomorphizing” would be more apt) chain-of-thought “reasoning”. With respect to AI alignment, I don’t think it matters much that they do this: LLMs don’t have egos, and they don’t set goals or have motives or intents. They just learn a kind of mess of alien abstractions that optimizes their text-producing behavior for some training data. The real issue is that these abstractions probably often do not correlate with the actual ideas represented by the word tokens, so you get hiccups like hallucinations, or chain-of-thought snippets that make it look like the model is actually thinking and planning when it’s really just producing the text that best matches the prompt “show me what you’re thinking.”
If the issue is not alignment but simply getting accurate results (i.e. not accidentally prompting the LLM to lie to make you feel better or whatever), then this kind of reasoning has more merit.