This post states and speculates on an important question: are there different types of minds that are in some sense “fully general” (the author calls it “unbounded”) but nevertheless qualitatively different? The author calls these hypothetical mind taxa “cognitive realms”.
This is how I think about this question, from within the LTA:
To operationalize “minds” we should think in terms of learning algorithms. Learning algorithms can be classified according to their “syntax” and “semantics” (my own terminology). Here, semantics refers to questions such as: (i) what type of object is the algorithm learning, (ii) what feedback/data is available to the algorithm, and (iii) what is the algorithm’s success criterion/parameter. Syntax, on the other hand, refers to the prior and/or hypothesis class of the algorithm (where the hypothesis class might be parameterized in a particular way, with particular requirements on how the learning rate depends on the parameters).
Among different semantics, we are especially interested in those that are in some sense agentic. Examples include reinforcement learning, infra-Bayesian reinforcement learning, metacognitive agents and infra-Bayesian physicalist agents.
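To make the distinction a bit more concrete, here is a toy sketch (purely illustrative; the types and names are mine, not an actual LTA formalism) of the semantics/syntax split for ordinary reinforcement learning:

```python
# Toy illustration (my own gloss, not a definition from the post or the LTA)
# of the semantics/syntax split, instantiated for vanilla RL.

from dataclasses import dataclass
from typing import Callable, List, Tuple

State, Action = int, int
# An environment hypothesis maps (state, action) to (next_state, reward).
EnvModel = Callable[[State, Action], Tuple[State, float]]


@dataclass
class Semantics:
    """(i) what is being learned, (ii) what feedback is available, (iii) what counts as success."""
    learned_object: str
    feedback: str
    success_criterion: str


@dataclass
class Syntax:
    """The hypothesis class and the prior over it that the learner searches with."""
    hypothesis_class: List[EnvModel]
    prior: List[float]  # one weight per hypothesis, summing to 1


# Semantics of ordinary RL: learn to act well in an unknown environment,
# judged by regret against the optimal policy.
rl_semantics = Semantics(
    learned_object="a policy (equivalently, a model of the unknown environment)",
    feedback="observed state transitions and rewards",
    success_criterion="regret against the optimal policy",
)

# One possible syntax for that semantics: a two-element hypothesis class of toy
# environments with a uniform prior. A different class/prior would be a
# different syntax sharing the same semantics.
rl_syntax = Syntax(
    hypothesis_class=[
        lambda s, a: ((s + a) % 3, 1.0 if a == 0 else 0.0),
        lambda s, a: ((s - a) % 3, 1.0 if a == 1 else 0.0),
    ],
    prior=[0.5, 0.5],
)
```

In this picture, infra-Bayesian RL, metacognitive agents and infra-Bayesian physicalism modify the `Semantics` side, while swapping the hypothesis class or prior changes only the `Syntax` side.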
Do different agentic semantics correspond to different cognitive realms? Maybe, but maybe not: it is plausible that most of them are reflectively unstable. For example, Christiano’s malign prior might be a mechanism by which all agents converge to infra-Bayesian physicalism.
Agents with different syntaxes are another candidate for cognitive realms. Here, the question is whether there is an (efficiently learnable) syntax that is in some sense “universal”: one into which all other (efficiently learnable) syntaxes can be efficiently translated. This is a wide-open question. (See also “frugal universal prior”.)
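One rough way to make the universality question precise (just a sketch of mine, not a definition from anywhere; in particular I am glossing over what exactly “efficiently learnable” means): represent a syntax as a pair $(\mathcal H,\zeta)$ of a hypothesis class and a prior over it, and ask for dominance up to polynomial overhead, analogous to the universal prior:

$$
(\mathcal H^*,\zeta^*)\ \text{is universal}\quad\Longleftrightarrow\quad \forall\,(\mathcal H,\zeta)\ \text{efficiently learnable}\ \ \exists\ \text{poly-time}\ \iota:\mathcal H\to\mathcal H^*:\ \ \zeta^*(\iota(h))\ \ge\ \frac{\zeta(h)}{\mathrm{poly}(|h|)}\ \ \forall h\in\mathcal H.
$$

Under a condition like this, regret bounds that scale with $\log\frac{1}{\zeta(h)}$ would transfer to the universal syntax with at most polynomial overhead; whether such a syntax exists is exactly the open question.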
In the context of AI alignment, in order to achieve superintelligence it is arguably sufficient to use a syntax equivalent to whatever is used by human brain algorithms. Moreover, it’s plausible that any algorithm we can come up with can only have an equivalent or weaker syntax (the very process of us discovering the new syntax suggests an embedding of the new syntax into our own). Therefore, even if there are many cognitive realms, for our purposes we mostly care about only one of them. However, the multiplicity of realms has implications for how simple/natural/canonical we should expect the choice of syntax in our theory of agents to be (the fewer the realms, the more canonical).