Waking up to reality. No, not that one. We’re still dreaming.
Aleksi Liimatainen
Learning networks are ubiquitous (if something can be modeled as a network and involves humans or biology, it is almost certainly one), and the ones inside our skulls are less of a special case than we think.
walks into the magic shop
Hello, I’d like to commission a Sword of Carving at the Joints.
I am a regularity detector generated by the regularities of reality. Frequentism and Bayesianism are attempted formalizations of the observed regularities in the regularity detection process but, ultimately, I am neither.
Regardless of the object level merits of such topics, it’s rational to notice that they’re inflammatory in the extreme for the culture at large and that it’s simply pragmatic (and good manners too!) to refrain from tarnishing the reputation of a forum with them.
I also suspect it’s far less practically relevant than you think and even less so on a forum whose object level mission doesn’t directly bear on the topic.
Given how every natural goal-seeking agent seems to be built on layers and layers of complex interactions, I have to wonder if “utility” and “goals” are the wrong paradigms to use. Not that I have any better ones ready, mind.
Seems to me that those weird power dynamics have deleterious effects even if countervailing forces prevent the group from outright imploding. It’s a tradeoff to engage with such institutions on their own terms and these days a nontrivial number of people seem to choose not to.
Physics is basically solved.
This echoes the sentiment of many prominent scientists in the late 1800s. All that was left was to resolve a few nagging irregularities.
The world is full of scale-free regularities that pop up across topics, not unlike the way 2+2=4 does. Ever since I learned how common and useful this is, I’ve been in the habit of tracking cross-domain generalizations. That bit you read about biology, or psychology, or economics, just to name a few, is likely to apply to the others in some fashion.
ETA: I think I’m also tracking the meta of which domains seem to cross-generalize well. Translation is not always obvious but it’s a learnable skill.
I noticed I was confused about how humans can learn novel concepts from verbal explanations without running into the symbol grounding problem. After some contemplation, I came up with this:
To the extent language relies on learned associations between linguistic structures and mental content, a verbal explanation can only work with what’s already there. Instead of directly inserting new mental content, the explanation must leverage the receiving mind’s established content in a way that lets the mind generate its own version of the new content.
There’s enough to say about this that it seems worth a post or several but I thought I’d float it here first. Has something like this been written already?
The SSC sequence (plus a whole bunch of other things) inspired me to think of deities as mythic representations of cultural collective intelligence. The God-shaped hole could then be understood as a psychological adaptation for collective intelligence, and religions as collective intelligence operating systems.
There’s a lot more that could be said on this topic but it seems to deserve its own sequence. Perhaps I should write one.
I feel like this “back off and augment” is downstream of an implicit theory of intelligence that is specifically unsuited to dealing with how existing examples of intelligence seem to work. Epistemic status: the idea used to make sense to me and apparently no longer does, in a way that seems related to the ways I’ve updated my theories of cognition over the past few years.
Very roughly, networking cognitive agents stacks up into cognitive agency at the next level up more easily than expected, and life has evolved to exploit this dynamic across scales from very early on. It’s a gestalt observation and apparently very difficult to articulate as a rational argument. I could point to memory in gene regulatory networks, Michael Levin’s work on non-neural cognition, the trainability of computational ecological models (they can apparently be trained to solve sudoku), long-term trends in cultural-cognitive evolution, and theoretical difficulties with traditional models of biological evolution, but I don’t know how to make the constellation of data points easily distinguishable from pareidolia.
I think we have an elephant in the room. As I outlined in a recent post, networks of agents may do Hebbian learning as inevitably as two and two makes four (a toy sketch of the kind of update I have in mind follows below). If this is the case, there are some implications.
If a significant fraction of human optimization power comes from Hebbian learning in social networks, then the optimal organizational structure is one that permits such learning. Institutional arrangements with rigid formal structure are doomed to incompetence.
If the learning-network nature of civilization is a major contributor to human progress, we may need to revise our models of human intelligence and strategies for getting the most out of it.
Given the existence of previously understudied large-scale learning networks, it’s possible that there already exist agentic entities of unknown capability and alignment status. This may have implications for the tactical context of alignment research and priorities for research direction.
If agents naturally form learning networks, the creation and proliferation of AIs whose capabilities don’t seem dangerous in isolation may have disproportionate higher-order effects due to the creation of novel large-scale networks or modification of existing ones.
It seems to me that the above may constitute reason to raise an alarm at least locally. Does it? If so, what steps should be taken?
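To make the Hebbian-learning claim above more concrete, here is a minimal sketch. The setup is a hypothetical illustration of my own, not a model from the post: each agent is reduced to a scalar activity level, and the tie strength between two agents grows when their activities co-occur.

```python
import numpy as np

# Hypothetical toy model (my own illustration, not from the linked post):
# each agent is a scalar activity level, and tie strengths between agents
# grow in proportion to co-activity: the classic Hebbian
# "fire together, wire together" rule.

rng = np.random.default_rng(0)

n_agents = 10
learning_rate = 0.01
weights = np.zeros((n_agents, n_agents))  # tie strengths between agents

for step in range(1000):
    # Stand-in for whatever the agents are actually doing: random activity
    # with some shared structure, so there are correlations to pick up.
    shared = rng.normal()
    activity = shared * rng.uniform(0.0, 1.0, n_agents) + rng.normal(0.0, 0.1, n_agents)

    # Hebbian update: ties strengthen in proportion to co-activity.
    weights += learning_rate * np.outer(activity, activity)
    np.fill_diagonal(weights, 0.0)  # no self-ties

# Agents whose activities tended to correlate end up strongly connected,
# i.e. the network as a whole has absorbed the correlation structure of
# its parts without any individual agent intending that outcome.
print(weights.round(2))
```

The point is only that network-level learning needs nothing beyond locally adjusting tie strengths in response to co-activity; no individual agent has to intend, or even notice, the result.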
Any goal specification implies a cluster of target states that in turn can be taken as a measure. Given that, it seems that any goal specification short of a complete extrapolation of the goal-setter’s volition is subject to Goodhart’s Law. If so, we should expect everything short of direct action by the goal-setter to be goodharted all the time.
From this perspective, it seems like a miracle that any large-scale action works at all. How do we do it?
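To make the measure-versus-goal divergence concrete, here is a toy numerical sketch (a hypothetical setup of my own, showing only the regressional form of the effect): candidates are scored on a proxy equal to the true value plus noise, and the harder we select on the proxy, the wider the gap between what the proxy promises and what the goal-setter actually gets.

```python
import numpy as np

# Hypothetical toy setup (my own illustration): the goal-setter cares about
# "true value", but the specification only captures a noisy proxy of it.
rng = np.random.default_rng(0)

true_value = rng.normal(0.0, 1.0, 100_000)
proxy = true_value + rng.normal(0.0, 1.0, 100_000)  # measure = goal + noise

# Select ever more aggressively on the proxy and compare what the proxy
# promises at the cutoff with the true value actually delivered.
for top_fraction in (0.5, 0.1, 0.01, 0.001):
    cutoff = np.quantile(proxy, 1.0 - top_fraction)
    selected = true_value[proxy >= cutoff]
    print(f"top {top_fraction:>6.1%} by proxy: "
          f"proxy cutoff {cutoff:.2f}, mean true value {selected.mean():.2f}")
```

In this mild case the true value still rises, just increasingly less than the proxy suggests; with adversarial optimizers in the loop the divergence gets worse, which is what makes working large-scale action look so surprising.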
Thank you for writing this. I needed a conceptual handle like this to give shape to an intuition that’s been hanging around for a while.
It seems to me that our current civilizational arrangement is itself poorly aligned, or at least prone to generating unaligned subentities. In other words, we have a generalized agent-alignment problem. Asking unaligned non-AI agents to align an AI is a Godzilla strategy, and as such, work on aligning already-existing entities is instrumental for AI alignment.
(On a side note, I suspect that there’s a lot of overlap between AI alignment and generalized alignment but that’s another argument entirely.)
AI alignment is a wicked problem. It won’t be solved by any approach that fails to grapple with how deeply it mirrors self-alignment, child alignment, institutional alignment and many others.
Yeah, this seems close to the crux of the disagreement. The other side sees a relation and is absolutely puzzled why others wouldn’t, to the point where that particular disconnect may not even be in the hypothesis space.
When the true cause of a disagreement is outside the hypothesis space, the disagreement often ends up attributed to something that is in the hypothesis space, such as value differences. I suspect this kind of attribution error is behind most of the drama I’ve seen around the topic.
My model of EY doesn’t know what the real EY knows. However, there seems to be overwhelming evidence that non-AI alignment is a bottleneck and that network learning similar to what’s occurring naturally is likely to be a relevant path to developing dangerously capable AI.
For my model of EY, “halt, melt and catch fire” seems overdetermined. I notice I am confused.
Thanks, this is exactly what I was looking for. Not a new idea then, though there’s something to be said for semi-independent reinvention.
The obvious munchkin move would be to develop a reliable means of bootstrapping a basic mental model of constructivist learning and grounding it in the learner’s own direct experience of learning. Turning the learning process on itself should lead to some amount of recursive improvement, right? Has that been tried?
My suspicion is that it has to do with cultural-cognitive developments generally filed under “religion”. As it’s little more than a hunch and runs somewhat counter to my impression of LW mores, I hesitate to discuss it in more depth here.
For what it’s worth, as someone with a lot of meditation experience and a longstanding interest in the topic, I didn’t get a GPT-3 vibe at all. To me, the whole thing registered as meaningful communication on a poorly-understood topic, with roughly appropriate levels of tentativeness and epistemic caution.
I’m left wondering if “sounding more like GPT-3” might be a common feature of attempts to communicate across large inferential distances with a significant number of non-shared referents. How could one distinguish between “there’s no there there” and “there’s a there there but it’s inaccessible from my current vantage point”?