phd student in comp neuroscience @ mpi brain research frankfurt. https://twitter.com/janhkirchner and https://universalprior.substack.com/
Jan
Frankfurt, Germany – ACX Meetups Everywhere 2021
Soldiers, Scouts, and Albatrosses.
Cognitive Biases in Large Language Models
How to build a mind—neuroscience edition
Hey Steven!
Yep, that’s a pretty accurate summary. My intuition is actually that the synthetic training data might even be better than actual sensory input for pretraining because millions of years of evolution have optimized it exactly for that purpose. Weak evidence for that intuition is that the synthetic data comes in distinct stages that go from “very coarse-grained” to “highly detailed” (see e.g. here).
And you are also correct that retinal waves are not universally accepted to be useful: there was a long debate in which some people claimed that they are just a random “byproduct” of development. The new Ge et al. paper that came out a few months ago is a strong indicator of the functional importance of retinal waves, though, to the point where I’m pretty convinced they are not just a byproduct.
btw, I really enjoyed your post on the lifetime anchor. My take on that is that it doesn’t make a lot of sense to estimate the lifetime anchor and the evolution anchor separately. Evolution can do the pretraining (through providing tailor-made synthetic data) and then the environment does the fine-tuning. That would also explain why Cotra’s lifetime estimate appears so low compared to the amount of compute used on current ML models: current ML models have to start from scratch, while the brain can start with a nicely pretrained model.
Yeah, it’s a tricky situation for me. The thesis that spontaneous activity is important is very central to my research, so I have a lot of incentives to believe in it. And I’m also exposed to a lot of evidence in its favor. We should probably swap roles (I should argue against and you for the importance) to debias. In case you’re ever interested in trying that out (or in having an adversarial collaboration about this topic) let me know :)
But to sketch out my beliefs a bit further:
I believe that spontaneous activity is quite rich in information. Direct evidence for that comes from this study from 2011, which finds that the statistics of spontaneous activity and stimulus-evoked activity are quite similar and get more similar over development. Indirect evidence comes from modeling studies from our lab that show that cortical maps and the fine-scale organization of synapses can be set up through spontaneous activity/retinal waves alone. Other labs have shown that retinal waves can set up long-range connectivity within the visual cortex and that they can produce Gabor receptive fields, and even receptive fields with more complex invariant properties. And beyond the visual cortex, I’m currently working on a project where we set up the circuitry for multisensory integration with only spontaneous activity.
I believe that the cortex essentially just does some form of gradient descent/backpropagation in canonical neural circuits that updates internal models. (The subcortex might be different.) I define “gradient descent” generously as “any procedure that uses or approximates the gradient of a loss function as the central component to reduce loss”. All the complications stem from the fact that a biological neural net is not great at accurately propagating the error signal backward, so evolution came up with a ton of tricks & hacks to make it work anyhow (see this paper from UCL & DeepMind for some ideas on how exactly). I have two main reasons to believe this:
Gradient descent is pretty easy to implement with neurons and at the same time so general that, just on a complexity prior, it’s a strong candidate for any solution that a meta-optimizer like evolution might come up with. Anything more complicated would not work as robustly across all relevant domains.
In conjunction with what I believe about spontaneous activity inducing very strong & informative priors, I don’t think there is any need for anything more complicated than gradient descent. At least I don’t intuitively see the necessity of more optimized learning algorithms (except to maybe squeeze out a few more percentage points of performance).
I notice that there are a lot fewer green links in the second point, which also nicely indicates my relative level of certainty about that compared to the first point.
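To make the generous definition of “gradient descent” a bit more concrete, here is a minimal toy sketch (not a model of any actual cortical circuit). It compares a learner that uses the exact gradient with one that only approximates it by probing the loss along a random direction. The quadratic loss, the function names, and all parameter values are assumptions made up for illustration.

```python
import math
import random


def loss(w, target):
    """Toy quadratic loss: squared distance between weights and a target."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def exact_gradient(w, target):
    """Analytic gradient of the quadratic loss."""
    return [2.0 * (wi - ti) for wi, ti in zip(w, target)]


def perturbation_gradient(w, target, rng, eps=1e-4):
    """Approximate the gradient from two loss evaluations along a random
    unit direction (no backward pass needed)."""
    d = [rng.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(di * di for di in d))
    d = [di / norm for di in d]
    plus = loss([wi + eps * di for wi, di in zip(w, d)], target)
    minus = loss([wi - eps * di for wi, di in zip(w, d)], target)
    slope = (plus - minus) / (2.0 * eps)  # directional derivative along d
    return [slope * di for di in d]


def descend(grad_fn, w, steps=200, lr=0.1):
    """Plain update rule: step against whatever gradient estimate we get."""
    for _ in range(steps):
        g = grad_fn(w)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


target = [1.0, -2.0]
start = [0.0, 0.0]
rng = random.Random(0)

w_exact = descend(lambda w: exact_gradient(w, target), start)
w_approx = descend(lambda w: perturbation_gradient(w, target, rng), start)

print(loss(start, target), loss(w_exact, target), loss(w_approx, target))
```

Both learners count as “gradient descent” under the definition above: the second one never computes the true gradient, it only uses a noisy estimate of it, and it still drives the loss down.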
What’s the evidence that the spontaneous activity / “synthetic data” (e.g. retinal waves) is doing things that stimulated activity / “actual data” (e.g. naturalistic visual scenes) can’t do by itself?
I don’t think direct evidence for this exists. Tbf, this would be a very difficult experiment to run (you’d have to replace retinal waves with real data and the retina really wants to generate retinal waves).
But the principled argument that sways me the most is that “real” input is external—its statistics don’t really care about the developmental state of the animal. Spontaneous activity on the other hand changes with development and can (presumably) provide the most “useful” type of input for refining the circuit (as in something like progressive learning). This last step is conjecture and could be investigated with computational models (train the first layer with very coarse retinal waves, the second layer with more refined retinal waves, etc. and see how well the final model performs compared with one trained on an equal number of natural images). I might run that experiment at some point in the future. Any predictions?
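As a rough starting point for that experiment, the coarse-to-fine input schedule could be sketched like this. This only covers the data side (no network, no training), and the 1-D “retina”, the moving-average smoothing, and all names and parameters are assumptions for illustration, not how real retinal waves are generated.

```python
import random


def synthetic_wave(n_cells, width, rng):
    """One activity pattern on a circular 1-D 'retina': white noise smoothed
    with a moving average. Larger width means coarser spatial structure."""
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n_cells)]
    return [
        sum(noise[(i + k) % n_cells] for k in range(width)) / width
        for i in range(n_cells)
    ]


def roughness(pattern):
    """Mean absolute difference between neighbouring cells."""
    n = len(pattern)
    return sum(abs(pattern[(i + 1) % n] - pattern[i]) for i in range(n)) / n


def curriculum(widths, samples_per_stage, n_cells, rng):
    """One batch of patterns per developmental stage, coarse stages first."""
    return [
        [synthetic_wave(n_cells, width, rng) for _ in range(samples_per_stage)]
        for width in widths
    ]


rng = random.Random(0)
stages = curriculum(widths=[16, 4, 1], samples_per_stage=50, n_cells=256, rng=rng)

# Average roughness per stage; it should grow as the patterns get finer.
stage_roughness = [
    sum(roughness(p) for p in stage) / len(stage) for stage in stages
]
print(stage_roughness)
```

In the full experiment, stage 0 would be used to train the first layer, stage 1 the second layer, and so on, with a control model that sees the same number of natural images instead.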
a tendency to conflate “prior” with “genetically-hardcoded information”, especially within the predictive processing literature, and I’m trying to push back on that
Hmm, so I agree with the general point that you’re making that “priors are not set in stone” and the whole point is to update on them with sensory data and everything. But I think it’s not fair to treat all seconds of life as equally influential/important for learning. There is a lot of literature demonstrating that the cortex is less plastic during adulthood compared to development. There is also the big difference that during development the location & shape of dendrites and axons change depending on activity, while in adulthood things are a lot more rigid. Any input provided early on will have a disproportionate impact. The classic theory that there are critical periods of plasticity during development is probably too strong (given the right conditions/pharmacological interventions, the adult brain can also become very plastic again), but still, there is something special about development.
I’m not sure if that’s the point that people in predictive coding are making or if they are just ignorant that lifelong plasticity is a thing.
Applied Mathematical Logic For The Practicing Researcher
Ahh, thanks for letting me know! (: Yeah, they don’t work for me either… I guess the problem arises because footnotes have to be entered in Markdown mode (see this) but formatting the images only works in the WYSIWYG editor… Bummer. I’ll figure out a different solution for the next post.
Thank you, glad you enjoyed reading it! (:
Also, cool that you mention Scott Page’s book! I have it on my shelf but haven’t gotten around to reading it yet. When I do I’ll write an update.
Thank you! (:
Very interesting point, I didn’t know that. Do you know (/have a reference that explains) how those counterfactuals are evaluated then?
Good idea! Did it!
Frankfurt Declaration on the Cambridge Declaration on Consciousness
Drug addicts and deceptively aligned agents—a comparative analysis
Thank you for the input, super useful! I did not know the concept of transparency in this context, interesting. This does seem to capture some important qualitative differences between pain and suffering, although I’m hesitant to use the terms conscious/qualia. Will think about this more.
Thanks for the reference! I was aware of some shortcomings of PANAS, but the advantages (very well-studied, and lots of freely available human baseline data) are also pretty good.
The cool thing about doing these tests with large language models is that it costs almost nothing to get insanely large sample sizes (by social science standards) and that it’s (by design) super replicable. When done in a smart way, this procedure might even produce insight into biases of the test design or it might verify shaky results from psychology (as GPT should capture a fair bit of human psychology). The flip side of that is of course that there will be a lot of different moving parts and interpreting the output is challenging.
Uhhh, another thing for my reading list (LW is an amazing knowledge retrieval system). Thank you!
I remember encountering that argument/definition of suffering before. It certainly has a bit of explanatory power (you mention meditation) and it somehow feels right. But I don’t understand self-referentiality deeply enough to have a mechanistic model of how that should work in my mind. And I’m a bit wary that this perspective conveniently allows us to continue eating animals and (some form of) mass farming. That penalizes the argument for me a bit, motivated cognition etc.
The Greedy Doctor Problem
Thank you for the comment! :) Since this one is the most upvoted one I’ll respond here, although similar points were also brought up in other comments.
I totally agree, this is something that I should have included (or perhaps even focused on). I’ve done a lot of thinking about this prior to writing the post (and lots of people have suggested all kinds of fancy payment schemes to me, e.g. increasing payment rapidly for every year above life expectancy). I’ve converged on believing that all payment schemes that vary as a function of time can probably be goodharted in some way or other (e.g. through medical coma like you suggest, or by just making you believe you have great life quality). But I did not have a great idea for how to get a conceptual handle on that family of strategies, so I just subsumed them under “just pay the doctor, dammit”.
After thinking about it again, (assuming we can come up with something that cannot be goodharted) I have the intuition that all of the time-varying payment schemes are somehow related to assassination markets, since you basically get to pick the date of your own death by fixing the payment scheme (at some point the amount of effort the doctor puts in will be higher than the payment you can offer, at which point the greedy doctor will just give up). So ideally you would want to construct the time-varying payment scheme in exactly the way that pushes the date of assassination as far into the future as possible. When you have a mental model of how the doctor makes decisions, this is just a “simple” optimization process.
But when you don’t have this (since the doctor is smarter), you’re kind of back to square one. And then (I think) it possibly again comes down to setting up multiple doctors to cooperate or compete to force them to be truthful through a time-invariant payment scheme. Not sure at all though.
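For the easy case where you do have a model of the doctor, the optimization can be sketched in a few lines. This is a toy version under strong assumptions that are not in the post: the doctor’s yearly effort is known in advance, the doctor quits in the first year where effort exceeds payment, and you have a fixed total budget. All names and numbers are hypothetical.

```python
def quit_year(payments, efforts):
    """Index of the first year in which the doctor's effort exceeds the
    payment, i.e. the year the greedy doctor gives up."""
    for year, (pay, effort) in enumerate(zip(payments, efforts)):
        if effort > pay:
            return year
    return len(efforts)  # the doctor never gives up within the horizon


def matched_schedule(budget, efforts):
    """Pay exactly the required effort each year until the budget runs out.
    Any schedule that keeps the doctor for k years must pay at least the
    effort in each of those years, so this maximises the quit year."""
    payments = []
    remaining = budget
    affordable = True
    for effort in efforts:
        if affordable and effort <= remaining:
            payments.append(effort)
            remaining -= effort
        else:
            affordable = False
            payments.append(0.0)
    return payments


efforts = [1.0, 2.0, 3.0, 4.0, 5.0]  # assumed: effort needed grows with age
budget = 7.0

flat = [budget / len(efforts)] * len(efforts)  # same payment every year
matched = matched_schedule(budget, efforts)

print(quit_year(flat, efforts), quit_year(matched, efforts))
```

With the same budget, the flat schedule loses the doctor earlier than the schedule that tracks the doctor’s effort. The hard part from the post remains: this only works if your model of the doctor’s effort is right.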
Thank you very much for pointing it out! I just checked the primary source, and there it’s spelled correctly. But the misspelled version can be found in some newer books that cite the passage. Funny how typos spread...
I’ll fix it!