Thanks for the nice review! It’s great to have the reading of someone who understands the current state of neuroscience well enough to point to aspects of the book that are at odds with the neuroscience consensus. My big takeaway is that I should look a bit more into neuroscience-based approaches to AGI, because they might be important, and might require different alignment approaches.
On a more rhetorical level, I’m impressed by how you manage to make me ask a question (okay, but what evidence is there for this uniformity of the neocortex?) and then point to some previous work you did on the topic. That changes my perspective on that work completely, because it makes it easier to see its point within AI Alignment research (instead of just intellectual curiosity).
So if indeed we can get AGI by reverse-engineering just the neocortex (and its “helper” organs like the thalamus and hippocampus), and if the neocortex is a relatively simple, human-legible, learning algorithm, then all of the sudden it doesn’t sound so crazy for Hawkins to say that brain-like AGI is feasible, and not centuries away, but rather already starting to crystallize into view on the horizon.
I might have missed it, but what is the argument for the neocortex learning algorithm being human-legible? That seems pretty relevant to this approach.
This is a big and important section of the book. I’m going to skip it. My excuse is: I wrote a summary of an interview he did a while back, and that post covered more-or-less similar ground. That said, this book describes it better, including a new and helpful (albeit still a bit sketchy) discussion of learning abstract concepts.
I’m fine with you redirecting to a previous post, but I would have appreciated at least a one-sentence summary and opinion.
Some people (cf. Stuart Russell’s book) are concerned that the development of AGI poses a substantial risk of catastrophic accidents, up to and including human extinction. They therefore urge research into how to ensure that AIs robustly do what humans want them to do, just as Enrico Fermi invented nuclear reactor control rods before he built the first nuclear reactor.
Jeff Hawkins is having none of it. “When I read about these concerns,” he says, “I feel that the arguments are being made without any understanding of what intelligence is.”
Writing this part before going on to read the rest, but intuitively an AGI along those lines seems less dangerous than a purely artificial approach. Indeed, I expect that such an AGI will have some underlying aspects of basic human cognition (or something similar), and thus things like common sense and human morals might be easier to push for. Does that make any sense?
Going back after reading the rest of the post, it seems that these sorts of aspects of human cognition would come more from what you call the Judge, with all the difficulties in implementing it.
No specific comment on your explanation of the risks, just want to say that you do a very good job of it!
I’m fine with you redirecting to a previous post, but I would have appreciated at least a one-sentence summary and opinion.
My opinion is: I think if you want to figure out the gory details of the neocortical algorithm, and you want to pick ten authors to read, then Jeff Hawkins should be one of them. If you’re only going to pick one author, I’d go with Dileep George.
I’m happy to chat more offline.
what is the argument for the neocortex learning algorithm being human-legible?
Well there’s an inside-view argument that it’s human-legible because “It basically works like, blah blah blah, and that algorithm is human-legible because I’m a human and I just legibled it.” I guess that’s what Jeff would say. (Me too.)
Then there’s an outside-view argument that goes: “Most of the action is happening within a ‘cortical mini-column’, which consists of about 100 neurons mostly connected to each other. Are you really going to tell me that 100 neurons implement an algorithm that is so complicated that it’s forever beyond human comprehension?” Then again, BB(5) is still unknown, so circuits with a small number of components can be quite complicated. So I guess that’s not all that compelling an argument on its own.
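To make the busy-beaver caveat concrete, here is a minimal sketch of a Turing machine simulator running the well-known 2-state, 2-symbol busy beaver champion. The names `BB2` and `run`, and the `max_steps` cutoff, are illustrative choices for this sketch, not anything from the book or the post. The point is only that a rule table with four entries already needs to be simulated to see what it does, and the busy beaver function is known to grow faster than any computable function as states are added.

```python
from collections import defaultdict

# Transition table for the 2-state, 2-symbol busy beaver champion.
# Key: (state, symbol read). Value: (symbol to write, head move, next state).
# The dictionary layout and the "HALT" label are assumptions of this sketch.
BB2 = {
    ("A", 0): (1, +1, "B"),
    ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"),
    ("B", 1): (1, +1, "HALT"),
}


def run(machine, max_steps=1000):
    """Simulate `machine` on a blank tape, starting in state "A".

    Returns (steps taken, number of 1s on the tape, whether it halted).
    `max_steps` is a safety cutoff, since halting is undecidable in general.
    """
    tape = defaultdict(int)  # unvisited cells read as 0
    head, state, steps = 0, "A", 0
    while state != "HALT" and steps < max_steps:
        write, move, state = machine[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return steps, sum(tape.values()), state == "HALT"


if __name__ == "__main__":
    steps, ones, halted = run(BB2)
    print(f"halted={halted} after {steps} steps with {ones} ones on the tape")
```

This particular machine halts after 6 steps with 4 ones on the tape. Swapping in a larger table only requires changing the dictionary, though the step cutoff would need raising for machines that run longer.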
I think a better outside-view argument is that if one algorithm is really going to learn how to parse visual scenes, put on a shoe, and design a rocket engine … then such an algorithm really has to work by simple, general principles—things like “if you’ve seen something, it’s likely that you’ll see it again”, and “things are often composed of other things”, and “things tend to be localized in time and space”, and TD learning, etc.
Also, GPT-3 shows that human-legible learning algorithms are at least up to the task of learning language syntax and semantics, plus learning quite a bit of knowledge about how the world works.
things like common sense and human morals might be easier to push for.
For common sense, my take is that it’s plausible that a neocortex-like AGI will wind up with some of the same concepts as humans, in certain areas and under certain conditions. That’s a hard thing to guarantee a priori, and therefore I’m not quite sure what that buys you.
For morals, there is a plausible research direction of “Let’s make AGIs with a similar set of social instincts as humans. Then they would wind up with similar moral intuitions, even when pushed to weird out-of-distribution hypotheticals. And then we can do better by turning off jealousy, cranking up conservatism and sympathy, etc.” That’s a research direction I take seriously, although it’s not the only path to success. (It might be the only path to success that doesn’t fundamentally rely on transparency.) It faces the problem that we don’t currently know how to write the code for human-like social instincts, which could wind up being quite complicated. (See discussion here—relevant quote is: “I can definitely imagine that the human brain has an instinctual response to a certain input which is adaptive in 500 different scenarios that ancestral humans typically encountered, and maladaptive in another 499 scenarios that ancestral humans typically encountered. So on average it’s beneficial, and our brains evolved to have that instinct, but there’s no tidy story about why that instinct is there and no simple specification for exactly what calculation it’s doing.”)
Thanks!