To the connectionists such as myself, your point 0 has seemed obvious for a while, so the EY/MIRI/LW anti-neural-net groupthink was/is a strong sign of faulty beliefs. And saying “oh but EY/etc didn’t really think neural nets wouldn’t work, they just thought other paradigms would be safer” doesn’t really help much if no other paradigms ever had a chance. Underlying much of the rationalist groupthink on AI safety is a set of correlated, incorrect anti-connectionist beliefs which undermines many of the standard conclusions.
(Eliezer did think neural nets wouldn’t work; he explicitly said it on the Lex Fridman podcast.)
Edit @request from gwern: at 11:30 in the podcast, Eliezer says,
back in the day I went around saying like, I do not think that just stacking more layers of transformers is going to get you all the way to AGI, and I think that GPT-4 is past where I thought this paradigm is going to take us, and I, you know, you want to notice when that happens, you want to say like “oops, I guess I was incorrect about what happens if you keep on stacking more transformer layers”
and then Fridman asks him whether he’d say that his intuition was wrong, and Eliezer says yes.
I think you should quote the bit you think shows that. Which ‘neural nets wouldn’t work’, exactly? I realize that everyone now thinks there’s only one kind (the kind which works and which we have now), but there’s not.
The Fridman transcript I skimmed showed him being skeptical that deep learning, one of several different waves of connectionism, would go from early successes like AlphaGo all the way to AGI, and it was consistent with what I had always understood him to believe, which was that connectionism could work someday but that this would be bad because it would be unsafe (which I agreed with then and still do now; and to the extent that Eliezer says I was right to pick up on ‘holy shit guys this may be it, after three-quarters of a century of failure, this time really is different, it’s Just Working™’ and he was wrong, I don’t think it’s because I was specially immune to ‘groupthink’ or somehow escaped ‘faulty beliefs’, but because I was paying much closer attention to the DL literature and evaluating whether progress favored Cannell & Moravec or critics, and cataloguing examples of the blessings of scale & evidence for the scaling hypothesis).
Yeah, of course the brain was always an example of a big neural net that worked; the question was how accessible that design is/was. The core of the crucial update for me (which I can’t pinpoint precisely, but I’d guess was somewhere between 2010 and 2014) was the realization that gradient descent with a few simple tricks really is a reasonable general approximation of Bayesian inference, and a perfectly capable global optimizer in the overcomplete regime (the latter seems obvious now in retrospect, but apparently wasn’t so obvious when nets were small: it was just sort of known/assumed that local optima were a major issue). Much else just falls out from that. The ‘groupthink’ I was referring to is that some here are still deriving much of their core AI/ML beliefs from reading the old sequences/lore rather than the DL literature and derivations.
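To make the “capable global optimizer in the overcomplete regime” point concrete, here is a minimal sketch (a toy example written for illustration, not anything from the comment above; all sizes and hyperparameters are assumptions): plain full-batch gradient descent on a heavily overparameterized two-layer tanh net fits arbitrary random targets from every random initialization, exactly the setting where the older folklore expected runs to get stuck in bad local optima.

```python
# Toy illustration only: sizes, learning rate, and step count are assumed, not from the thread.
import numpy as np

n, d, hidden = 32, 20, 256            # 32 training points, 20 features, 256 hidden units (overcomplete)
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
y = rng.normal(size=(n, 1))           # arbitrary random targets: a pure memorization task

def train(seed, steps=5000, lr=0.01):
    r = np.random.default_rng(seed)
    W1 = r.normal(scale=1 / np.sqrt(d), size=(d, hidden))
    W2 = r.normal(scale=1 / np.sqrt(hidden), size=(hidden, 1))
    for _ in range(steps):
        H = np.tanh(X @ W1)                 # hidden activations
        err = H @ W2 - y                    # residuals
        gW2 = H.T @ err * (2 / n)           # gradient of mean squared error w.r.t. W2
        gH = err @ W2.T * (2 / n)           # gradient w.r.t. hidden activations
        gW1 = X.T @ (gH * (1 - H ** 2))     # backprop through tanh
        W1 -= lr * gW1
        W2 -= lr * gW2
    return float(((np.tanh(X @ W1) @ W2 - y) ** 2).mean())

# Plain gradient descent, no momentum, no restarts: every random start drives the
# training loss from roughly 1 down toward zero rather than stalling at a poor optimum.
print([f"{train(s):.1e}" for s in range(5)])
```

The point is only the qualitative one from the comment: once the net is heavily overparameterized relative to the data, vanilla gradient descent reliably reaches near-zero training loss, so bad local optima stop being the binding concern they were assumed to be when nets were small.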
Fair. OK, I edited the original post; see there for the quote.
One reason I felt comfortable just stating the point is that Eliezer himself framed it as a wrong prediction. (And he actually refers to you as having been more correct, though I don’t have the timestamp.)