Remember, this is a small-scale experiment, and you often have to go big in order to make use of your new findings.
Ok… but are you updating on hypothetical / fictional evidence? BTW, to clarify, the whole sample-efficiency thing is kind of a sideline to me. If someone got GPT-4-level performance by training on human data that is like 10x the size of the books that a well-read human would read in 50 years, that would be really, really weird and confusing to me, and would probably shorten my timelines somewhat; in contrast, what would really shorten my timelines would be observations of LP2X creating novel interesting concepts (or at least originary interesting concepts, as in Hänni’s “Cantor’s Diagonal from scratch” thing).
My main response is that once we condition on LLMs not having weight-level continual learning and not having long-term memory, there’s little mystery left to explain in LLM capabilities, so there’s no other very important machinery that I’ve missed.
Yes, this is a good example of mysteriously assuming for no reason that “if we just get X, then I don’t see what’s stopping my learning program from being an AGI, so therefore it is an AGI”, which makes absolutely zero sense and you should stop.
And this is because human neurons constantly update, and there’s no deployment phase where all your neurons stop updating.
No it’s not. I mean it is a little bit. But it’s also “because” “neurons implement Bayesian learning”, and it’s also “because” “neurons implement a Turing machine”. Going from this sort of “because” to “because my thing is also a Turing machine and therefore it’s smart, just like neurons, which are also Turing machines” makes zero sense.
One difference relative to you is that I think LLMs are in fact a little bit creative / can sometimes form insights (though with caveats).
What considerations (observations, arguments, etc.) most strongly contributed to convincing you of the strongest form of this proposition that you believe?
I do not think it is true that Noosphere’s comment contained no argument. The rest of the comment, after the passage you cited, tries to lay out a model for why continual learning and long-term memory might be the only remaining bottlenecks. Perhaps you think that this argument is very bad, but it is an argument, and I did not think that your reply to it was helpful for the discussion.
@Lucius Bushnaq It’s not too combative; you’re wrong. My previous comment laid out what’s wrong with the reasoning. Then Noosphere89 wrote a big long comment that runs through all the same lines of reasoning, still without giving any arguments. This is really bad epistemics, and people going around vibing hard about this have been poisoning (or rather, hijacking: https://www.lesswrong.com/posts/dAz45ggdbeudKAXiF/a-regime-change-power-vacuum-conjecture-about-group-belief) the discourse for 5 years.