“Dwarkesh chose excellent questions throughout, displaying an excellent sense of when to follow up and how, and when to pivot.”
This is the essence of why Dwarkesh does such good interviews. He does the groundwork needed to ask relevant and interesting questions, i.e. actually reading his subjects' books and other work, and he seems to have consistently put real thought into analysing their worldviews.
For me, Sutskever’s response to Dwarkesh in their interview was a convincing refutation of this argument:
Dwarkesh Patel
So you could argue that next-token prediction can only help us match human performance and maybe not surpass it? What would it take to surpass human performance?
Ilya Sutskever
I challenge the claim that next-token prediction cannot surpass human performance. On the surface, it looks like it cannot. It looks like if you just learn to imitate, to predict what people do, it means that you can only copy people. But here is a counter argument for why it might not be quite so. If your base neural net is smart enough, you just ask it — What would a person with great insight, wisdom, and capability do? Maybe such a person doesn’t exist, but there’s a pretty good chance that the neural net will be able to extrapolate how such a person would behave. Do you see what I mean?
Dwarkesh Patel
Yes, although where would it get that sort of insight about what that person would do? If not from…
Ilya Sutskever
From the data of regular people. Because if you think about it, what does it mean to predict the next token well enough? It’s actually a much deeper question than it seems. Predicting the next token well means that you understand the underlying reality that led to the creation of that token. It’s not statistics. Like it is statistics but what is statistics? In order to understand those statistics to compress them, you need to understand what is it about the world that creates this set of statistics? And so then you say — Well, I have all those people. What is it about people that creates their behaviors? Well they have thoughts and their feelings, and they have ideas, and they do things in certain ways. All of those could be deduced from next-token prediction. And I’d argue that this should make it possible, not indefinitely but to a pretty decent degree to say — Well, can you guess what you’d do if you took a person with this characteristic and that characteristic? Like such a person doesn’t exist but because you’re so good at predicting the next token, you should still be able to guess what that person would do. This hypothetical, imaginary person with far greater mental ability than the rest of us.