Okay so even though I’ve already written a full-length post about timelines, I thought I should make a shortform putting my model into a less eloquent and far more speculative-sounding and capricious format. Also, I think the part I was hedging on the most in the post is probably the most important aspect of the model.
I propose that the ability to make progress on...
1. well-defined problems with verifiable solutions; vs.
2. murky problems where the solution criterion is unclear and no one can ever prove anything
… are two substantially different dimensions of intelligence, and IQ is almost entirely about the first one. The second one isn’t impossible to measure in principle, and it’s probably not even difficult, but it is extremely difficult to make a socially respected test for it, because you could almost only include questions where the right answer is up for debate. I called this philosophical intelligence in my post because philosophical problems are usually great examples, but it’s not restricted to those. You could also include things like
Is neoliberalism or progressivism a better governing philosophy?
Should we ship weapons to Ukraine?
What’s the best way to teach {insert topic here}?
Of course you can’t put those onto a test any more than you can ask “does libertarian free will exist?” on a test, so the existence of non-philosophical questions here doesn’t make measuring this ability any easier.
People often point to someone famous saying something they think is stupid and then say things like “this again proves that being an expert in one domain doesn’t translate into being smart anywhere else!” This always rubbed me the wrong way because intelligence in one area should transfer to other areas! It’s all general problem-solving capability! But in fact, those people do exist, and I’ve talked to some of them. People who have genuine intellectual horsepower on narrow problems, but when I ask them anything about a fuzzier topic, their take is so surface-level and dumb that my immediate reaction is always this sense of disbelief, like, “it shouldn’t be possible for your thoughts here to be this shallow given how smart you are!”
… but conversely, there clearly is such a thing as expertise in a narrow area correlating with smart philosophical/political views. So sometimes intelligence does transfer and sometimes it doesn’t...
Well, I think it’s obvious what point I’m going to make here: I think sometimes people are experts in their field due to #1 and sometimes due to #2, and to the extent that it’s #2, this tends to transfer into making sense on other questions, whereas to the extent it’s #1, it’s in fact almost meaningless. (And some people become famous without either #1 or #2, but less so if they’re experts in technical fields.)
I think #2 has outsized importance for progress on many things related to AI alignment and rationality. For example, I think Eliezer is quite high in both #1 and #2, but the reason he has produced a more useful body of work than the average genius has much more to do with #2. Almost nothing in the Sequences seems to require genius-level IQ; I think he could be an SD lower in IQ and still have written most of them. It would make a difference, don’t get me wrong, but I don’t think it would be the bottleneck. (None of this depends on what Eliezer is up to nowadays, btw; you can ignore the last 15 years for this paragraph.)
Now what about dangerous capability advances and takeover scenarios from LLMs; can those happen without #2? Imo, absolutely not. Not even a little bit. You can have all sorts of negative effects of the kind that are already happening—job loss, increased social isolation, information silos, misinformation, maybe even some degree of capability enhancement, stuff like that—but the classical superintelligence-ian scenarios require the ability to make progress on problems with murky and unverifiable solutions.
I think the entire notion that LLMs can’t really come up with novel concepts—one of the less stupid criticisms of LLMs, imo—is a direct result of this (coming up with a novel concept is exactly the kind of thing you need #2 for, because there’s no way to verify whether any given idea for a new concept does or doesn’t make sense). Although this is not absolute, because sometimes they can spit out new ideas at random; the “inability to derive new concepts” framing doesn’t quite point at the right thing, since creativity isn’t the issue, it’s the ability to reliably figure out whether a new concept is actually useful. The disconnect between stuff like METR’s supposed exponential growth in LLMs’ capabilities on long-horizon tasks and actual job replacement on those tasks is another example. There is just a really fundamental problem here where metrics for AI progress are biased toward things you can measure—duh!—which systematically biases toward #1 over #2. (Although METR has actually acknowledged this at least a little bit; I feel like they’ve been very epistemically virtuous from what I could see, so I don’t wanna trash them.)
Or to just put it all very bluntly, if LLMs cannot answer questions as easy as “does libertarian free will exist” or “what’s the right interpretation of quantum mechanics?”—and they can’t—then clearly they’re not very smart. And I think they’re not very smart in a way that is necessary for basically all of the doom-y scenarios.
I’m not expecting anyone to agree with any of this, but in a nutshell, much of my real skepticism about LLM scaling comes down to the above, especially lately. I don’t think we’re particularly close to AGI… and consequently, I also don’t think many of the classical superintelligence-ian views have actually been tested, one way or another.