I suspect the crux here is whether or not you believe it’s possible to have a “simple” model of intelligence. Intuitively, the question here is something like, “Does intelligence ultimately boil down to some kind of fancy logic? Or does it boil down to some kind of fancy linear algebra?”
The “fancy logic” view has a long history. When I started working as a programmer, my coworkers were veterans of the 80s AI boom and the following “AI winter.” The key hope of those 80s expert systems was that you could encode knowledge using definitions and rules. This failed.
But the “fancy linear algebra” view pulled ahead long ago. In the 90s, researchers in computational linguistics, computer vision, and classification realized that linear algebra worked far better than fancy collections of rules. Many of these subfields leaped ahead. There were dissenters: Cyc continued to struggle off in a corner somewhere, and the Semantic Web made a clumsy attempt to reinvent Prolog. The dissenters failed.
The dream of Cyc-like systems is eternal, and each new generation reinvents it. But such systems have systematically lost on nearly every benchmark of intelligence.
Fundamentally, real-world intelligence has a number of properties:
The input is a big pile of numbers. Images are a pile of numbers. Sound is a pile of numbers.
Processing that input requires weighing many different pieces of evidence in complex ways.
The output of intelligence is a probability distribution. This is most obvious for tasks like speech recognition (“Did they say X? Probably. But they might have said Y.”)
When you have a giant pile of numbers as input, a complex system for weighing those numbers, and a probability distribution as output, then your system is inevitably something very much like a giant matrix. (In practice, it turns out you need a bunch of smaller matrices connected by non-linearities.)
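To make that concrete, here is a minimal sketch in NumPy. The sizes and weights are made up and random rather than learned, and this isn’t any particular production model; it just shows the shape of the thing: a pile of input numbers, a few small matrices joined by non-linearities, and a probability distribution coming out the other end.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 64-number input (say, a tiny image patch or audio
# frame), a 32-unit hidden layer, and 10 possible output labels.
x = rng.normal(size=64)          # the "big pile of numbers" as input
W1 = rng.normal(size=(32, 64))   # first matrix of weights (would be learned)
W2 = rng.normal(size=(10, 32))   # second matrix of weights (would be learned)

h = np.maximum(0.0, W1 @ x)      # weigh the evidence, then apply a non-linearity (ReLU)
logits = W2 @ h                  # weigh the weighed evidence again

# Softmax turns the raw scores into a probability distribution over labels:
# "Did they say X? Probably. But they might have said Y."
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(probs.round(3), probs.sum())   # ten probabilities that sum to 1.0
```

Nothing in this sketch looks like a definition or a rule; it is all weights, and that is exactly the inscrutability problem.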
Before 2022, it appeared to me that Yudkowsky was trapped in the same mirage that trapped the creators of Cyc and the Semantic Web and 80s expert systems.
But in 2025, Yudkowsky appears to believe that the current threat absolutely comes from giant inscrutable matrices. And as far as I can tell, he has become very pessimistic about any kind of robust “alignment”.
Personally, this is also my viewpoint: There is almost certainly no robust version of alignment, and even “approximate alignment” will come under vast strain if we develop superhuman systems with goals. So I would answer your question in the affirmative: As far as I can see, inscrutability was always inevitable.
I appreciate the clear argument as to why “fancy linear algebra” works better than “fancy logic”.
And I understand why things that work better tend to get selected.
I do challenge “inevitable” though. It doesn’t help us to survive.
If linear algebra probably kills everyone but logic probably doesn’t, then we should tell everyone and agree to use the thing that works worse.
Thank you for your response!
To clarify, my argument is that:
1. Logic- and rule-based systems fell behind in the 90s. And I don’t see any way that they are ever likely to work, even if we had decades to work on them.
2. Systems with massive numbers of numeric parameters have worked exceptionally well, in many forms. Unfortunately, they’re opaque and unpredictable, and therefore unsafe.
3. Given these two assumptions, the only two safety strategies are:
a. A permanent, worldwide halt, almost certainly within the next 5-10 years.
b. Build something smarter and eventually more powerful than us, and hope it likes keeping humans as pets, and does a reasonable job of it.
I strongly support (3a). But this is a hard argument to make, because the key step of the argument is that “almost every successful AI algorithm of the past 30 years has been an opaque mass of numbers, and the opacity has gotten worse with each generation.”
Anyway, thank you for giving me an opportunity to try to explain my argument a bit better!