Not irrelevant, just insufficient. And that’s not a dig against the field, it’s true of every human endeavor that requires quick decision making based on incomplete data, or does not permit multiple attempts to succeed at something. Science has plenty of things to say before and after, about training for such tasks or understanding what happened and why. And that’s true here, too. There’s plenty science can tell us now, in advance, that we can use in AI and AI alignment research. But the problem is, aligning the first AGI or ASI may be a one-shot opportunity: succeed the first time or everyone dies. In that scenario, the way the scientific method is usually carried out (iterative hypothesis testing) is obviously insufficient, in the same way that such a method is insufficient for testing defenses against Earth-bound mile-wide asteroids.
I disagree with this, but note that it is a lot saner than the original response I was focusing on. The point I was trying to make was that the claim that science is not relevant to keeping us alive was either very surprising or an example of very bad epistemics; that claim is much stronger than yours, and it would need to be defended far better than the post did.
I disagree with your comment, but the claim you make is way less surprising than Zvi’s claim on science.
I don’t think it is much stronger; I think Zvi is shorthanding an idea that has been discussed many times and at much greater length on this site and elsewhere. The fact that scientists usually know which clusters of hypotheses are worth testing, long before our scientific institutions would consider anyone justified in claiming to know the answer, is already strong evidence that “the scientific method,” as currently instantiated in the world, has much stricter standards of evidence than what epistemology fundamentally requires. Things like the replication crisis are similarly strong evidence that its standards are somewhat misaligned with good epistemology, in that they can lead scientists astray for a long time before evidence builds up that forces them back on track.
The specific claim here is not “science as a whole and scientific reasoning are irrelevant.” It’s “If we rely on getting hard, reliable scientific evidence to align AGI, that will generally require many failed experiments and disproven hypotheses, because that’s how Science accumulates knowledge. But in a context where a single failed experiment can result in human extinction, that’s just not going to be a process that makes survival likely.” Which, we can disagree on the premise about whether we’re facing such a scenario, but I really don’t understand how to meaningfully disagree with the conclusion given the premise.
As an example: If the Manhattan Project physicists had been wrong in their calculations and the Trinity test had triggered a self-sustaining atmospheric nitrogen fusion reaction, humanity would have gone extinct in seconds. This would have been the only experimental evidence in favor of the hypothesis, and it would have arrived too late to save humanity. In that case we were triply lucky: the physicists thought of the possibility, took it seriously enough to do the calculations, and were correct in their conclusion that it would not happen. Years later, they were wrong about similar (but more complicated) calculations on whether lithium-7 would contribute significantly to H-bomb yields, but thankfully that error was not the existential one.
Similarly, there were extremely strong reasons why we knew the LHC was not going to destroy the planet with tiny black holes or negatively charged strangelets or whatever other nonsense was thrown around in popular media before it started up, and the scientists involved thought carefully about the possibilities anyway. But the whole point of experiments is to look for the places where our models and predictions are wrong, and AI doesn’t have anywhere near enough of a theoretical basis to make the strong predictions that particle physics does.
I have sort of changed my mind on this, in that while I still disagree with Zvi and you, I now think that my response to Zvi was way too uncharitable, and as a consequence I’ll probably retract my first comment.
I disagree with the premise, though: one of the assumptions used very often in EA/LW analyses isn’t enough to establish it without additional assumptions.
I might respond to the rest later on.
I look forward to reading it if you do!