I think the framing “alignment research is preparadigmatic” might be heavily misunderstood. The term “preparadigmatic” of course comes from Thomas Kuhn’s The Structure of Scientific Revolutions. My reading of this says that a paradigm is basically an approach to solving problems which has been proven to work, and that the correct goal of preparadigmatic research should be to do research generally recognized as impressive.
For example, Kuhn says in chapter 2 that “Paradigms gain their status because they are more successful than their competitors in solving a few problems that the group of practitioners has come to recognize as acute.” That is, lots of researchers have different ontologies/approaches, and paradigms are the approaches that solve problems that everyone, including people with different approaches, agrees to be important. This suggests that to the extent alignment is still preparadigmatic, we should try to solve problems recognized as important by, say, people in each of the five clusters of alignment researchers (e.g. Nate Soares, Dan Hendrycks, Paul Christiano, Jan Leike, David Bau).
I think this gets twisted in some popular writings on LessWrong. John Wentworth writes that a researcher in a preparadigmatic field should spend lots of time explaining their approaches:
Because the field does not already have a set of shared frames—i.e. a paradigm—you will need to spend a lot of effort explaining your frames, tools, agenda, and strategy. For the field, such discussion is a necessary step to spreading ideas and eventually creating a paradigm.
I think this is misguided. A paradigm is not established by ideas diffusing between researchers with different frames until they all agree on some weighted average of the frames. A paradigm is established by research generally recognized as impressive, which proves the correctness of (some aspect of) someone’s frames. So rather than trying to communicate one’s frame to everyone, one should communicate with other researchers to get an accurate sense of what problems they think are important, and then work on those problems using one’s own frames. (edit: of course, before this is possible one should develop one’s frames to solve some problems)
If the point you’re trying to make is: “the way we go from preparadigmatic to paradigmatic is by solving some hard problems, not by communicating initial frames and ideas”, I think that is indeed an important point.
Still, two caveats:
First, Kuhn’s concept of paradigm is quite an oversimplification of what actually happens in the history of science (and the history of most fields). More recent works that go through the history in much more detail find that at any point a field often contains many different pieces of paradigms, or some strong paradigm for a key “solved” part of the field and then a lot of debated alternatives for the more concrete, specific details.
Generally, I think the discourse on history and philosophy of science on LW would improve a lot if it didn’t mostly rely on one (influential) book published in the 60s, before much of the serious effort to really understand the history of science and its practices.
Second, to steelman John’s point, I don’t think he means that you should only communicate your frame. He’s the first to actively try to apply his frames to some concrete problems, and to argue for their impressiveness. Instead, I read him as pointing to a bunch of different needs in a preparadigmatic field (which maybe he could separate better ¯\_(ツ)_/¯):
That in a preparadigmatic field, there is no accepted way of tackling the problems/phenomena. So if you want anyone else to understand you, you need to bridge a bigger inferential distance than in a paradigmatic field (or even a partially paradigmatic field), because you don’t even see the problem in the same way, at a fundamental level.
That if your goal is to create a paradigm, almost by definition you need to explain and communicate your paradigm. There is an element of propaganda in defending any proposed paradigm, especially when the initial frame is alien to most people, and even impressiveness requires some level of interpretation.
That one way (not the only way) by which a paradigm emerges is by taking different insights from different clunky frames, and unifying them (for a classic example, Newton relied on many previous basic frames, from Kepler’s laws to Galileo’s interpretation of force as causing acceleration). But this requires that the clunky frames are at least communicated clearly.
Strong agree. 👍