I chose to study physics in undergrad because I wanted to “understand the universe” and naively thought string theory was the logically correct endpoint of this pursuit, and was only saved from that fate by not being smart enough to get into a good grad school. Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person. In that regard I find anecdotes like the following by Ron Maimon on Physics SE interesting — the reason string theorists believe isn’t the same as what they tell people, so it’s better to ask for their conversion stories:
I think that it is better to ask for a compelling argument that the physics of gravity requires a string theory completion, rather than a mathematical proof, which would be full of implicit assumptions anyway. The arguments people give in the literature are not the same as the personal reasons that they believe the theory, they are usually just stories made up to sound persuasive to students or to the general public. They fall apart under scrutiny. The real reasons take the form of a conversion story, and are much more subjective, and much less persuasive to everyone except the story teller. Still, I think that a conversion story is the only honest way to explain why you believe something that is not conclusively experimentally established.
Some famous conversion stories are:
Scherk and Schwarz (1974): They believed that the S-matrix bootstrap was a fundamental law of physics, and were persuaded that the bootstrap had a solution when they constructed proto-superstrings. An S-matrix theory doesn’t really leave room for adding new interactions, as became clear in the early seventies with the stringent string consistency conditions, so if it were a fundamental theory of strong interactions only, how would you couple it to electromagnetism or to gravity? The only way is if gravitons and photons show up as certain string modes. Scherk understood how string theory reproduces field theory, so they understood that open strings easily give gauge fields. When they and Yoneya understood that the theory requires a perturbative graviton, they realized that it couldn’t possibly be a theory of hadrons, but must include all interactions, and gravitational compactification gives meaning to the extra dimensions. Thankfully they realized this in 1974, just before S-matrix theory was banished from physics.
Ed Witten (1984): At Princeton in 1984, and everywhere along the East Coast, the Chew bootstrap was as taboo as cold fusion. The bootstrap was tautological new-agey content-free Berkeley physics, and it was justifiably dead. But once Ed Witten understood that string theory cancels gravitational anomalies, this was sufficient to convince him that it was viable. He was aware that supergravity couldn’t get chiral matter on a smooth compactification, and had a hard time fitting good grand-unification groups. Anomaly cancellation is a nontrivial constraint, it means that the theory works consistently in gravitational instantons, and it is hard to imagine a reason it should do that unless it is nonperturbatively consistent.
Everyone else (1985): once they saw Ed Witten was on board, they decided it must be right.
I am exaggerating of course. The discovery of heterotic strings and Calabi-Yau compactifications was important in convincing other people that string theory was phenomenologically viable, which was important. In the Soviet Union, I am pretty sure that Knizhnik believed string theory was the theory of everything, for some deep unknown reasons, although his collaborators weren’t so sure. Polyakov liked strings because of the link between the duality condition and the associativity of the OPE, which he and Kadanoff had shown should be enough to determine critical exponents in phase transitions, but I don’t think he ever fully got on board with the “theory of everything” bandwagon.
The rest of Ron’s answer elaborates on his own conversion story. The interesting part to me is that Ron began by trying to “kill string theory”, and in fact he was very happy that he was going to do so, but then was annoyed by an argument of his colleague that mathematically worked, and in the year or two he spent puzzling over why it worked he had an epiphany that convinced him string theory was correct, which sounds like nonsense to the uninitiated. (This phenomenon where people who gain understanding of the thing become incomprehensible to others sounds a lot like the discussions on LW on enlightenment by the way.)
In pure math, mathematicians seek “morality”, which sounds similar to Ron’s string theory conversion stories above. Eugenia Cheng’s Mathematics, morally argues:
I claim that although proof is what supposedly establishes the undeniable truth of a piece of mathematics, proof doesn’t actually convince mathematicians of that truth. And something else does.
… formal mathematical proofs may be wonderfully watertight, but they are impossible to understand. Which is why we don’t write whole formal mathematical proofs. … Actually, when we write proofs what we have to do is convince the community that it could be turned into a formal proof. It is a highly sociological process, like appearing before a jury of twelve good men-and-true. The court, ultimately, cannot actually know if the accused actually ‘did it’ but that’s not the point; the point is to convince the jury. Like verdicts in court, our ‘sociological proofs’ can turn out to be wrong—errors are regularly found in published proofs that have been generally accepted as true. So much for mathematical proof being the source of our certainty. Mathematical proof in practice is certainly fallible.
But this isn’t the only reason that proof is unconvincing. We can read even a correct proof, and be completely convinced of the logical steps of the proof, but still not have any understanding of the whole. Like being led, step by step, through a dark forest, but having no idea of the overall route. We’ve all had the experience of reading a proof and thinking “Well, I see how each step follows from the previous one, but I don’t have a clue what’s going on!”
And yet… The mathematical community is very good at agreeing what’s true. And even if something is accepted as true and then turns out to be untrue, people agree about that as well. Why? …
Mathematical theories rarely compete at the level of truth. We don’t sit around arguing about which theory is right and which is wrong. Theories compete at some other level, with questions about what the theory “ought” to look like, what the “right” way of doing it is. It’s this other level of ‘ought’ that we call morality. … Mathematical morality is about how mathematics should behave, not just that this is right, this is wrong. Here are some examples of the sorts of sentences that involve the word “morally”, not actual examples of moral things.
“So, what’s actually going on here, morally?” “Well, morally, this proof says...” “Morally, this is true because...” “Morally, there’s no reason for this axiom.” “Morally, this question doesn’t make any sense.” “What ought to happen here, morally?” “This notation does work, but morally, it’s absurd!” “Morally, this limit shouldn’t exist at all” “Morally, there’s something higher-dimensional going on here.”
Beauty/elegance is often the opposite of morality. An elegant proof is often a clever trick, a piece of magic as in Example 6 above, the sort of proof that drives you mad when you’re trying to understand something precisely because it’s so clever that it doesn’t explain anything at all.
Constructiveness is often the opposite of morality as well. If you’re proving the existence of something and you just construct it, you haven’t necessarily explained why the thing exists.
Morality doesn’t mean ‘explanatory’ either. There are so many levels of explaining something. Explanatory to whom? To someone who’s interested in moral reasons. So we haven’t really got anywhere. The same goes for intuitive, obvious, useful, natural and clear, and as Thurston says: “one person’s clear mental image is another person’s intimidation”.
Minimality/efficiency is sometimes the opposite of morality too. Sometimes the most efficient way of proving something is actually the moral way backwards, e.g. quadratics. And the most minimal way of presenting a theory is not necessarily the morally right way. For example, it is possible to show that a group is a set X equipped with one binary operation / satisfying the single axiom: for all x, y, z ∈ X, x/((((x/x)/y)/z)/(((x/x)/x)/z)) = y. The fact that something works is not good enough to be a moral reason.
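For readers who want to see that the single axiom at least "works", here is a minimal sketch that brute-force checks it on one small non-abelian group, S3, reading x/y as x·y⁻¹. The helper names (`compose`, `inverse`, `div`, `axiom_lhs`, `axiom_holds`) are made up for this illustration. It only checks the easy direction (a group satisfies the axiom), not the converse claim that the axiom alone characterizes groups.

```python
from itertools import permutations, product

# Elements of the symmetric group S3, as tuples p where p[i] is the image of i.
S3 = list(permutations(range(3)))

def compose(p, q):
    """Return the permutation 'p after q' (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def div(x, y):
    """Right division, assumed here to mean x / y := x * y^{-1}."""
    return compose(x, inverse(y))

def axiom_lhs(x, y, z):
    """Left-hand side of the axiom: x/((((x/x)/y)/z)/(((x/x)/x)/z))."""
    e = div(x, x)  # x/x is the identity in a group
    return div(x, div(div(div(e, y), z), div(div(e, x), z)))

def axiom_holds(elements):
    """Check the axiom for every triple (x, y, z) drawn from elements."""
    return all(axiom_lhs(x, y, z) == y
               for x, y, z in product(elements, repeat=3))

print(axiom_holds(S3))  # True
```

That the check passes says nothing about why the axiom is the right way to think about groups, which is the point of the quoted passage.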
Polya’s notion of ‘plausible reasoning’ at first sight might seem to fit the bill because it appears to be about how mathematicians decide that something is ‘plausible’ before sitting down to try and prove it. But in fact it’s somewhat probabilistic. This is not the same as a moral reason. It’s more like gathering a lot of evidence and deciding that all the evidence points to one conclusion, without there actually being a reason necessarily. Like in court, having evidence but no motive.
Abstraction perhaps gets closer to morality, along with ‘general’, ‘deep’, ‘conceptual’. But I would say that it’s the search for morality that motivates abstraction, the search for the moral reason motivates the search for greater generalities, depth and conceptual understanding. …
Proof has a sociological role; morality has a personal role. Proof is what convinces society; morality is what convinces us. Brouwer believed that a construction can never be perfectly communicated by verbal or symbolic language; rather it’s a process within the mind of an individual mathematician. What we write down is merely a language for communicating something to other mathematicians, in the hope that they will be able to reconstruct the process within their own mind. When I’m doing maths I often feel like I have to do it twice—once, morally in my head. And then once to translate it into communicable form. The translation is not a trivial process; I am going to encapsulate it as the process of moving from one form of truth to another.
Transmitting beliefs directly is unfeasible, but the question that does leap out of this is: what about the reason? Why don’t I just send the reason directly to X, thus eliminating the two probably hardest parts of this process? The answer is that a moral reason is harder to communicate than a proof. The key characteristic about proof is not its infallibility, not its ability to convince but its transferability. Proof is the best medium for communicating my argument to X in a way which will not be in danger of ambiguity, misunderstanding, or defeat. Proof is the pivot for getting from one person to another, but some translation is needed on both sides. So when I read an article, I always hope that the author will have included a reason and not just a proof, in case I can convince myself of the result without having to go to all the trouble of reading the fiddly proof.
Mathematicians have developed habits of communication that are often dysfunctional. Organizers of colloquium talks everywhere exhort speakers to explain things in elementary terms. Nonetheless, most of the audience at an average colloquium talk gets little of value from it. Perhaps they are lost within the first 5 minutes, yet sit silently through the remaining 55 minutes. Or perhaps they quickly lose interest because the speaker plunges into technical details without presenting any reason to investigate them. At the end of the talk, the few mathematicians who are close to the field of the speaker ask a question or two to avoid embarrassment.
This pattern is similar to what often holds in classrooms, where we go through the motions of saying for the record what we think the students “ought” to learn, while the students are trying to grapple with the more fundamental issues of learning our language and guessing at our mental models. Books compensate by giving samples of how to solve every type of homework problem. Professors compensate by giving homework and tests that are much easier than the material “covered” in the course, and then grading the homework and tests on a scale that requires little understanding. We assume that the problem is with the students rather than with communication: that the students either just don’t have what it takes, or else just don’t care.
Outsiders are amazed at this phenomenon, but within the mathematical community, we dismiss it with shrugs.
Much of the difficulty has to do with the language and culture of mathematics, which is divided into subfields. Basic concepts used every day within one subfield are often foreign to another subfield. Mathematicians give up on trying to understand the basic concepts even from neighboring subfields, unless they were clued in as graduate students.
In contrast, communication works very well within the subfields of mathematics. Within a subfield, people develop a body of common knowledge and known techniques. By informal contact, people learn to understand and copy each other’s ways of thinking, so that ideas can be explained clearly and easily.
Mathematical knowledge can be transmitted amazingly fast within a subfield. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield. It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps days by members of the subfield.
Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.
In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked.
In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.
Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics? Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.
People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it.
Thurston’s personal reflections below on the sociology of proof exemplify the search for mathematical morality instead of fully formally rigorous correctness. I remember being disquieted upon first reading “There were published theorems that were generally known to be false” a long time ago:
When I started as a graduate student at Berkeley, I had trouble imagining how I could “prove” a new and interesting mathematical theorem. I didn’t really understand what a “proof” was.
By going to seminars, reading papers, and talking to other graduate students, I gradually began to catch on. Within any field, there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper, you refer to these without proof. You look at other papers in the field, and you see what facts they quote without proof, and what they cite in their bibliography. You learn from other people some idea of the proofs. Then you’re free to quote the same theorem and cite the same citations. You don’t necessarily have to read the full papers or books that are in your bibliography. Many of the things that are generally known are things for which there may be no known written source. As long as people in the field are comfortable that the idea works, it doesn’t need to have a formal written source.
At first I was highly suspicious of this process. I would doubt whether a certain idea was really established. But I found that I could ask people, and they could produce explanations and proofs, or else refer me to other people or to written sources that would give explanations and proofs. There were published theorems that were generally known to be false, or where the proofs were generally known to be incomplete. Mathematical knowledge and understanding were embedded in the minds and in the social fabric of the community of people thinking about a particular topic. This knowledge was supported by written documents, but the written documents were not really primary.
I think this pattern varies quite a bit from field to field. I was interested in geometric areas of mathematics, where it is often pretty hard to have a document that reflects well the way people actually think. In more algebraic or symbolic fields, this is not necessarily so, and I have the impression that in some areas documents are much closer to carrying the life of the field. But in any field, there is a strong social standard of validity and truth. Andrew Wiles’s proof of Fermat’s Last Theorem is a good illustration of this, in a field which is very algebraic. The experts quickly came to believe that his proof was basically correct on the basis of high-level ideas, long before details could be checked. This proof will receive a great deal of scrutiny and checking compared to most mathematical proofs; but no matter how the process of verification plays out, it helps illustrate how mathematics evolves by rather organic psychological and social processes.
Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person.
The more you know about particle physics and quantum field theory, the more inevitable string theory seems. There are just too many connections. However, identifying the specific form of string theory that corresponds to our universe is more of a challenge, and not just because of the fabled 10^500 vacua (though it could be one of those). We don’t actually know either all the possible forms of string theory, or the right way to think about the physics that we can see. The LHC, with its “unnaturally” light Higgs boson, already mortally wounded a particular paradigm for particle physics (naturalness) which in turn was guiding string phenomenology (i.e. the part of string theory that tries to be empirically relevant). So along with the numerical problem of being able to calculate the properties of a given string vacuum, the conceptual side of string theory and string phenomenology is still wide open for discovery.
I asked a well-known string theorist about the fabled 10^500 vacua, and whether he worried that this would make string theory a vacuous theory, since a theory that fits anything fits nothing. He replied: ‘no, no, the 10^500 “swampland” is a great achievement of string theory; you see… all other theories have infinitely many adjustable parameters’. He was saying string theory was ~1500 bits away from the theory of everything but infinitely ahead of its competitors.
Diabolical.
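As a rough check on the bit count: singling out one vacuum among 10^500 takes log2(10^500) = 500 · log2(10) bits, which is the same order as the figure quoted above. A minimal sketch (the function name `bits_to_select` is made up for illustration):

```python
import math

def bits_to_select(log10_count):
    """Bits needed to single out one option among 10**log10_count."""
    return log10_count * math.log2(10)

print(round(bits_to_select(500)))  # 1661
```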
Much ink has been spilled on the scientific merits and demerits of string theory and its competitors. The educated reader will recognize that all this and more is, of course, once again solved by UDASSA.
Re other theories: I don’t think that all other theories in existence have infinitely many adjustable parameters. If he’s referring to the fact that lots of theories have adjustable parameters that range over the real numbers, which are infinitely complicated in general, then that’s a different issue, and string theory may have it as well.
Re string theory’s issue of being vacuous: I think the core thing string theory predicts that other quantum gravity models don’t is that, at large scales, you recover general relativity and the Standard Model. No other theory has yet found a way to properly include the empirical effects of both gravity and quantum mechanics in the parameter regimes where they are known to work. So string theory predicts more simply by predicting the things other quantum theories predict while being able to include gravity without ruining those predictions, whereas other models of quantum gravity tend to ruin empirical predictions, such as general relativity approximately holding, pretty quickly.
That last part is quite reminiscent of what the late Bill Thurston argued in his classic On proof and progress in mathematics: