IZ BELTAGY (lead research scientist, Allen Institute for AI; chief scientist and co-founder, SpiffyAI): In a day, a lot of the problems that a large percentage of researchers were working on — they just disappeared. …
R. THOMAS MCCOY: It’s reasonably common for a specific research project to get scooped or be eliminated by someone else’s similar thing. But ChatGPT did that to entire types of research, not just specific projects. A lot of higher categories of NLP just became no longer interesting — or no longer practical — for academics to do. …
IZ BELTAGY: I sensed that dread and confusion during EMNLP [Empirical Methods in Natural Language Processing], which is one of the leading conferences. It happened in December, a week after the release of ChatGPT. Everybody was still shocked: “Is this going to be the last NLP conference?” This is actually a literal phrase that someone said. During lunches and cocktails and conversations in the halls, everybody was asking the same question: “What is there that we can work on?”
NAZNEEN RAJANI (founder and CEO, Collinear AI; at the time a Ph.D. student studying with Ray Mooney): I had just given a keynote at EMNLP. A few days after that, Thom Wolf, who was my manager at Hugging Face and also one of the co-founders, messages me, “Hey, can you get on a call with me ASAP?” He told me that they had fired people from the research team and that the rest would either be doing pre-training or post-training — which means that you are either building a foundation model or you’re taking a foundation model and making it an instruction-following model, similar to ChatGPT. And he said, “I recommend you pick one of these two if you want to continue at Hugging Face.”
It didn’t feel like what the Hugging Face culture stood for. Until then, everyone was basically just doing their own research, what they wanted to do. It definitely felt not so good.
CHRISTOPHER CALLISON-BURCH: It helps to have tenure when something like this happens. But younger people were going through this crisis in a more visceral way. Some Ph.D. students literally formed support groups for each other.
LIAM DUGAN: We just kind of commiserated. A lot of Ph.D. students that were further on than me, that had started dissertation work, really had to pivot hard. A lot of these research directions, it’s like there’s nothing intellectual about them left. It’s just, apply the language model and it’s done.
Weirdly enough, nobody [I knew] quit. But there was a bit of quiet quitting. Just kind of dragging your feet or getting very cynical.
Wow. I knew academics were behind / out of the loop / etc. but this surprised me. I imagine these researchers had at least heard about GPT-2 and GPT-3 and the scaling laws papers; I wonder what they thought of them at the time. I wonder what they think now about what they thought at the time.
for anyone not wanting to go in and see the Kafka, I copied some useful examples:
ANNA ROGERS: I was considering making yet another benchmark, but I stopped seeing the point of it. Let’s say GPT-3 either can or cannot continue [generating] these streams of characters. This tells me something about GPT-3, but that’s not actually even a machine learning research question. It’s product testing for free.
JULIAN MICHAEL: There was this term, “API science,’’ that people would use to be like: “We’re doing science on a product? This isn’t science, it’s not reproducible.” And other people were like: “Look, we need to be on the frontier. This is what’s there.”
TAL LINZEN (associate professor of linguistics and data science, New York University; research scientist, Google): For a while people in academia weren’t really sure what to do.
R. THOMAS MCCOY: Are you pro- or anti-LLM? That was in the water very, very much at this time.
JULIE KALLINI (second-year computer science Ph.D. student, Stanford University): As a young researcher, I definitely sensed that there were sides. At the time, I was an undergraduate at Princeton University. I remember distinctly that different people I looked up to — my Princeton research adviser [Christiane Fellbaum] versus professors at other universities — were on different sides. I didn’t know what side to be on.
LIAM DUGAN: You got to see the breakdown of the whole field — the sides coalescing. The linguistic side was not very trusting of raw LLM technology. There’s a side that’s sort of in the middle. And then there’s a completely crazy side that really believed that scaling was going to get us to general intelligence. At the time, I just brushed them off. And then ChatGPT comes out.
+1, GPT-3.5 had been publicly available since January, and GPT-3 was big news two years before and publicly available back then. I’m really surprised that people didn’t understand that these models were a big deal AND then changed their minds when ChatGPT came out. Maybe it’s just a weird preference cascade, where this was enough to break a common false belief?
I remember seeing the ChatGPT announcement and not being particularly impressed or excited, like “okay, it’s a refined version of InstructGPT from almost a year ago. It’s cool that there’s a web UI now, maybe I’ll try it out soon.” November 2022 was a technological advancement but not a huge shift compared to January 2022 IMO
Fair enough. My mental image of the GPT models was stuck on that infernal “talking unicorns” prompt, which I think did make them seem reasonably characterized as mere “stochastic parrots” and “glorified autocompletes,” and the obvious bullshit about the “safety and security concerns” around releasing GPT-2 also led me to conclude the tech was unlikely to amount to much more. InstructGPT wasn’t good enough to get me to update it; that took the much-hyped ChatGPT release.
Was there a particular moment that impressed you, or did you just see the Transformers paper, project that correctly into the future, and the releases that followed since then have just been following that trend you extrapolated and so been unremarkable?
I remember being very impressed by GPT-2. I think I was also quite impressed by GPT-3 even though it was basically just “GPT-2 but better.” To be fair, at the moment that I was feeling unimpressed by ChatGPT, I don’t think I had actually used it yet. It did turn out to be much more useful to me than the GPT-3 API, which I tried out but didn’t find that many uses for.
It’s hard to remember exactly how impressed I was with ChatGPT after using it for a while. I think I hadn’t fully realized how great it could be when the friction of using the API was removed, even if I didn’t update that much on the technical advancement.
The full article discusses the transformer paper (which didn’t have a large influence, as the implications weren’t clear), BERT (which did have a large influence) and GPT-3 (which also had a large influence). I assume the release of ChatGPT was the point where even the last NLP researchers couldn’t ignore LLMs anymore.
ChatGPT was “so good they can’t ignore you”; the Hugging Face anecdote is particularly telling. At some point, everyone else gets tired of waiting for your cargo to land, and will fire you if you don’t get with the program. “You say semantics can never be learned from syntax and you’ve proven that ChatGPT can never be useful? It seems plenty useful to me and everyone else. Figure it out or we’ll find someone who can.”
I think the most interesting part of the Quanta piece is the discussion of the octopus paper, which states that pure language models can’t actually understand text (as they only learn from form/syntax), and the bitter disputes that followed in the NLP community.
From the abstract:
The success of the large neural language models on many NLP tasks is exciting. However, we find that these successes sometimes lead to hype in which these models are being described as “understanding” language or capturing “meaning”. In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. In keeping with the ACL 2020 theme of “Taking Stock of Where We’ve Been and Where We’re Going”, we argue that a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.
(As a side note, Yudkowsky’s broadly verificationist theory of content seems to agree with her distinction: if “understanding” of a statement is knowing what experience would confirm it, or what experience it would predict, then understanding cannot come from syntactic form alone. The association of words and sensory data would be necessary. Did Yudkowsky ever comment on the apparent incompatibility between evident LLM understanding and his anticipated experience theory?)
Of course I assume that now it can hardly be denied that LLMs really do somehow understand text, even if they are merely trained on form. So the octopus paper argument must be wrong somewhere. Though at least in the Quanta piece, Bender doesn’t acknowledge any update of that sort. In fact, in the last quote she says:
I have seen an enormous shift towards end-to-end solutions using chatbots or related synthetic text-extruding machines. And I believe it to be a dead end.
I don’t think there’s any necessary contradiction. Verification or prediction of what? More data. What data? Data. You seem to think there’s some sort of special reality-fluid which JPEGs or MP3s have but .txt files do not, but they don’t; they all share the Buddha-nature.
Consider Bender’s octopus example, where she says that it can’t learn to do anything from watching messages go back and forth. This is obviously false, because we do this all the time; for example, you can teach a LLM to play good chess simply by watching a lot of moves fly by back and forth as people play postal chess. Imitation learning & offline RL are important use-cases of RL and no one would claim it doesn’t work or is impossible in principle.
Can you make predictions and statements which can be verified by watching postal chess games? Of course. Just predict what the next move will be. “I think he will castle, instead of moving the knight.” [later] “Oh no, I was wrong! I anticipated seeing a castling move, and I did not, I saw something else. My beliefs about castling did not pay rent and were not verified by subsequent observations of this game. I will update my priors and do better next time.”
Well, in the chess example we do not have any obvious map/territory relation. Chess seems to be a purely formal game, as the pieces do not seem to refer to anything in the external world. So it’s much less obvious that training on form alone would also work for learning natural language, which does exhibit a map/territory distinction.
For example, a few years ago, most people would have regarded it as highly unlikely that you could understand (decode) an intercepted alien message without any contextual information. But if you can understand text from form alone, as LLMs seem to prove, the message simply has to be long enough. Then you can train an LLM on it, which would then be able to understand the message. And it would also be able to translate it into English if it is additionally trained on English text.
That’s very counterintuitive, or at least it was counterintuitive until recently. I doubt EY meant to count raw words as “anticipated experience”, since “experience” typically refers to sensory data only. (In fact, I think Guessing the Teacher’s Password also suggests that he didn’t.)
To repeat, I don’t blame him, as the proposition that large amounts of raw text can replace sensory data, that a sufficient amount of symbols can ground themselves, was broadly considered unlikely until LLMs came along. But I do blame Bender insofar as she didn’t update even in light of strong evidence that the classical hypothesis (you can’t infer meaning from form alone) was wrong.
Well, in the chess example we do not have any obvious map/territory relation.
Yes, there is. The transcripts are of 10 million games that real humans played to cover the distribution of real games, and then were annotated by Stockfish, to provide superhuman-quality metadata on good vs bad moves. That is the territory. The map is the set of transcripts.
But if you can understand text from form alone, as LLMs seem to prove, the message simply has to be long enough.
I would say ‘diverse enough’, not ‘long enough’. (An encyclopedia will teach a LLM many things; a dictionary the same length, probably not.) Similar to meta-learning vs learning.
the pieces do not seem to refer to anything in the external world.
What external world does our ‘external world’ itself refer to things inside of? If the ‘external world’ doesn’t need its own external world for grounding, then why does lots of text about the external world not suffice? (And if it does, what grounds that external external world, or where does the regress end?) As I like to put it, for an LLM, ‘reality’ is just the largest fictional setting—the one that encompasses all the other fictional settings it reads about from time to time.
As someone who doubtless does quite a lot of reading about things or writing to people you have never seen nor met in real life and have no ‘sensory’ way of knowing that they exist, this is a position you should find sympathetic.
Sympathy or not, the position that meaning of natural language can be inferred from the symbolic form alone wasn’t obvious to me in the past, as this is certainly not how humans learn language, and I don’t know any evidence that someone else thought this plausible before machine learning made it evident. It’s always easy to make something sound obvious after the fact, but that doesn’t mean that it actually was obvious to anyone at the time.
Plenty of linguists and connectionists thought it was possible, if only to show those damned Chomskyans that they were wrong!
To be specific, some of the radical linguists believed in pure distributional semantics, or that there is no semantics beyond syntax. I don’t know anyone in particular, but considering how often Chomsky, Pinker, etc were fighting against the “blank slate” theory, they definitely existed.
The following people likely believed that it is possible to learn a language purely from reading using a general learning architecture like neural networks (blank-slate):
James L. McClelland and David Rumelhart.
They were the main proponents of neural networks in the “past tense debate”. Generally, anyone on the side of neural networks in the past tense debate probably believed this.
B. F. Skinner.
Radical syntacticians? Linguists have failed to settle the question of “Just what is semantics? How is it different from syntax?”, and some linguists have taken the radical position “There is no semantics. Everything is syntax.”. Once that is done, there simply is no difficulty: just learn all the syntax, and there is nothing left to learn.
Possibly some of the participants in the “linguistics wars” believed in it. Specifically, some believed in “generative semantics”, whereby semantics is simply yet more generative grammar, and thus not any different from syntax (also generative grammar). Chomsky, as you might imagine, hated that, and successfully beat it down.
Maybe some people in distributional semantics? Perhaps Leonard Bloomfield? I don’t know enough about the history of linguistics to tell what Bloomfield or the “Bloomfieldians” believed in exactly. However, considering that Chomsky was strongly anti-Bloomfield, it is a fair bet that some Bloomfieldians (or self-styled “neo-Bloomfieldians”) would support blank-slate learning of language, if only to show Chomskyans that they’re wrong.
FYI your ‘octopus paper’ link is to Stochastic Parrots; it should be this link.
Though at least in the Quanta piece, Bender doesn’t acknowledge any update of that sort.
I’ve seen other quotes from Bender & relevant coauthors that suggest they haven’t really updated, which I find fascinating. I’d love to have the opportunity to talk with them about it and understand better how their views have remained consistent despite the evidence that’s emerged since the papers were published.
So the octopus paper argument must be wrong somewhere.
It makes a very intuitively compelling argument! I think that, as with many confusions about the Chinese Room, the problem is that our intuitions fail at the relevant scale. Given an Internet’s worth of discussion of bears and sticks and weapons, the hyper-intelligent octopus’s model of those things is rich enough for the octopus to provide advice about them that would work in the real world, even if it perhaps couldn’t recognize a bear by sight. For example it would know that sticks have a certain distribution of mass, and are the sorts of things that could be bound together by rope (which it knows is available because of the coconut catapult), and that the combined sticks might have enough mass to serve as a weapon, and what amounts of force would be harmful to a bear, etc. But it’s very hard to understand just how rich those models can be when our intuitions are primed by a description of two people casually exchanging messages.
America can pull gigawatts out of thin air through a combination of technology and smartly targeted policy. Let me show you how. …
It is often said that the US electricity grid is under increasing risk of blackouts, price spikes, and other signs of strain. … Most of the time, however, the grid has significantly more power than is needed. This means that the grid can often accommodate, say, a new 750-megawatt data center for the vast majority of the year. There is spare generation capacity available except for those brief periods of high demand. But in the high-demand periods, when all the grid’s generation capacity is required to maintain electricity service, this new data center would require an additional 750 megawatts of electricity generation capacity, and quite possibly also transmission infrastructure upgrades.
This additional investment is only necessary if you assume that the new data center will require all 750 megawatts of electricity during peak-demand periods. Traditionally, this assumption has been true: data center operators rely on extremely high uptime, and grid operators work under the assumption that new electricity demand will be constant during periods of high demand.
If, however, that assumption were not true, and a data center was able to significantly reduce or eliminate its electricity consumption for a small portion of the year (the high-demand period), the calculus changes radically. More power would suddenly become available because the data center can tap into the grid’s existing surplus capacity without requiring investment in net-new capacity on the days when the grid is operating at the limits of its capacity.
How much more power could be unlocked? In a viral paper earlier this year, Tyler Norris and colleagues at Duke University estimated 76 gigawatts if the new users of that power were willing to curtail their electricity demand for 0.25% of the year. In overly simplified terms, this means that America could accommodate 76 gigawatts of new AI data centers today, with no new power generation built, if those data centers were willing to reduce their demand by an average equivalent of roughly 22 hours out of a year.
As it happens, the estimates I trust most about near-term AI-related electricity demand suggest that we will need about 50-75 gigawatts for AI over the coming 5 years—perfectly in line with Norris’ estimates.
And a convenient win-win:
In addition to instantly unlocking more power for AI and other industrial applications, curtailing power at the scale envisioned in the Duke study would achieve other benefits. For example, as Norris observes, more efficient use by industrial customers of existing power generation capacity during non-peak demand periods would result in high utilization rates of existing capital assets, and thus lower prices for consumers.
The result is a win-win for both AI data center operators and average Americans concerned about the affordability and reliability of electricity. The only downside would be that, during periods of peak demand (for example, on a particularly hot day in one region of the country), AI users across America might notice their AI services being slower and less reliable than usual. This seems well worth the cost.
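(A quick back-of-the-envelope check of the “roughly 22 hours” figure above; this is my arithmetic, not the paper’s.)

```python
# 0.25% average annual curtailment, expressed in hours per year
hours_per_year = 365 * 24            # 8,760 hours
curtailment_limit = 0.0025           # 0.25% of annual load
print(hours_per_year * curtailment_limit)   # ≈ 21.9 hours of curtailed demand per year
```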
I do wonder how believable this is, given my personal experience that the more familiar I am with a policy claim’s quantitative modelling, and the more domain knowledge I have about it, the less I trust it, generally speaking...
I came across a similar “hack” on LinkedIn from Tom Steyer:
“California just pulled off the nation’s largest-ever test of a virtual power plant. This successful test proved VPPs are a fast, low-cost, zero-emissions way to make better use of the clean energy we already have — and to keep the lights on as demand surges from data centers, heat waves, and electrification.”
Basically, they are talking about allowing residential batteries to supply the grid during peak demand. I tend to be skeptical about stuff like this because in my own domain, water, there’s a lot of pop science and bold claims that ignore scaling and logistics. I asked a smart fellow in that industry about it (https://substack.com/@energycrystals) and I thought he gave a good answer that aligns with my experience with water, which is that it always comes down to implementation: “The challenge is lining up customer telemetry to incentive structures that matter. With standard demand response products (which some people sell as VPPs), the incentives given to customers don’t pay for the battery, and the software admin and API costs to the utilities outweigh the cost savings of a VPP. VPPs are vaporware until someone can make the business model pencil and the API integration and UX not suck ass.”
So, without knowing more, my prior is that this free capacity is there for a reason, and that utilities aren’t that dumb. On the flip side, I think it’s great that we are thinking this way. Probing our systems and looking for efficiencies is worthwhile. Our legacy infrastructure is a mess of path-dependent bureaucracy, and I’m certain there are gains to be made in addition to new construction.
Whether this is feasible depends on how concentrated that 0.25% of the year is (expected to be), because that determines the size of the battery that you’d need to cover the blackout period (which I think would be unacceptable for a lot of AI customers).
If it happens in a single few days then this makes sense, but buying 22 GWh of batteries for a 1 GW datacenter is still extremely expensive ($2B for a 20-hour system at $100/kWh plus installation, maybe too expensive for reliability alone for a 1 GW datacenter I would expect, assuming maybe $10B revenue from the datacenter??). If it’s much less concentrated in time then a smaller battery is needed ($100M for a 1-hour system at $100/kWh), and I expect AI scalers would happily pay this for the reliability of their systems if the revenue from those datacenters justifies it.
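(A rough sketch of the battery arithmetic in the comment above; the $100/kWh figure and the 20-hour vs. 1-hour sizing are the commenter’s assumptions, and installation and balance-of-system costs are ignored.)

```python
# Cost of a battery sized to carry a datacenter through curtailment at full load,
# using the commenter's assumed $100/kWh (cells only; installation excluded).
DATACENTER_MW = 1000        # a 1 GW facility
COST_PER_KWH = 100          # assumed $/kWh

def battery_cost_usd(hours_of_storage: float) -> float:
    """Dollar cost of storing `hours_of_storage` hours of full-load energy."""
    kwh = DATACENTER_MW * 1000 * hours_of_storage   # MW -> kW, times hours
    return kwh * COST_PER_KWH

print(f"${battery_cost_usd(20):,.0f}")   # ~$2B if curtailment comes in a few long events
print(f"${battery_cost_usd(1):,.0f}")    # ~$100M if events are short (an hour or two each)
```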
Demand response can be done with or without batteries covering the data center’s load. Demand response and batteries can also stack: if the grid is really stressed, a data center can both turn off and discharge its battery into the grid.
Economically, it makes sense to accept some true downtime to avoid months-long delays in data center construction. This is clearly true for training workloads, which are very important but don’t have live demand. But downtime is acceptable even for inference clusters: you can reduce the compute demand by temporarily slowing down token generation, or use dynamic rate limits. And any curtailment would almost certainly be isolated to one region, so inference data centers in other places would still be operational.
In any case, the paper says the curtailments would last about two hours each:
The average duration of load curtailment (i.e., the length of time the new load is curtailed during curtailment events) would be relatively short, at 1.7 hours when average annual load curtailment is limited to 0.25%, 2.1 hours at a 0.5% limit, and 2.5 hours at a 1.0% limit
I’ve heard a rule of thumb that if you can avoid buying power off the California power grid’s spot pricing in the 1% most expensive times, you can roughly get a 50% discount on power.
Style. As a writer of mathematics von Neumann was clear, but not clean; he was powerful but not elegant. He seemed to love fussy detail, needless repetition, and notation so explicit as to be confusing. To maintain a logically valid but perfectly transparent and unimportant distinction, in one paper he introduced an extension of the usual functional notation: along with the standard φ(x) he dealt also with something denoted by φ((x)). The hair that was split to get there had to be split again a little later, and there was φ(((x))), and, ultimately, φ((((x)))). Equations such as
(φ((((a))))^2 = φ(((a))))
have to be peeled before they can be digested; some irreverent students referred to this paper as von Neumann’s onion.
Perhaps one reason for von Neumann’s attention to detail was that he found it quicker to hack through the underbrush himself than to trace references and see what others had done. The result was that sometimes he appeared ignorant of the standard literature. If he needed facts, well-known facts, from Lebesgue integration theory, he waded in, defined the basic notions, and developed the theory to the point where he could use it. If, in a later paper, he needed integration theory again, he would go back to the beginning and do the same thing again. He saw nothing wrong with long strings of suffixes, and subscripts on subscripts; his papers abound in avoidable algebraic computations. The reason, probably, is that he saw the large picture; the trees did not conceal the forest from him. He saw and he relished all parts of the mathematics he was thinking about. He never wrote “down” to an audience; he told it as he saw it. The practice caused no harm; the main result was that, quite a few times, it gave lesser men an opportunity to publish “improvements” of von Neumann.
(tangent: I’m a bit peeved by Halmos’ “lesser men” throwaway remark, mainly because I think interpretive research labor and distillation is very valuable, very hard to do well, somewhat orthogonal to vN-style competence, and very underappreciated and undersupplied.)
von Neumann was also courageous, Halmos wrote, in the following way:
Another notable and enviable trait of von Neumann’s was his mathematical courage. If, in the middle of a search for a counterexample, an infinite series came up, with a lot of exponentials that had quadratic exponents, many mathematicians would start with a clean sheet of paper and look for another counterexample. Not Johnny! When that happened to him, he cheerfully said: “Oh, yes, a theta function...”, and plowed ahead with the mountainous computations. He wasn’t afraid of anything.
More specifically, one thing I learned from Terry that I was not taught in school is the importance of bad proofs. I would say “I think this is true”, work on it, see that there was no nice proof, and give up. Terry would say “Here’s a criterion that eliminates most of the problem. Then in what’s left, here’s a worse one that handles most of the detritus. One or two more epicycles. At that point it comes down to fourteen cases, and I checked them.” Yuck. But we would know it was true, and we would move on. (Usually these would get cleaned up a fair bit before publication.) …
Sometimes we’d really be on the same page, at the same letter of the same word even; one extreme case was when I needed to read his computer code and found it as easy to do as if I’d written it myself. But more often we’d bring different strengths. Since we were working in my field of expertise rather than his, I knew better what the interesting questions were, and could translate them into combinatorics, then sic Terry on them. He would beat them to a bloody death as described above, and then it would be my job to dress the carcass for public viewing back in the original field.
von Neumann also had endless capacity for work. Halmos:
Work habits. Von Neumann was not satisfied with seeing things quickly and clearly; he also worked very hard. His wife said “he had always done his writing at home during the night or at dawn. His capacity for work was practically unlimited.” In addition to his work at home, he worked hard at his office. He arrived early, he stayed late, and he never wasted any time. He was systematic in both large things and small; he was, for instance, a meticulous proofreader. He would correct a manuscript, record on the first page the page numbers where he found errors, and, by appropriate tallies, record the number of errors that he had marked on each of those pages. Another example: when requested to prepare an abstract of not more than 200 words, he would not be satisfied with a statistical check — there are roughly 20 lines with about 10 words each — but he would count every word.
I thought this was striking: why waste time on such seeming trivialities? But I guess if you’re John von Neumann you just have such a glut of brain cycles that you can spend it in ridiculously poorly-optimised ways like this instead of needing to 80⁄20 and still get your many, many jobs done.
I have this experience with @ryan_greenblatt—he’s got an incredible ability to keep really large and complicated argument trees in his head, so he feels much less need to come up with slightly-lossy abstractions and categorizations than e.g. I do. This is part of why his work often feels like huge, mostly unstructured lists. (The lists are more unstructured before his pre-release commenters beg him to structure them more.) (His code often also looks confusing to me, for similar reasons.)
While Dyson’s birds and frogs archetypes of mathematicians are oft-mentioned, David Mumford’s tribes of mathematicians are underappreciated, and I find myself pointing to them often in discussions that devolve into “my preferred kind of math research is better than yours”-type aesthetic arguments:
… the subjective nature and attendant excitement during mathematical activity, including a sense of its beauty, varies greatly from mathematician to mathematician… I think one can make a case for dividing mathematicians into several tribes depending on what most strongly drives them into their esoteric world. I like to call these tribes explorers, alchemists, wrestlers and detectives. Of course, many mathematicians move between tribes and some results are not cleanly the property of one tribe.
Explorers are people who ask—are there objects with such and such properties and if so, how many? They feel they are discovering what lies in some distant mathematical continent and, by dint of pure thought, shining a light and reporting back what lies out there. The most beautiful things for them are the wholly new objects that they discover (the phrase ‘bright shiny objects’ has been in vogue recently) and these are especially sought by a sub-tribe that I call Gem Collectors. Explorers have another sub-tribe that I call Mappers who want to describe these new continents by making some sort of map as opposed to a simple list of ‘sehenswürdigkeiten’.
Alchemists, on the other hand, are those whose greatest excitement comes from finding connections between two areas of math that no one had previously seen as having anything to do with each other. This is like pouring the contents of one flask into another and—something amazing occurs, like an explosion!
Wrestlers are those who are focussed on relative sizes and strengths of this or that object. They thrive not on equalities between numbers but on inequalities, what quantity can be estimated or bounded by what other quantity, and on asymptotic estimates of size or rate of growth. This tribe consists chiefly of analysts, who deal in integrals that measure the size of functions, but people in every field get drawn in.
Finally Detectives are those who doggedly pursue the most difficult, deep questions, seeking clues here and there, sure there is a trail somewhere, often searching for years or decades. These too have a sub-tribe that I call Strip Miners: these mathematicians are convinced that underneath the visible superficial layer, there is a whole hidden layer and that the superficial layer must be stripped off to solve the problem. The hidden layer is typically more abstract, not unlike the ‘deep structure’ pursued by syntactical linguists. Another sub-tribe are the Baptizers, people who name something new, making explicit a key object that has often been implicit earlier but whose significance is clearly seen only when it is formally defined and given a name.
Mumford’s examples of each, both results and mathematicians:
Explorers:
Theaetetus (ancient Greek list of the five Platonic solids)
Ludwig Schläfli (extended the Greek list to regular polytopes in n dimensions)
Bill Thurston (“I never met anyone with anything close to his skill in visualization”)
the list of finite simple groups
Michael Artin (discovered non-commutative rings “lying in the middle ground between the almost commutative area and the truly huge free rings”)
Set theorists (“exploring that most peculiar, almost theological world of ‘higher infinities’”)
Mappers:
Mumford himself
arguably, the earliest mathematicians (the story told by cuneiform surveying tablets)
the Mandelbrot set
Ramanujan’s “integer expressible two ways as a sum of two cubes”
Oscar Zariski, Mumford’s PhD advisor (“his deepest work was showing how the tools of commutative algebra, that had been developed by straight algebraists, had major geometric meaning and could be used to solve some of the most vexing issues of the Italian school of algebraic geometry”)
the Riemann-Roch theorem (“it was from the beginning a link between complex analysis and the geometry of algebraic curves. It was extended by pure algebra to characteristic p, then generalized to higher dimensions by Fritz Hirzebruch using the latest tools of algebraic topology. Then Michael Atiyah and Isadore Singer linked it to general systems of elliptic partial differential equations, thus connecting analysis, topology and geometry at one fell swoop”)
Wrestlers:
Archimedes (“he loved estimating π and concocting gigantic numbers”)
Calculus (“stems from the work of Newton and Leibniz and in Leibniz’s approach depends on distinguishing the size of infinitesimals from the size of their squares which are infinitely smaller”)
Euler’s strange infinite series formulas
Stirling’s formula for the approximate size of n!
Augustin-Louis Cauchy (“his eponymous inequality remains the single most important inequality in math”)
Sergei Sobolev
Shing-Tung Yau
Detectives:
Andrew Wiles is probably the archetypal example
Roger Penrose (“My own way of thinking is to ponder long and, I hope, deeply on problems and for a long time … and I never really let them go.”)
Strip Miners:
Alexander Grothendieck (“the greatest contemporary practitioner of this philosophy in the 20th century… Of all the mathematicians that I have met, he was the one whom I would unreservedly call a “genius”. … He considered that the real work in solving a mathematical problem was to find le niveau juste in which one finds the right statement of the problem at its proper level of generality. And indeed, his radical abstractions of schemes, functors, K-groups, etc. proved their worth by solving a raft of old problems and transforming the whole face of algebraic geometry”)
Leonhard Euler from Switzerland and Carl Friedrich Gauss (“both showed how two dimensional geometry lay behind the algebra of complex numbers”)
Eudoxus and his spiritual successor Archimedes (“the level they reached was essentially that of a rigorous theory of real numbers with which they are able to calculate many specific integrals. Book V in Euclid’s Elements and Archimedes’ The Method of Mechanical Theorems testify to how deeply they dug”)
Aryabhata
Some miscellaneous humorous quotes:
When I was teaching algebraic geometry at Harvard, we used to think of the NYU Courant Institute analysts as the macho guys on the scene, all wrestlers. I have heard that conversely they used the phrase ‘French pastry’ to describe the abstract approach that had leapt the Atlantic from Paris to Harvard.
Besides the Courant crowd, Shing-Tung Yau is the most amazing wrestler I have talked to. At one time, he showed me a quick derivation of inequalities I had sweated blood over and has told me that mastering this skill was one of the big steps in his graduate education. It’s crucial to realize that outside pure math, inequalities are central in economics, computer science, statistics, game theory, and operations research. Perhaps the obsession with equalities is an aberration unique to pure math while most of the real world runs on inequalities.
In many ways [the Detective approach to mathematical research exemplified by e.g. Andrew Wiles] is the public’s standard idea of what a mathematician does: seek clues, pursue a trail, often hitting dead ends, all in pursuit of a proof of the big theorem. But I think it’s more correct to say this is one way of doing math, one style. Many are leery of getting trapped in a quest that they may never fulfill.
Scott Alexander’s Mistakes, Dan Luu’s Major errors on this blog (and their corrections), Gwern’s My Mistakes (last updated 11 years ago), and Nintil’s Mistakes (h/t @Rasool) are the only examples I know of online writers maintaining a dedicated, centralized page solely for cataloging their errors, which I admire. Probably not coincidentally, they’re also among the thinkers I respect the most for repeatedly empirically grounding their reasoning. Some orgs do this too, like 80K’s Our mistakes, CEA’s Mistakes we’ve made, and GiveWell’s Our mistakes.
I’m not convinced Scott Alexander’s mistakes page accurately tracks his mistakes. E.g. the mistake on it I know the most about is this one:
56: (5/27/23) In Raise Your Threshold For Accusing People Of Faking Bisexuality, I cited a study finding that most men’s genital arousal tracked their stated sexual orientation (ie straight men were aroused by women, gay men were aroused by men, bi men were aroused by either), but women’s genital arousal seemed to follow a bisexual pattern regardless of what orientation they thought they were—and concluded that although men’s orientation seemed hard-coded, women’s orientation must be more psychological. But Ozy cites a followup study showing that women (though not men) also show genital arousal in response to chimps having sex, suggesting women’s genital arousal doesn’t track actual attraction and is just some sort of mechanical process triggered by sexual stimuli. I should not have interpreted the results of genital arousal studies as necessarily implying attraction.
But that’s basically wrong. The study found women’s arousal to chimps having sex to be very close to their arousal to nonsexual stimuli, and far below their arousal to sexual stimuli.
You’re welcome :) In particular, your 2015 cause selection essay was, I thought, a particularly high-quality writeup of the end-to-end process from personal values to actual donation choice and (I appreciated this) where you were most likely to change your mind, so I recommended it to a few folks as well as used it as a template myself back in the day.
In general I think theory-practice gap bridging via writeups like those are undersupplied, especially the end-to-end ones — more writeups bridge parts of the “pipeline”, but “full pipeline integration” done well is rare and underappreciated, which combined with how effortful it is to do it makes me not surprised there isn’t more of it.
in the interviews I’ve read with Soviet mathematicians and scientists, the things that come up over and over again are “mathematical circles,” a practice that originated in the pre-revolutionary Russian Empire and then spread far and wide through the Soviet Union. A mathematical circle is an informal group of teenagers and adults who really enjoy math and want to spend a lot of time thinking and talking about it. They’re a little bit like sports teams, in that they develop their own high-intensity internal culture and camaraderie, and often have a “coach” who is especially talented or famous. But they’re also very unlike sports teams, because they don’t compete with each other or play in leagues or anything like that, and usually any given circle will contain members of widely varying skill levels. Maybe a better analogy is a neighborhood musical ensemble that gets together and jams on a regular basis, but for math.
The most important thing to understand about mathematical circles is that the math they jam on is completely unlike the math you study in school, and also completely unlike the “competition” math that bright kids in the United States sometimes do. Both school math and competition math are primarily comprised of exercises. An exercise is a question concocted by a human being for a didactic purpose. Any bright kid with any amount of genre-savviness can immediately make a few assumptions upon being assigned an exercise. He or she can guess that the exercise is solvable in fewer than five minutes with the appropriate techniques, and that it is related to the material in the current chapter of the book. A clever student can often use psychological techniques to reverse-engineer what the teacher or the designer of the standardized test was trying to get at with the exercise, and answer it through a process of elimination or savvy guessing or pattern matching.
Solving an exercise is like hunting a neutered zoo animal. It may be a low-stress environment for polishing particular aspects of your technique, but it will not help you to survive in the wilderness. For that, you need to see people solving problems. A problem is a question of interest that comes up when somebody is trying to do something real. A problem may not be solvable by you, or by your coach, or by any human being. Even if the problem is solvable, it may require weeks or months of dedicated, painful pursuit. It may not be obvious what techniques are required to solve a problem, they may not be techniques that you know, or it may require a surprising combination of techniques. The problem is mathematical nature red in tooth and claw. There are no guardrails. There are no hints or answers at the back of the book. There is no book. It may eat you.
The bread and butter of the mathematical circle is solving problems together, as a team. There is no time here for exercises; you can do that lame stuff at school. Sometimes the coach picks a problem for you, something just beyond your ability, just the thing you need to hone your edge. But sometimes the whole circle works together on a problem that nobody has the answer to and that challenges the very best members. These problems are the most important, because with them you see great minds, men older and more talented than you, stretched to the breaking point and occasionally beaten. You see them grind and grind and try every possible attack on a problem and sometimes lose anyway. And you see them not run from being defeated, but cheerfully charge in again, because losing is good for you, losing is how you know you’ve picked an opponent worthy of a man. You learn to love things that are hard. And occasionally you win, and when you win it feels like you all win, like humanity wins, because you’re all in it together, all doing something beautiful and dangerous and exemplary of the best qualities that human beings have.
There are also times when everybody is too tired to work on a problem, and in those moments of recuperation, it’s the coach’s job to tell stories of legendary problems of the past and of the mathematicians who slew them. These stories often contain lessons, inspiration, or perspective on how mathematics evolved and got to be the way it is. Human history would look very different, after all, without the brachistochrone problem or the roots of a quintic polynomial problem or the icosahedron problem or the precession of Mercury’s perihelion problem. But other times there’s no hidden lesson, no grand perspective on the human story. They’re just ripping good yarns, and hearing them is a process of initiation into mathematical folklore, because every culture (and mathematics is surely a culture) has shared stories and references and inside jokes, even when they’re purely for fun.
You can start math circles really really young:
This book is the story of one such mathematical circle. But it’s an unusual one because…it’s for preschoolers.
The “coach” of this circle is Alexander Zvonkin, a professional mathematician frustrated that his kids are having all the wonder and life and joy crushed out of them by the grey functionaries at their school. So he starts a circle for his son Dmitry and a few of the neighbors’ kids, most of whom are around three or four years old. That’s young enough that according to Piaget’s experiments there are cognitive modules related to number and volume that simply haven’t come online yet. Fortunately, Zvonkin is familiar with the latest research on developmental psychology, and turns lemons into lemonade by using the kids’ lack of numerical intuition to introduce them to some pretty deep ideas about when two sets have equal cardinality. (If you’re curious, he talks more about these experiments in this journal article.)
At this point I expect you are rolling your eyes, especially if you have experience with three-year-olds. It can be difficult enough to get them to sit still, never mind ponder deep questions about the cardinalities of sets. And what exactly does it look like to pit somebody against a problem who is barely potty-trained? This is where the genius of Zvonkin’s format kicks in — it’s not really a book, it’s a journal, and one that is barely edited. So it’s full of failure after failure, entries like, “today I had a cool idea for a puzzle but everybody just screamed instead and then one of the kids vomited.” And yet, slowly, wondrously, over the four years of the circle’s existence, his patience pays off and the kids start doing really incredible things.
(Sadly I only learned of the existence of math circles well after graduation, a few years ago when I used to spend more time on Quora and noticed that Alon Amit, the most respected writer on math topics and someone who’d done many interesting things in his life, described himself simply as a “mathcircler”.)
I feel like we perhaps need to reach some “escape velocity” to get something like that going, but for ~rationality / deliberately figuring out how to think and act better.
Peter Watts is working with Neill Blomkamp to adapt his novel Blindsight into an 8-10-episode series:
“I can at least say the project exists, now: I’m about to start writing an episodic treatment for an 8-10-episode series adaptation of my novel Blindsight.
“Neill and I have had a long and tortured history with that property. When he first expressed interest, the rights were tied up with a third party. We almost made it work regardless; Neill was initially interested in doing a movie that wasn’t set in the Blindsight universe at all, but which merely used the speculative biology I’d invented to justify the existence of Blindsight’s vampires. “Sicario with Vampires” was Neill’s elevator pitch, and as chance would have it the guys who had the rights back then had forgotten to renew them. So we just hunkered quietly until those rights expired, and the recently-rights-holding parties said Oh my goodness we thought we’d renewed those already can we have them back? And I said, Sure; but you gotta carve out this little IP exclusion on the biology so Neill can do his vampire thing.
“It seemed like a good idea at the time. It was a good idea, dammit. We got the carve-out and everything. But then one of innumerable dead-eyed suits didn’t think it was explicit enough, and the rights-holders started messing us around, and what looked like a done deal turned to ash. We lost a year or more on that account.
“But eventually the rights expired again, for good this time. And there was Neill, waiting patiently in the shadows to pounce. So now he’s developing both his Sicario-with-vampires movie and an actual Blindsight adaptation. I should probably keep the current status of those projects private for the time being. Neill’s cool with me revealing the existence of the Blindsight adaptation at least, and he’s long-since let the cat out of the bag for his vampire movie (although that was with some guy called Joe Rogan, don’t know how many people listen to him). But the stage of gestation, casting, and all those granular nuts and bolts are probably best kept under wraps for the moment.
“What I can say, though, is that it feels as though the book has been stuck in option limbo forever, never even made it to Development Hell, unless you count a couple of abortive screenplays. And for the first time, I feel like something’s actually happening. Stay tuned.”
When I first read Blindsight over a decade ago it blew my brains clean out of my skull. I’m cautiously optimistic about the upcoming series, we’ll see…
Blindsight was very well written but based on a premise that I think is importantly and dangerously wrong. That premise is that consciousness (in the sense of cognitive self-awareness) is not important for complex cognition.
This is the opposite of true, and a failure to recognize this is why people are predicting fantastic tool AI that doesn’t become self-aware and goal-directed.
The proof won’t fit in the margin unfortunately. To just gesture in that direction: it is possible to do complex general cognition without being able to think about one’s self and one’s cognition. It is much easier to do complex general cognition if the system is able to think about itself and its own thoughts.
Just checking if I understood your argument: is the general point that an algorithm that can think about literally everything is simpler and therefore easier to make or evolve than an algorithm that can think about literally everything except for itself and how other agents perceive it?
I’d go a bit farther and say it’s easier to develop an algorithm that can think about literally everything than one that can think about roughly half of things. That’s because the easiest general intelligence algorithms are about learning and reasoning, which apply to everything.
I don’t see where you get that. I saw no suggestion that the aliens (or vampires) in Blindsight were unaware of their own existence, or that they couldn’t think about their own interactions with the world. They didn’t lack any cognitive capacities at all. They just had no qualia, and therefore didn’t see the point of doing anything just for the experience.
There’s a gigantic difference between cognitive self-awareness and conscious experience.
I believe the Scramblers from Blindsight weren’t self-aware, which means they couldn’t think about their own interactions with the world.
As I recall, the crew was giving one of the Scramblers a series of cognitive tests. It aced all the tests that had to do with numbers and spatial reasoning, but failed a test that required the testee to be self-aware.
I guess it depends on how it’s described in context. And I have to admit it’s been a long time. I’d go reread it to see, but I don’t think I can handle any more bleakness right now...
Whenever I find my will to live becoming too strong, I read Peter Watts. —James Nicoll
it is possible to do complex general cognition without being able to think about one’s self and one’s cognition. It is much easier to do complex general cognition if the system is able to think about itself and its own thoughts.
I can see this making sense in one frame, but not in another. The frame which seems most strongly to support the ‘Blindsight’ idea is Friston’s stuff—specifically how the more successful we are at minimizing predictive error, the less conscious we are.[1]
My general intuition, in this frame, is that as intelligence increases more behaviour becomes automatic/subconscious. It seems compatible with your view that a superintelligent system would possess consciousness, but that most/all of its interactions with us would be subconscious.
Would like to hear more about this point, could update my views significantly. Happy for you to just state ‘this because that, read X, Y, Z etc’ without further elaboration—I’m not asking you to defend your position, so much as I’m looking for more to read on it.
But Watts lists a whole bunch of papers in support of the blindsight idea, contra Seth’s claim — to quote Watts:
“In fact, the nonconscious mind usually works so well on its own that it actually employs a gatekeeper in the anterior cingulate cortex to do nothing but prevent the conscious self from interfering in daily operations”
footnotes: Matsumoto, K., and K. Tanaka. 2004. Conflict and Cognitive Control. Science 303: 969-970; Kerns, J.G., et al. 2004. Anterior Cingulate Conflict Monitoring and Adjustments in Control. Science 303: 1023-1026; Petersen, S.E., et al. 1998. The effects of practice on the functional anatomy of task performance. Proceedings of the National Academy of Sciences 95: 853-860
“Compared to nonconscious processing, self-awareness is slow and expensive”
footnote: Matsumoto and Tanaka above
“The cost of high intelligence has even been demonstrated by experiments in which smart fruit flies lose out to dumb ones when competing for food”
footnote: Proceedings of the Royal Society of London B (DOI 10.1098/rspb.2003.2548)
“By way of comparison, consider the complex, lightning-fast calculations of savantes; those abilities are noncognitive, and there is evidence that they owe their superfunctionality not to any overarching integration of mental processes but to relative neurological fragmentation”
footnotes: Treffert, D.A., and G.L. Wallace. 2004. Islands of genius. Scientific American 14: 14-23; Anonymous. 2004. Autism: making the connection. The Economist, 372(8387): 66
“Even if sentient and nonsentient processes were equally efficient, the conscious awareness of visceral stimuli—by its very nature— distracts the individual from other threats and opportunities in its environment”
“Chimpanzees have a higher brain-to-body ratio than orangutans, yet orangs consistently recognise themselves in mirrors while chimps do so only half the time”
footnotes: Aiello, L., and C. Dean. 1990. An introduction to human evolutionary anatomy. Academic Press, London; Gallup, G.G. (Jr.). 1997. On the rise and fall of self-conception in primates. In The Self Across Psychology — self-recognition, self-awareness, and the Self Concept. Annals of the NY Acad. Sci. 818: 4-17
“it turns out that the unconscious mind is better at making complex decisions than is the conscious mind”
footnote: Dijksterhuis, A., et al. 2006. Science 311:1005-1007
To be clear I’m not arguing that “look at all these sources, it must be true!” (we know that kind of argument doesn’t work). I’m hoping for somewhat more object-level counterarguments is all, or perhaps a better reason to dismiss them as being misguided (or to dismiss the picture Watts paints using them) than what Seth gestured at. I’m guessing he meant “complex general cognition” to point to something other than pure raw problem-solving performance.
What fraction of economically-valuable cognitive labor is already being automated today? How has that changed over time, especially recently?
I notice I’m confused about these ostensibly extremely basic questions, which arose in reading Open Phil’s old CCF-takeoff report, whose main metric is “time from AI that could readily[2] automate 20% of cognitive tasks to AI that could readily automate 100% of cognitive tasks”. A cursory search of Epoch’s data, Metaculus, and this forum didn’t turn up anything, but I didn’t spend much time at all doing so.
I was originally motivated by wanting to empirically understand recursive AI self-improvement better, which led to me stumbling upon the CAIS paper Examples of AI Improving AI, but I don’t have any sense whatsoever of how the paper’s 39 examples as of Oct-2023 translate to OP’s main metric even after constraining “cognitive tasks” in its operational definition to just AI R&D.
A survey was administered to attendees of three AI conferences during the summer of 2018 (ICML, IJCAI and the HLAI conference). The survey included questions for estimating AI capabilities over the next decade, questions for forecasting five scenarios of transformative AI and questions concerning the impact of computational resources in AI research. Respondents indicated a median of 21.5% of human tasks (i.e., all tasks that humans are currently paid to do) can be feasibly automated now, and that this figure would rise to 40% in 5 years and 60% in 10 years
which would suggest that OP’s clock should’ve started ticking in 2018, so that incorporating CCF-takeoff author Tom Davidson’s “~50% to a <3 year takeoff and ~80% to <10 year i.e. time from 20%-AI to 100%-AI, for cognitive tasks in the global economy” means takeoff should’ve already occurred… so I’m dismissing this survey’s relevance to my question (sorry).
I’m mainly wondering how Open Phil, and really anyone who uses fraction of economically-valuable cognitive labor automated / automatable (e.g. the respondents to that 2018 survey; some folks on the forum) as a useful proxy for thinking about takeoff, tracks this proxy as a way to empirically ground their takeoff-related reasoning. If you’re one of them, I’m curious if you’d answer your own question in the affirmative?
I am not one of them—I was wondering the same thing, and was hoping you had a good answer.
If I was trying to answer this question, I would probably try to figure out what fraction of all economically-valuable labor each year was cognitive, the breakdown of which tasks comprise that labor, and the year-on-year productivity increases on those tasks, then use that to compute the percentage of economically-valuable labor that is being automated that year.
Concretely, to get a number for the US in 1900 I might use a weighted average of productivity increases across cognitive tasks in 1900, in an approach similar to how CPI is computed:
Look at the occupations listed in the 1900 census records
Figure out which ones are common, and then sample some common ones and make wild guesses about what those jobs looked like in 1900
Classify those tasks as cognitive or non-cognitive
Come to estimate that record-keeping tasks are around a quarter to a half of all cognitive labor
Notice that typewriters were starting to become more popular - about 100,000 typewriters sold per year
Note that those 100k typewriters were going to the people who would save the most time by using them
As such, estimate 1-2% productivity growth in record-keeping tasks in 1900
Multiply the productivity growth for record-keeping tasks by the fraction of cognitive-labor time spent on them (technically the automated share is 1 − 1/(productivity increase), but when the increase is small that correction doesn’t matter much)
Estimate that 0.5% of cognitive labor was automated by specifically typewriters in 1900
Figure that’s about half of all cognitive labor automation in 1900
and thus I would estimate ~1% of all cognitive labor was automated in 1900. By the same methodology I would probably estimate closer to 5% for 2024.
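To make the arithmetic in those steps concrete, here is a minimal sketch in Python. Every input number is one of the wild guesses from the list above (record-keeping’s share of cognitive labor, the typewriter-driven productivity gain, typewriters being about half of all automation), not data.

```python
# Toy Fermi estimate of the fraction of cognitive labor automated in 1900,
# following the steps above. Every input is a guess from the list, not data.

def automated_share(productivity_gain: float, task_share: float) -> float:
    """Share of all cognitive labor automated by one tool.

    A tool that multiplies productivity on a task by (1 + g) replaces a
    fraction 1 - 1/(1 + g) of that task's former labor (roughly g when g is
    small); weight that by the task's share of all cognitive labor.
    """
    return (1 - 1 / (1 + productivity_gain)) * task_share

record_keeping_share = 0.35   # guess: record-keeping ~ a quarter to a half of cognitive labor
typewriter_gain = 0.015       # guess: ~1-2% productivity growth from typewriters

typewriters = automated_share(typewriter_gain, record_keeping_share)
total_1900 = typewriters * 2  # guess: typewriters were ~half of all cognitive automation in 1900

print(f"Typewriters alone: {typewriters:.2%} of cognitive labor automated")
print(f"All automation in 1900: ~{total_1900:.1%}")
```

Running this reproduces the ~0.5% typewriter figure and the ~1% total from the steps above; swapping in 2024-flavored guesses is left to the reader.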
Again, though, I am not associated with Open Phil and am not sure if they think about cognitive task automation in the same way.
I chose to study physics in undergrad because I wanted to “understand the universe” and naively thought string theory was the logically correct endpoint of this pursuit, and was only saved from that fate by not being smart enough to get into a good grad school. Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person. In that regard I find anecdotes like the following by Ron Maimon on Physics SE interesting — the reason string theorists believe isn’t the same as what they tell people, so it’s better to ask for their conversion stories:
I think that it is better to ask for a compelling argument that the physics of gravity requires a string theory completion, rather than a mathematical proof, which would be full of implicit assumptions anyway. The arguments people give in the literature are not the same as the personal reasons that they believe the theory, they are usually just stories made up to sound persuasive to students or to the general public. They fall apart under scrutiny. The real reasons take the form of a conversion story, and are much more subjective, and much less persuasive to everyone except the story teller. Still, I think that a conversion story is the only honest way to explain why you believe something that is not conclusively experimentally established.
Some famous conversion stories are:
Scherk and Schwarz (1974): They believed that the S-matrix bootstrap was a fundamental law of physics, and were persuaded that the bootstrap had a solution when they constructed proto-superstrings. An S-matrix theory doesn’t really leave room for adding new interactions, as became clear in the early seventies with the stringent string consistency conditions, so if it were a fundamental theory of strong interactions only, how would you couple it to electromagnetism or to gravity? The only way is if gravitons and photons show up as certain string modes. Scherk understood how string theory reproduces field theory, so they understood that open strings easily give gauge fields. When they and Yoneya understood that the theory requires a perturbative graviton, they realized that it couldn’t possibly be a theory of hadrons, but must include all interactions, and gravitational compactification gives meaning to the extra dimensions. Thankfully they realized this in 1974, just before S-matrix theory was banished from physics.
Ed Witten (1984): At Princeton in 1984, and everywhere along the East Coast, the Chew bootstrap was as taboo as cold fusion. The bootstrap was tautological new-agey content-free Berkeley physics, and it was justifiably dead. But once Ed Witten understood that string theory cancels gravitational anomalies, this was sufficient to convince him that it was viable. He was aware that supergravity couldn’t get chiral matter on a smooth compactification, and had a hard time fitting good grand-unification groups. Anomaly cancellation is a nontrivial constraint, it means that the theory works consistently in gravitational instantons, and it is hard to imagine a reason it should do that unless it is nonperturbatively consistent.
Everyone else (1985): once they saw Ed Witten was on board, they decided it must be right.
I am exaggerating of course. The discovery of heterotic strings and Calabi Yau compactifications was important in convincing other people that string theory was phenomenologically viable, which was important. In the Soviet Union, I am pretty sure that Knizhnik believed string theory was the theory of everything, for some deep unknown reasons, although his collaborators weren’t so sure. Polyakov liked strings because of the link between the duality condition and the associativity of the OPE, which he and Kadanoff had shown should be enough to determine critical exponents in phase transitions, but I don’t think he ever fully got on board with the “theory of everything” bandwagon.
The rest of Ron’s answer elaborates on his own conversion story. The interesting part to me is that Ron began by trying to “kill string theory”, and was in fact very happy that he was going to do so, but then he was annoyed by an argument of a colleague that mathematically worked. In the year or two he spent puzzling over why it worked, he had an epiphany that convinced him string theory was correct, which sounds like nonsense to the uninitiated. (This phenomenon, where people who gain understanding of the thing become incomprehensible to others, sounds a lot like the discussions on LW on enlightenment, by the way.)
In pure math, mathematicians seek “morality”, which sounds similar to Ron’s string theory conversion stories above. Eugenia Cheng’s Mathematics, morally argues:
I claim that although proof is what supposedly establishes the undeniable truth of a piece of mathematics, proof doesn’t actually convince mathematicians of that truth. And something else does.
… formal mathematical proofs may be wonderfully watertight, but they are impossible to understand. Which is why we don’t write whole formal mathematical proofs. … Actually, when we write proofs what we have to do is convince the community that it could be turned into a formal proof. It is a highly sociological process, like appearing before a jury of twelve good men-and-true. The court, ultimately, cannot actually know if the accused actually ‘did it’ but that’s not the point; the point is to convince the jury. Like verdicts in court, our ‘sociological proofs’ can turn out to be wrong—errors are regularly found in published proofs that have been generally accepted as true. So much for mathematical proof being the source of our certainty. Mathematical proof in practice is certainly fallible.
But this isn’t the only reason that proof is unconvincing. We can read even a correct proof, and be completely convinced of the logical steps of the proof, but still not have any understanding of the whole. Like being led, step by step, through a dark forest, but having no idea of the overall route. We’ve all had the experience of reading a proof and thinking “Well, I see how each step follows from the previous one, but I don’t have a clue what’s going on!”
And yet… The mathematical community is very good at agreeing what’s true. And even if something is accepted as true and then turns out to be untrue, people agree about that as well. Why? …
Mathematical theories rarely compete at the level of truth. We don’t sit around arguing about which theory is right and which is wrong. Theories compete at some other level, with questions about what the theory “ought” to look like, what the “right” way of doing it is. It’s this other level of ‘ought’ that we call morality. … Mathematical morality is about how mathematics should behave, not just that this is right, this is wrong. Here are some examples of the sorts of sentences that involve the word “morally”, not actual examples of moral things.
“So, what’s actually going on here, morally?” “Well, morally, this proof says...” “Morally, this is true because...” “Morally, there’s no reason for this axiom.” “Morally, this question doesn’t make any sense.” “What ought to happen here, morally?” “This notation does work, but morally, it’s absurd!” “Morally, this limit shouldn’t exist at all” “Morally, there’s something higher-dimensional going on here.”
Beauty/elegance is often the opposite of morality. An elegant proof is often a clever trick, a piece of magic as in Example 6 above, the sort of proof that drives you mad when you’re trying to understand something precisely because it’s so clever that it doesn’t explain anything at all.
Constructiveness is often the opposite of morality as well. If you’re proving the existence of something and you just construct it, you haven’t necessarily explained why the thing exists.
Morality doesn’t mean ‘explanatory’ either. There are so many levels of explaining something. Explanatory to whom? To someone who’s interested in moral reasons. So we haven’t really got anywhere. The same goes for intuitive, obvious, useful, natural and clear, and as Thurston says: “one person’s clear mental image is another person’s intimidation”.
Minimality/efficiency is sometimes the opposite of morality too. Sometimes the most efficient way of proving something is actually the moral way backwards. eg quadratics. And the most minimal way of presenting a theory is not necessarily the morally right way. For example, it is possible to show that a group is a set X equipped with one binary operation / satisfying the single axiom for all x, y, z ∈ X, (x/((((x/x)/y)/z)/(((x/x)/x)/z))) = y. The fact that something works is not good enough to be a moral reason.
Polya’s notion of ‘plausible reasoning’ at first sight might seem to fit the bill because it appears to be about how mathematicians decide that something is ‘plausible’ before sitting down to try and prove it. But in fact it’s somewhat probabilistic. This is not the same as a moral reason. It’s more like gathering a lot of evidence and deciding that all the evidence points to one conclusion, without there actually being a reason necessarily. Like in court, having evidence but no motive.
Abstraction perhaps gets closer to morality, along with ‘general’, ‘deep’, ‘conceptual’. But I would say that it’s the search for morality that motivates abstraction, the search for the moral reason motivates the search for greater generalities, depth and conceptual understanding. …
Proof has a sociological role; morality has a personal role. Proof is what convinces society; morality is what convinces us. Brouwer believed that a construction can never be perfectly communicated by verbal or symbolic language; rather it’s a process within the mind of an individual mathematician. What we write down is merely a language for communicating something to other mathematicians, in the hope that they will be able to reconstruct the process within their own mind. When I’m doing maths I often feel like I have to do it twice—once, morally in my head. And then once to translate it into communicable form. The translation is not a trivial process; I am going to encapsulate it as the process of moving from one form of truth to another.
Transmitting beliefs directly is unfeasible, but the question that does leap out of this is: what about the reason? Why don’t I just send the reason directly to X, thus eliminating the two probably hardest parts of this process? The answer is that a moral reason is harder to communicate than a proof. The key characteristic about proof is not its infallibility, not its ability to convince but its transferability. Proof is the best medium for communicating my argument to X in a way which will not be in danger of ambiguity, misunderstanding, or defeat. Proof is the pivot for getting from one person to another, but some translation is needed on both sides. So when I read an article, I always hope that the author will have included a reason and not just a proof, in case I can convince myself of the result without having to go to all the trouble of reading the fiddly proof.
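As an aside on Cheng’s single-axiom example: a quick way to convince yourself that ordinary groups really do satisfy that strange identity (reading x/y as x·y⁻¹) is to brute-force it over a small group. The sketch below checks it over S₃; of course it only verifies the easy direction, not the harder claim that the axiom alone implies the group laws.

```python
# Brute-force check of Cheng's single division axiom
#   x / ((((x/x)/y)/z) / (((x/x)/x)/z)) == y
# in an ordinary group, reading x/y as x * y^(-1). The group here is S3,
# with permutations represented as tuples.

from itertools import permutations, product

def compose(p, q):
    # (p o q)(i) = p[q[i]]
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def div(x, y):
    # x / y := x * y^(-1)
    return compose(x, inverse(y))

S3 = list(permutations(range(3)))

for x, y, z in product(S3, repeat=3):
    lhs = div(x, div(div(div(div(x, x), y), z), div(div(div(x, x), x), z)))
    assert lhs == y

print("Axiom holds for all", len(S3) ** 3, "triples in S3.")
```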
Mathematicians have developed habits of communication that are often dysfunctional. Organizers of colloquium talks everywhere exhort speakers to explain things in elementary terms. Nonetheless, most of the audience at an average colloquium talk gets little of value from it. Perhaps they are lost within the first 5 minutes, yet sit silently through the remaining 55 minutes. Or perhaps they quickly lose interest because the speaker plunges into technical details without presenting any reason to investigate them. At the end of the talk, the few mathematicians who are close to the field of the speaker ask a question or two to avoid embarrassment.
This pattern is similar to what often holds in classrooms, where we go through the motions of saying for the record what we think the students “ought” to learn, while the students are trying to grapple with the more fundamental issues of learning our language and guessing at our mental models. Books compensate by giving samples of how to solve every type of homework problem. Professors compensate by giving homework and tests that are much easier than the material “covered” in the course, and then grading the homework and tests on a scale that requires little understanding. We assume that the problem is with the students rather than with communication: that the students either just don’t have what it takes, or else just don’t care.
Outsiders are amazed at this phenomenon, but within the mathematical community, we dismiss it with shrugs.
Much of the difficulty has to do with the language and culture of mathematics, which is divided into subfields. Basic concepts used every day within one subfield are often foreign to another subfield. Mathematicians give up on trying to understand the basic concepts even from neighboring subfields, unless they were clued in as graduate students.
In contrast, communication works very well within the subfields of mathematics. Within a subfield, people develop a body of common knowledge and known techniques. By informal contact, people learn to understand and copy each other’s ways of thinking, so that ideas can be explained clearly and easily.
Mathematical knowledge can be transmitted amazingly fast within a subfield. When a significant theorem is proved, it often (but not always) happens that the solution can be communicated in a matter of minutes from one person to another within the subfield. The same proof would be communicated and generally understood in an hour talk to members of the subfield. It would be the subject of a 15- or 20-page paper, which could be read and understood in a few hours or perhaps days by members of the subfield.
Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.
In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked.
In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.
Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics? Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.
People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it.
Thurston’s personal reflections below on the sociology of proof exemplify the search for mathematical morality instead of fully formally rigorous correctness. I remember being disquieted upon first reading “There were published theorems that were generally known to be false” a long time ago:
When I started as a graduate student at Berkeley, I had trouble imagining how I could “prove” a new and interesting mathematical theorem. I didn’t really understand what a “proof” was.
By going to seminars, reading papers, and talking to other graduate students, I gradually began to catch on. Within any field, there are certain theorems and certain techniques that are generally known and generally accepted. When you write a paper, you refer to these without proof. You look at other papers in the field, and you see what facts they quote without proof, and what they cite in their bibliography. You learn from other people some idea of the proofs. Then you’re free to quote the same theorem and cite the same citations. You don’t necessarily have to read the full papers or books that are in your bibliography. Many of the things that are generally known are things for which there may be no known written source. As long as people in the field are comfortable that the idea works, it doesn’t need to have a formal written source.
At first I was highly suspicious of this process. I would doubt whether a certain idea was really established. But I found that I could ask people, and they could produce explanations and proofs, or else refer me to other people or to written sources that would give explanations and proofs. There were published theorems that were generally known to be false, or where the proofs were generally known to be incomplete. Mathematical knowledge and understanding were embedded in the minds and in the social fabric of the community of people thinking about a particular topic. This knowledge was supported by written documents, but the written documents were not really primary.
I think this pattern varies quite a bit from field to field. I was interested in geometric areas of mathematics, where it is often pretty hard to have a document that reflects well the way people actually think. In more algebraic or symbolic fields, this is not necessarily so, and I have the impression that in some areas documents are much closer to carrying the life of the field. But in any field, there is a strong social standard of validity and truth. Andrew Wiles’s proof of Fermat’s Last Theorem is a good illustration of this, in a field which is very algebraic. The experts quickly came to believe that his proof was basically correct on the basis of high-level ideas, long before details could be checked. This proof will receive a great deal of scrutiny and checking compared to most mathematical proofs; but no matter how the process of verification plays out, it helps illustrate how mathematics evolves by rather organic psychological and social processes.
Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person.
The more you know about particle physics and quantum field theory, the more inevitable string theory seems. There are just too many connections. However, identifying the specific form of string theory that corresponds to our universe is more of a challenge, and not just because of the fabled 10^500 vacua (though it could be one of those). We don’t actually know either all the possible forms of string theory, or the right way to think about the physics that we can see. The LHC, with its “unnaturally” light Higgs boson, already mortally wounded a particular paradigm for particle physics (naturalness) which in turn was guiding string phenomenology (i.e. the part of string theory that tries to be empirically relevant). So along with the numerical problem of being able to calculate the properties of a given string vacuum, the conceptual side of string theory and string phenomenology is still wide open for discovery.
I asked a well-known string theorist about the fabled 10^500 vacua: did he worry that this would make string theory a vacuous theory, since a theory that fits anything fits nothing? He replied, “no, no, the 10^500 ‘swampland’ is a great achievement of string theory—you see… all other theories have infinitely many adjustable parameters”. He was saying string theory was about ~1500 bits away from the theory of everything but infinitely ahead of its competitors.
Diabolical.
Much ink has been spilled on the scientific merits and demerits of string theory and its competitors. The educated reader will recognize that all this and more is, of course, once again, solved by UDASSA.
Re other theories: I don’t think all other theories in existence have infinitely many adjustable parameters. If he’s referring to the fact that lots of theories have adjustable parameters that can range over the real numbers, which are infinitely complicated in general, then that’s a different issue, and string theory may have it as well.
Re string theory’s issue of being vacuous: I think the core thing string theory predicts that other quantum gravity models don’t is that at large scales you recover general relativity and the Standard Model. No other theory has yet figured out how to properly include both the empirical effects of gravity and of quantum mechanics in the parameter regimes where they are known to work. So string theory predicts more simply by predicting everything other quantum theories predict while also being able to include gravity without ruining those predictions, whereas other models of quantum gravity tend to ruin empirical predictions (like general relativity approximately holding) pretty fast.
I used to consider it a mystery that math was so unreasonably effective in the natural sciences, but changed my mind after reading this essay by Eric S. Raymond (who’s here on the forum, hi and thanks Eric), in particular this part, which is as good a question dissolution as any I’ve seen:
The relationship between mathematical models and phenomenal prediction is complicated, not just in practice but in principle. Much more complicated because, as we now know, there are mutually exclusive ways to axiomatize mathematics! It can be diagrammed as follows (thanks to Jesse Perry for supplying the original of this chart):
(it’s a shame this chart isn’t rendering properly for some reason, since without it the rest of Eric’s quote is ~incomprehensible)
The key transactions for our purposes are C and D—the translations between a predictive model and a mathematical formalism. What mystified Einstein is how often D leads to new insights.
We begin to get some handle on the problem if we phrase it more precisely; that is, “Why does a good choice of C so often yield new knowledge via D?”
The simplest answer is to invert the question and treat it as a definition. A “good choice of C” is one which leads to new predictions. The choice of C is not one that can be made a-priori; one has to choose, empirically, a mapping between real and mathematical objects, then evaluate that mapping by seeing if it predicts well.
One can argue that it only makes sense to marvel at the utility of mathematics if one assumes that C for any phenomenal system is an a-priori given. But we’ve seen that it is not. A physicist who marvels at the applicability of mathematics has forgotten or ignored the complexity of C; he is really being puzzled at the human ability to choose appropriate mathematical models empirically.
By reformulating the question this way, we’ve slain half the dragon. Human beings are clever, persistent apes who like to play with ideas. If a mathematical formalism can be found to fit a phenomenal system, some human will eventually find it. And the discovery will come to look “inevitable” because those who tried and failed will generally be forgotten.
But there is a deeper question behind this: why do good choices of mathematical model exist at all? That is, why is there any mathematical formalism for, say, quantum mechanics which is so productive that it actually predicts the discovery of observable new particles?
The way to “answer” this question is by observing that it, too, properly serves as a kind of definition. There are many phenomenal systems for which no such exact predictive formalism has been found, nor for which one seems likely. Poets like to mumble about the human heart, but more mundane examples are available. The weather, or the behavior of any economy larger than village size, for example—systems so chaotically interdependent that exact prediction is effectively impossible (not just in fact but in principle).
There are many things for which mathematical modeling leads at best to fuzzy, contingent, statistical results and never successfully predicts ‘new entities’ at all. In fact, such systems are the rule, not the exception. So the proper answer to the question “Why is mathematics so marvelously applicable to my science?” is simply “Because that’s the kind of science you’ve chosen to study!”
Interesting. This reminds me of a related thought I had: Why do models with differential equations work so often in physics but so rarely in other empirical sciences? Perhaps physics simply is “the differential equation science”.
Which is also related to the frequently expressed opinion that philosophy makes little progress because everything that gets developed enough to make significant progress splits off from philosophy. Because philosophy is “the study of ill-defined and intractable problems”.
Not saying that I think these views are accurate, though they do have some plausibility.
The weather, or the behavior of any economy larger than village size, for example—systems so chaotically interdependent that exact prediction is effectively impossible (not just in fact but in principle).
Flagging that those two examples seem false. The weather is chaotic, yes, and there’s a sense in which the economy is anti-inductive, but modeling methods are advancing, and will likely find more loop-holes in chaos theory.
For example, in thermodynamics, temperature is non-chaotic while the precise kinetic energies and locations of all particles are. A reasonable candidate for an analogous non-chaotic aggregate in weather is hurricanes.
Similarly, as our understanding of the economy advances it will get more efficient, which means it will be easier to model, e.g. this paper (note: I’ve only skimmed it). And large economies are definitely even more predictable than small villages; talk about not having a competitive market!
Thanks for the pointer to that paper, the abstract makes me think there’s a sort of slow-acting self-reinforcing feedback loop between predictive error minimisation via improving modelling and via improving the economy itself.
re: weather, I’m thinking of the chart below showing how little gain we get in MAE vs compute, plus my guess that compute can’t keep growing far enough to get MAE < 3 °F a year out (say). I don’t know anything about advancements in weather modelling methods though; maybe effective compute (incorporating modelling advancements) can grow indefinitely in terms of the chart.
I didn’t say anything about temperature prediction, and I’d also like to see any other method (intuition based or otherwise) do better than the current best mathematical models here. It seems unlikely to me that the trends in that graph will continue arbitrarily far.
Matt Leifer, who works in quantum foundations, espouses a view that’s probably more extreme than Eric Raymond’s above to argue why the effectiveness of math in the natural sciences isn’t just reasonable but expected-by-construction. In his 2015 FQXi essay Mathematics is Physics Matt argued that
… mathematics is a natural science—just like physics, chemistry, or biology—and that this can explain the alleged “unreasonable” effectiveness of mathematics in the physical sciences.
The main challenge for this view is to explain how mathematical theories can become increasingly abstract and develop their own internal structure, whilst still maintaining an appropriate empirical tether that can explain their later use in physics. In order to address this, I offer a theory of mathematical theory-building based on the idea that human knowledge has the structure of a scale-free network and that abstract mathematical theories arise from a repeated process of replacing strong analogies with new hubs in this network.
This allows mathematics to be seen as the study of regularities, within regularities, within . . . , within regularities of the natural world. Since mathematical theories are derived from the natural world, albeit at a much higher level of abstraction than most other scientific theories, it should come as no surprise that they so often show up in physics.
… mathematical objects do not refer directly to things that exist in the physical universe. As the formalists suggest, mathematical theories are just abstract formal systems, but not all formal systems are mathematics. Instead, mathematical theories are those formal systems that maintain a tether to empirical reality through a process of abstraction and generalization from more empirically grounded theories, aimed at achieving a pragmatically useful representation of regularities that exist in nature.
(Matt notes as an aside that he’s arguing for precisely the opposite of Tegmark’s MUH.)
Why “scale-free network”?
It is common to view the structure of human knowledge as hierarchical… The various attempts to reduce all of mathematics to logic or arithmetic reflect a desire to view mathematical knowledge as hanging hierarchically from a common foundation. However, the fact that mathematics now has multiple competing foundations, in terms of logic, set theory or category theory, indicates that something is wrong with this view.
Instead of a hierarchy, we are going to attempt to characterize the structure of human knowledge in terms of a network consisting of nodes with links between them… Roughly speaking, the nodes are supposed to represent different fields of study. This could be done at various levels of detail. … Next, a link should be drawn between two nodes if there is a strong connection between the things they represent. Again, I do not want to be too precise about what this connection should be, but examples would include an idea being part of a wider theory, that one thing can be derived from the other, or that there exists a strong direct analogy between the two nodes. Essentially, if it has occurred to a human being that the two things are strongly related, e.g. if it has been thought interesting enough to do something like publish an academic paper on the connection, and the connection has not yet been explained in terms of some intermediary theory, then there should be a link between the corresponding nodes in the network.
If we imagine drawing this network for all of human knowledge then it is plausible that it would have the structure of a scale-free network. Without going into technical details, scale-free networks have a small number of hubs, which are nodes that are linked to a much larger number of nodes than the average. This is a bit like the 1% of billionaires who are much richer than the rest of the human population. If the knowledge network is scale-free then this would explain why it seems so plausible that knowledge is hierarchical. In a university degree one typically learns a great deal about one of the hubs, e.g. the hub representing fundamental physics, and a little about some of the more specialized subjects that hang from it. As we get ever more specialized, we typically move away from our starting hub towards more obscure nodes, which are nonetheless still much closer to the starting hub than to any other hub. The local part of the network that we know about looks much like a hierarchy, and so it is not surprising that physicists end up thinking that everything boils down to physics whereas sociologists end up thinking that everything is a social construct. In reality, neither of these views is right because the global structure of the network is not a hierarchy.
As a naturalist, I should provide empirical evidence that human knowledge is indeed structured as a scale-free network. The best evidence that I can offer is that the structure of pages and links on the World Wide Web and the network of citations to academic papers are both scale free [13]. These are, at best, approximations of the true knowledge network. … However, I think that these examples provide evidence that the information structures generated by a social network of finite beings are typically scale-free networks, and the knowledge network is an example of such a structure.
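For readers unfamiliar with the term, here is a toy illustration of the hub structure Matt is appealing to, using preferential attachment (the standard Barabási-Albert growth process) via networkx. It is a cartoon of how a growing network ends up with a few hugely connected hubs, not a model of the actual knowledge network.

```python
# Toy illustration of the "hubs" property of a scale-free network, using
# preferential attachment (Barabasi-Albert) as the growth process. This is a
# cartoon of how a few nodes end up vastly better connected than the rest,
# not a model of the actual knowledge network.

import networkx as nx

G = nx.barabasi_albert_graph(n=10_000, m=2, seed=0)  # 10k nodes, 2 links per new node

degrees = sorted((d for _, d in G.degree()), reverse=True)
print("Ten best-connected hubs:", degrees[:10])
print("Median degree:", degrees[len(degrees) // 2])
# Expect hubs with degrees in the low hundreds vs. a median degree of ~3:
# a handful of "fundamental physics"-like hubs vs. everything else.
```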
As an aside, Matt’s theory of theory-building explains (so he claims) what mathematical intuition is about: “intuition for efficient knowledge structure, rather than intuition about an abstract mathematical world”.
So what? How does this view pay rent?
Firstly, in network language, the concept of a “theory of everything” corresponds to a network with one enormous hub, from which all other human knowledge hangs via links that mean “can be derived from”. This represents a hierarchical view of knowledge, which seems unlikely to be true if the structure of human knowledge is generated by a social process. It is not impossible for a scale-free network to have a hierarchical structure like a branching tree, but it seems unlikely that the process of knowledge growth would lead uniquely to such a structure. It seems more likely that we will always have several competing large hubs and that some aspects of human experience, such as consciousness and why we experience a unique present moment of time, will be forever outside the scope of physics.
Nonetheless, my theory suggests that the project of finding higher level connections that encompass more of human knowledge is still a fruitful one. It prevents our network from having an unwieldy number of direct links, allows us to share more common vocabulary between fields, and allows an individual to understand more of the world with fewer theories. Thus, the search for a theory of everything is not fruitless; I just do not expect it to ever terminate.
Secondly, my theory predicts that the mathematical representation of fundamental physical theories will continue to become increasingly abstract. The more phenomena we try to encompass in our fundamental theories, the further the resulting hubs will be from the nodes representing our direct sensory experience. Thus, we should not expect future theories of physics to become less mathematical, as they are generated by the same process of generalization and abstraction as mathematics itself.
Matt develops this argument further in Against Fundamentalism, another FQXi essay published in 2018: if the structure of human knowledge is a network rather than a hierarchy, then the idea that there is a most fundamental discipline, or level of reality, is mistaken.
I would also comment that, if the environment was so chaotic that roughly everything important to life could not be modeled—if general-purpose modeling ability was basically useless—then life would not have evolved that ability, and “intelligent life” probably wouldn’t exist.
The two concepts I thought were missing from Eliezer’s technical explanation of technical explanation, and which would have simplified some of the explanation, were compression and degrees of freedom. Degrees of freedom seems very relevant here in terms of how we map between different representations. Why are representations so important for humans? Because they have different computational properties/traversal costs while humans are very computationally limited.
Griffiths argued that the aspects we associate with human intelligence – rapid learning from small data, the ability to break down problems into parts, and the capacity for cumulative cultural evolution – arose from the 3 fundamental limitations all humans share: limited time, limited computation, and limited communication. (The constraints imposed by these characteristics cascade: limited time magnifies the effect of limited computation, and limited communication makes it harder to draw upon more computation.) In particular, limited computation leads to problem decomposition, hence modular solutions; relieving the computation constraint enables solutions that can be objectively better along some axis while also being incomprehensible to humans.
Thanks for the link. I mean that predictions are outputs of a process that includes a representation, so part of what’s getting passed back and forth in the diagram are better and worse fit representations. The degrees of freedom point is that we choose very flexible representations, whittle them down with the actual data available, then get surprised that that representation yields other good predictions. But we should expect this if Nature shares any modular structure with our perception at all, which it would if there were both structural reasons (literally the same substrate) and evolutionary pressure for representations with good computational properties, i.e. simple isomorphisms and compressions.
Systematic reviews (SRs) inform evidence-based decision making. Yet, they take over a year to complete, are prone to human error, and face challenges with reproducibility; limiting access to timely and reliable information. We developed otto-SR, an end-to-end agentic workflow using large language models (LLMs) to support and automate the SR workflow from initial search to analysis. We found that otto-SR outperformed traditional dual human workflows in SR screening (otto-SR: 96.7% sensitivity, 97.9% specificity; human: 81.7% sensitivity, 98.1% specificity) and data extraction (otto-SR: 93.1% accuracy; human: 79.7% accuracy). Using otto-SR, we reproduced and updated an entire issue of Cochrane reviews (n=12) in two days, representing approximately 12 work-years of traditional systematic review work. Across Cochrane reviews, otto-SR incorrectly excluded a median of 0 studies (IQR 0 to 0.25), and found a median of 2.0 (IQR 1 to 6.5) eligible studies likely missed by the original authors. Meta-analyses revealed that otto-SR generated newly statistically significant conclusions in 2 reviews and negated significance in 1 review. These findings demonstrate that LLMs can autonomously conduct and update systematic reviews with superhuman performance, laying the foundation for automated, scalable, and reliable evidence synthesis.
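To get a feel for what those screening numbers mean in practice, here is a small back-of-the-envelope calculation in Python. The 3,000 records screened and 40 truly eligible studies are made-up illustrative figures, not from the paper; only the sensitivity/specificity values come from the abstract.

```python
# What the screening numbers in the abstract cash out to, on a hypothetical
# Cochrane-style review with 3,000 records screened and 40 truly eligible
# studies. The 3,000 and 40 are made-up illustrative figures; only the
# sensitivity/specificity values come from the abstract.

records, eligible = 3_000, 40
ineligible = records - eligible

for name, sens, spec in [("otto-SR", 0.967, 0.979), ("human dual review", 0.817, 0.981)]:
    missed = eligible * (1 - sens)    # eligible studies wrongly excluded
    extra = ineligible * (1 - spec)   # ineligible records passed on to full-text review
    print(f"{name}: ~{missed:.1f} eligible studies missed, "
          f"~{extra:.0f} irrelevant records to read in full")
```

Under these assumed counts the sensitivity gap is the story: roughly one missed study for otto-SR versus about seven for the human workflow, at essentially the same full-text reading burden.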
Pretty cool since “SRs are incredibly resource-intensive, typically taking over 16 months and costing upwards of $100,000 to complete”. They used GPT-4.1 for screening articles and o3-mini-high for data extraction.
otto-SR seems much better than Elicit in particular, which is notable to me for being the gold standard DR tool according to Sarah Constantin’s review.
… big guests just don’t really matter that much if you just look at what are the most popular episodes, or what in the long run helps a podcast grow. By far my most popular guest is Sarah Paine, and she, before I interviewed her, was just a scholar who was not publicly well-known at all, and I just found her books quite interesting—so my most popular guests are Sarah Paine and then Sarah Paine, Sarah Paine, Sarah Paine because I have electric chairs(?) a lecture series with her. And by the way, from a viewer-a-minute adjusted basis, I host the Sarah Paine podcast where I occasionally talk about AI.
(After Sarah Paine comes geneticist David Reich, then Satya Nadella and Mark Zuckerberg, “then [Sholto & Trenton] or Leopold (Aschenbrenner) or something, then you get to the lab CEOs or something”)
You can see it as an example of ‘alpha’ vs ‘beta’. When someone asks me about the value of someone as a guest, I tend to ask: “do they have anything new to say? didn’t they just do a big interview last year?” and if they don’t but they’re big, “can you ask them good questions that get them out of their ‘book’?” Big guests are not necessarily as valuable as they may seem because they are highly-exposed, which means both that (1) they have probably already said everything they will say and there is no ‘news’ or novelty, and (2) they are message-disciplined and careful to “talk their book”. (In this analogy, “alpha” represents undiscovered or neglected interview topics which can be extracted mostly just by finding them and then asking the obvious question, usually by interviewing new people; “beta” represents doing standard interview topics/people, but much more so—harder, faster, better—and getting new stuff that way.)
Lex Fridman podcasts are an example of this: he often hosts very big guests like Mark Zuckerberg, but nevertheless, I will sit down and skim through the transcript of 2-4 hours of content, and find nothing even worth excerpting for my notes. Fridman notoriously does no research and asks softball questions, and invites the biggest names he can get regardless of overexposure, and so if you do that, you will get nothing new. He has found no alpha, and he doesn’t interview hard enough to extract beta. So he’s sort of the high-expense ratio index fund of podcast interviews.
Sarah Paine, on the other hand, seems to have been completely unknown and full of juicy nuggets, and is like winning the lottery: you can make a career off a really good trade like Paine before it gets crowded. However, if another successful podcaster has her on, they will probably not discover Paine is their most popular or growth-productive guest ever. The well is dry. Paine may have more to say someday, but that day is probably closer to “5 years from today” than “tomorrow”.
(So a good interviewer adopts an optimal foraging mindset: once you have harvested a patch of its delicious food, you have to move on to another patch, which hasn’t been exhausted yet, and let the original patch slowly recover.)
So a great guest for Dwarkesh’s blog would be, say Hans Moravec or Paul J. Werbos: Moravec hasn’t done anything publicly in at least a decade, and is fallow; while Werbos has been more active and in the public eye, but still not much and is such a weird guy that just about any questions will be interesting. Reich was also a good guest because while Reich is very ‘public’ in some senses (he’s written popularizing books, even), he is still obscure, almost none of what he has published is well-known, and he is involved in so much fast-paced research that even the book is now substantially obsolete and he has a lot of new stuff to say. (And Reich will have more stuff to say if revisited in, say, 2 years for an update, so a harvester will be making a note to revisit him if the current crop of interview candidates in the pipeline is looking marginal.) A difficult or mediocre guest would be Tony Blair: he can surely say many interesting things about the current geopolitical context and his work since being PM… but he is a super-experienced career politician who has survived countless Question Times, and may eat you for breakfast and exploit you for ulterior purposes (rather than vice-versa). Similarly, Mark Zuckerberg and Satya Nadella are tough nuts: there’s meat there, but are you willing enough to bring down the hammer or will you settle for a mediocre result that mostly just fills space and is not a must-watch? A bad guest might be someone controlling and extremely PR-savvy like MrBeast; this is the sort of guy who will give you a bad interview pushing his ‘book’ shamelessly, and then might wind up spiking the interview anyway if he felt he wasn’t getting enough out of it, and just drops it as a sunk cost (though it was weeks of work on your part and blows a hole in your schedule—that’s not his problem).
I like the optimal forager take, seems intuitively correct. I’d add that Dwarkesh struck gold by getting you on his podcast too. (Tangentially: this grand theory of intelligence video snippet reminds me of a page-ish-long writeup on that I stumbled upon deep in the bowels of https://gwern.net/ which I’ve annoyingly never been able to find again.)
Also thanks for the pointer to Werbos, his website Welcome to the Werbos World! funnily enough struck me as crackpot-y and I wouldn’t have guessed just from the landing page that he’s the discoverer of backprop, respected former program director at the NSF, etc.
I wouldn’t have guessed just from the landing page that he’s the discoverer of backprop, respected former program director at the NSF, etc.
That’s what makes it alpha! If he was as legible as, say, Hinton, he would be mined out by now, and nothing but beta. (Similar situation to Schmidhuber - ‘obvious crackpot’ - although he’s such a self-promoter that he overcomes it, and so at this point there’s no alpha talking to him; the stuff that would be interesting, like his relationship to certain wealthy Italians, or to King Bonesaws, or how he’s managed to torpedo his career so spectacularly, he will not talk about. Also, I understand he likes to charge people for the privilege of talking to him.) You have to have both domain knowledge and intellectual courage to know about Werbos and eg. read his old interviews and be willing to go out on a limb and interview him.
This seems to underrate the value of distribution. I suspect another factor to take into account is the degree of audience overlap. Like there’s a lot of value in booking a guest who has been on a bunch of podcasts, so long as your particular audience isn’t likely to have been exposed to them.
Balioc’s A taxonomy of bullshit jobs has a category called Worthy Work Made Bullshit which resonated with me most of all:
Worthy Work Made Bullshit is perhaps the trickiest and most controversial category, but as far as I’m concerned it’s one of the most important. This is meant to cover jobs where you’re doing something that is obviously and directly worthwhile…at least in theory…but the structure of the job, and the institutional demands that are imposed on you, turn your work into bullshit.
The conceptual archetype here is the Soviet tire factory that produces millions of tiny useless toy-sized tires instead of a somewhat-smaller number of actually-valuable tires that could be put on actual vehicles, because the quota scheme is badly designed. Everyone in that factory has a Worthy Work Made Bullshit job. Making tires is something you can be proud of, at least hypothetically. Making tiny useless tires to game a quota system is…not.
Nowadays we don’t have Soviet central planners producing insane demands, but we do have a marketplace that produces comparably-insane demands, especially in certain fields.
This is especially poignant, and especially relevant, in certain elite/creative fields where you don’t need market discipline in order to get people to produce. All those writers who are churning out garbage clickbait? They don’t want to be writing clickbait, any more than you want them to be writing clickbait. If you just handed them checks and told them “go do whatever”…well, some of them would take the money and do nothing, some of them would produce worthless product that appealed to no one, but a lot of them would generate work considerably more worthwhile than clickbait. Almost certainly not as easily monetizable, but – better, by the standards of anyone who actually cared. Their writing has been made bullshit by the demands of an advertisement-driven system.
Academia is the ground-zero locus of this. Academia is a world that is designed around a model of “here’s enough money to live on, go do some abstractly worthwhile thing.” It selects for people who have the talent, and the temperament, to thrive under that kind of system. But nowadays it mostly can’t be that, because of competitive pressures and drastic funding cuts, so it demands an ever-increasing share of bullshit from the inmates. Thus we get the grant application circus, the publishing treadmill, etc. etc.
Potentially relevant: this thread about a massive software service that did useful work, but ultimately could’ve been outperformed (at 100x) by a small, easy-to-implement adjustment to the overarching system.
I just learned about the idea of “effectual thinking” from Cedric Chin’s recent newsletter issue. He notes, counterintuitively to me, that it’s the opposite of causal thinking, and yet it’s the one thing in common in all the successful case studies he could find in business:
The only answer that fits with the twenty-seven odd cases that we’ve published on the Idea Maze is a thing called effectual thinking.
Effectual thinking is a concept proposed by Professor Saras D Sarasvathy in a 2001 paper — a paper that was also, hilariously, described as “the first good paper I’ve seen” by Vinod Khosla, the renowned venture capitalist and cofounder of Sun Microsystems.
Sarasvathy didn’t pull this theory out of thin air: she hunted down a list of entrepreneurs who, at the time of her paper, represented a near-complete list of experienced founders (of enduring companies) in the US from the period of 1960 to 1985.
These folk were:
Repeat entrepreneurs (minimum three ventures, though the average number of companies started in her sample was seven new ventures)
Were successful and serious businesspeople (at least one of those ventures had gone public, and they had spent minimum 10 years with that company)
Were filtered from a list of the ‘top 100 most successful’ entrepreneurs compiled by VC David Silver in 1985, and a list of Entrepreneur of the Year awards, compiled by Ernst & Young.
We can be quite sure these entrepreneurs are ‘expert’, by most definitions of the term.
Sarasvathy found that all of them demonstrated a similar style of thinking when given a new venture brainstorming task. She named this style of thinking ‘effectual thinking’.
So what is effectual thinking? In short, effectual thinking is the opposite of causal thinking.
Causal thinking is what you learn in business school. It starts with a fixed goal and works backwards to figure out how to get there — like deciding to cook carbonara for dinner, looking at your pantry, then working backwards to figuring out the ingredients needed, and shopping for those missing ingredients.
Effectual thinking does the opposite: it starts with opening your fridge and asking, “What can I make with what I find here?” You work forwards from available resources, not backwards from a predetermined plan.
Entrepreneurs who practice this effectual thinking follow three principles, and these are the rules that are actually useful for navigating the Idea Maze:
You structure your life to make survivable bets. Successful entrepreneurs don’t go all-in on a single idea. They set themselves up — financially, emotionally, and logistically — to take repeatable, reasonable risks. They aim to stay in the game long enough for something to work.
You take action instead of getting stuck analyzing. In the earliest stages of an idea, competitive analysis is misleading. If a market gap could be spotted through research alone, it’s probably not that valuable (and would likely be quickly exploited by an established competitor). Instead, experienced entrepreneurs take action and attempt to cut deals with relevant people — customers, partners, collaborators. Action generates real information.
You treat entrepreneurship as improvisation. There is no master plan. There’s no strategy that will guarantee success. You take action, learn from what happens, and adapt. This is a game that rewards curiosity, flexibility, and sheer staying power.
(see also search term “forward chaining vs back-chaining.”)
This seems like reasonable life advice for people generally trying to accumulate resources and do something cool. I’m not sure about people who actually have specific goals they want to accomplish. I think in the domain of AI safety, forward chaining is insufficient (seems like the kind of thing that gets you OpenAI and Anthropic)
The principles I sort of try to live by are that, every 2 weeks, I should have done:
some actions that forward chain towards more compounding resources
some actions explicitly backchaining from longterm goals
ship something concrete to users
do something wholesome
Which is, I think, aiming to accomplish similar goals to the OP’s, without losing the plot on my more specific goals.
I currently feel confused about how to integrate “the kind of thinking that is good at momentum / action” and “the kind of thinking that is good at creative strategy”. And it seems like there should be more of a way to unify them into a holistic way-of-being.
The four checksums above are there to make sure I’m not being myopic in some way in a broader sense, but they apply more at the timescale of weeks than hours or days.
You might just say “well, idk, each week or day, just figure out if it’s more like a momentum week or more like a creative strategy week”. I feel dissatisfied with this for some reason.
At least part of it is “I think on average people/me could use to be in creative/broader strategy mode more often, even when in a Momentum mode period.”
Another part is “there are strategy skills I want to be practicing, that are hard to practice if I don’t do them basically every day. They aren’t as relevant in a momentum-period, but they’re not zero relevant.”
Hrm. I think maybe what’s most dissatisfying right now is that I just haven’t compressed all the finnicky details of it, and it feels overwhelming to think about the entire “how to think” project, which is usually an indicator I am missing the right abstraction.
I’ve written something like six or seven personal wikis over the past decade. It’s actually an incredibly advanced form of procrastination. At this point I’ve tried every possible design choice.
Lifecycle: I’ve built a few compiler-style wikis: plain-text files in a git repo statically compiled to HTML. I’ve built a couple using live servers with server-side rendering. The latest one is an API server with a React frontend.
Storage: I started with plain text files in a git repo, then moved to an SQLite database with a simple schema. The latest version is an avant-garde object-oriented hypermedia database with bidirectional links implemented on top of SQLite.
Markup: I used Markdown here and there. Then I built my own TeX-inspired markup language. Then I tried XML, with mixed results. The latest version uses a WYSIWYG editor made with ProseMirror.
And yet I don’t use them. Why? Building them was fun, sure, but there must be utility to a personal database.
At first I thought the problem was friction: the higher the activation energy to using a tool, the less likely you are to use it. Even a small amount of friction can cause me to go, oh, who cares, can’t be bothered. So each version gets progressively more frictionless. The latest version uses a WYSIWYG editor built on top of ProseMirror (it took a great deal for me to actually give in to WYSIWYG). It also has a link to the daily note page, to make journalling easier. The only friction is in clicking the bookmark to localhost:5000. It is literally two clicks to get to the daily note.
And yet I still don’t use it. Why? I’m a great deal more organized now than I was a few years ago. My filesystem is beautifully structured and everything is where it should be. I could fill out the contents of a personal wiki.
I’ve come to the conclusion that there’s no point: because everything I can do with a personal wiki I can do better with a specialized app, and the few remaining use cases are useless. Let’s break it down.
I’ve tried three different times to create a personal wiki, seduced by sites like Cosma Shalizi’s and Gwern’s long content philosophy (emphasis mine), using the last one for a solid year and a half before finally giving up and just defaulting to a janky combination of Notion and Google Docs/Sheets:
… I have read blogs for many years and most blog posts are the triumph of the hare over the tortoise. They are meant to be read by a few people on a weekday in 2004 and never again, and are quickly abandoned—and perhaps as Assange says, not a moment too soon. (But isn’t that sad? Isn’t it a terrible ROI for one’s time?) On the other hand, the best blogs always seem to be building something: they are rough drafts—works in progress. So I did not wish to write a blog. Then what? More than just “evergreen content”, what would constitute Long Content as opposed to the existing culture of Short Content? How does one live in a Long Now sort of way?
My answer is that one uses such a framework to work on projects that are too big to work on normally or too tedious. (Conscientiousness is often lacking online or in volunteer communities and many useful things go undone.) Knowing your site will survive for decades to come gives you the mental wherewithal to tackle long-term tasks like gathering information for years, and such persistence can be useful—if one holds onto every glimmer of genius for years, then even the dullest person may look a bit like a genius himself. (Even experienced professionals can only write at their peak for a few hours a day—usually first thing in the morning, it seems.) Half the challenge of fighting procrastination is the pain of starting—I find when I actually get into the swing of working on even dull tasks, it’s not so bad. So this suggests a solution: never start. Merely have perpetual drafts, which one tweaks from time to time. And the rest takes care of itself.
Fernando unbundles the use cases of a tool for thought in his essay; I’ll just quote the part that resonated with me:
The following use cases are very naturally separable: …
Learning: if you’re studying something, you can keep your notes in a TfT. This is one of the biggest use cases. But the problem is never note-taking, but reviewing notes. Over the years I’ve found that long-form lecture notes are all but useless, not just because you have to remember to review them on a schedule, but because spaced repetition can subsume every single lecture note. It takes practice and discipline to write good spaced repetition flashcards, but once you do, the long-form prose notes are themselves redundant.
(Tangentially, an interesting example of just how much spaced repetition can subsume is Michael Nielsen’s Using spaced repetition systems to see through a piece of mathematics, in which he describes how he used “deep Ankification” to better understand the theorem that a complex normal matrix is always diagonalizable by a unitary matrix, as an illustration of a heuristic one could use to deepen one’s understanding of a piece of mathematics in an open-ended way, inspired by Andrey Kolmogorov’s essay on, of all things, the equals sign. I wish I had read that while I was still studying physics in school.)
Fernando, emphasis mine:
So I often wonder: what do other people use their personal knowledge bases for? And I look up blog and forum posts where Obsidian and Roam power users explain their setup. And most of what I see is junk. It’s never the Zettelkasten of the next Vannevar Bush, it’s always a setup with tens of plugins, a daily note three pages long that is subdivided into fifty subpages recording all the inane minutiae of life. This is a recipe for burnout.
People have this aspirational idea of building a vast, oppressively colossal, deeply interlinked knowledge graph to the point that it almost mirrors every discrete concept and memory in their brain. And I get the appeal of maximalism. But they’re counting on the wrong side of the ledger. Every node in your knowledge graph is a debt. Every link doubly so. The more you have, the more in the red you are. Every node that has utility—an interesting excerpt from a book, a pithy quote, a poem, a fiction fragment, a few sentences that are the seed of a future essay, a list of links that are the launching-off point of a project—is drowned in an ocean of banality. Most of our thoughts appear and pass away instantly, for good reason.
Minimizing friction is surprisingly difficult. I keep plain-text notes in a hierarchical editor (cherrytree), but even that feels too complicated sometimes. This is not just about the tool… what you actually need is a combination of the tool and the right way to use it.
(Every tool can be used in different ways. For example, suppose you write a diary in MS Word. There are still options such as “one document per day” or “one very long document for all”, and things in between like “one document per month”, which all give different kinds of friction. The one megadocument takes too much time to load. It is more difficult to search in many small documents. Or maybe you should keep your current day in a small document, but once in a while merge the previous days into the megadocument? Or maybe switch to some application that starts faster than MS Word?)
Forgetting is an important part. Even if you want to remember forever, you need some form of deprioritizing. Something like “pages you haven’t used for months will get smaller, and if you search for keywords, they will be at the bottom of the result list”. But if one of them suddenly becomes relevant again, maybe the connected ones become relevant, too? Something like associations in brain. The idea is that remembering the facts is only a part of the problem; making the relevant ones more accessible is another. Because searching in too much data is ultimately just another kind of friction.
It feels like a smaller version of the internet. Years ago, the problem used to be “too little information”, now the problem is “too much information, can’t find the thing I actually want”.
Perhaps a wiki, where the pages could get flagged as “important now” and “unimportant”? Or maybe, important for a specific context? And by default, when you choose a context, you would only see the important pages, and the rest of that only if you search for a specific keyword or follow a grey link. (Which again would require some work creating and maintaining the contexts. And that work should also be as frictionless as possible.)
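For what it’s worth, the deprioritization mechanism sketched in that comment is easy to prototype: score each page by keyword relevance times an exponential recency decay, so stale pages sink in search results but float back up once touched again. A rough sketch (the half-life and the scoring rule are arbitrary choices of mine, not anything from the comment):

```python
import time

# Toy index: page title -> (text, last_accessed as a unix timestamp)
pages = {
    "project-ideas": ("notes on half-finished project ideas", time.time() - 3 * 86400),
    "tax-2019": ("how I filed my taxes in 2019", time.time() - 900 * 86400),
}

HALF_LIFE_DAYS = 90  # arbitrary: a page's weight halves for every ~3 months of disuse

def score(query: str, title: str) -> float:
    text, last_accessed = pages[title]
    words = query.lower().split()
    # crude keyword relevance: fraction of query words present in the page
    relevance = sum(w in text.lower() for w in words) / max(len(words), 1)
    age_days = (time.time() - last_accessed) / 86400
    return relevance * 0.5 ** (age_days / HALF_LIFE_DAYS)

def search(query: str):
    # Stale pages sink; touching a page (updating last_accessed) floats it back up.
    return sorted(pages, key=lambda title: score(query, title), reverse=True)

print(search("project ideas"))
```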
@dkl9 wrote a very eloquent and concise piece arguing in favor of ditching “second brain” systems in favor of SRSs (Spaced Repetition Systems, such as Anki).
Try as you might to shrink the margin with better technology, recalling knowledge from within is necessarily faster and more intuitive than accessing a tool. When spaced repetition fails (as it should, up to 10% of the time), you can gracefully degrade by searching your SRS’ deck of facts.
If you lose your second brain (your files get corrupted, a cloud service shuts down, etc), you forget its content, except for the bits you accidentally remember by seeing many times. If you lose your SRS, you still remember over 90% of your material, as guaranteed by the algorithm, and the obsolete parts gradually decay. A second brain is more robust to physical or chemical damage to your first brain. But if your first brain is damaged as such, you probably have higher priorities than any particular topic of global knowledge you explicitly studied.
I write for only these reasons:
to help me think
to communicate and teach (as here)
to distill knowledge to put in my SRS
to record local facts for possible future reference
Linear, isolated documents suffice for all those purposes. Once you can memorise well, a second brain becomes redundant tedium.
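The “as guaranteed by the algorithm” bit refers to how SRS schedulers space each review to land just before you’d otherwise forget, targeting roughly 90% recall. Here’s a rough sketch of the classic SM-2 update rule that Anki-style systems descend from (the constants are SM-2’s published defaults; real implementations tweak all of this):

```python
def sm2(quality: int, reps: int, interval: int, ease: float):
    """One SM-2 review update.

    quality: self-graded recall from 0 (blackout) to 5 (perfect).
    reps: number of successful reviews in a row so far.
    interval: current gap in days until the next review.
    ease: "ease factor" controlling how fast intervals grow.
    """
    if quality < 3:
        reps, interval = 0, 1  # failed recall: start the card over
    else:
        if reps == 0:
            interval = 1
        elif reps == 1:
            interval = 6
        else:
            interval = round(interval * ease)
        reps += 1
    # ease drifts down on hard recalls, up on easy ones; floored at 1.3
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return reps, interval, ease

# A card recalled well three times in a row: intervals grow 1 -> 6 -> ~16 days.
state = (0, 0, 2.5)  # (reps, interval, ease)
for q in (5, 4, 4):
    state = sm2(q, *state)
    print(state)
```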
I like to think of learning and all of these things as smaller self-contained knowledge trees. Building knowledge trees that are cached, almost like creating zip files, and systems where I store a bunch of zip files, similar to what Eliezer talks about in The Sequences.
Like when you mention the thing about Nielsen on linear algebra, it opens up the entire thought tree there. I might just get the association to something like PCA, and then I think, huh, how to optimise this, and then it goes to QR algorithms and things like a Householder matrix and some specific symmetric properties of linear spaces...
If I have enough of these in an area then I might go back to my Anki for that specific area. Like if you think from the perspective of scheduling and storage algorithms, similar to what is explored in Algorithms to Live By, you quickly understand that the magic is in information compression and working at different meta-levels. Zipped zip files with algorithms to expand them if need be. Dunno if that makes sense; agree with the exobrain creep that exists though.
ZIERLER: Michael, we’ve already covered what you’re doing currently, so for the last part of our talk, I’d like to ask a retrospective question, and then we’ll end looking to the future. Between quantum information and open science, I wonder if you’ve thought broadly about the Kuhnian idea of scientific revolutions. Both quantum information and open science were not areas that he thought about. But to the extent that you believe in that paradigm—and I don’t know if you do; that’s a question in and of itself—how does your expertise or the reality of these fields influence the idea of scientific progress, the idea of scientific revolutions?
NIELSEN: Just focusing on Kuhn, he had a very narrow sense of what a scientific revolution is. One of the most common criticisms of him is that he was too focused on physics in particular; and to some extent, chemistry. He was originally trained as a theoretical physicist, and it really shows in his writing. Something like quantum mechanics or general relativity is disrupting an existing order. Well, quantum information isn’t disrupting any existing order. It’s not like there’s a whole bunch of ex-classical information theorists who are, all of a sudden, put out of a job because they’ve been superseded by this new notion. In that sense, just a whole lot of his description doesn’t apply. It’s too narrow a way of thinking.
Still, I loved his book. I read it when I was, I think, 18 or 19. It’s deeply influenced the way I think about science. But I also think it’s wrong in a whole bunch of important ways, and too narrow. It’s not a good description of either quantum information or of open science. Open science is, as I said before, really a revolution in the political economy of science. That’s what it is most fundamentally. He barely talks about political economy at all in that book.
Venkatesh Rao surprised me in What makes a good teacher? by saying the opposite of what I expected him to say re: his educational experience, given who he is:
While my current studies have no live teachers in the loop, each time I sit down to study something seriously, I’m reminded of how much I’m practicing behaviors first learned under the watchful eye of good teachers. We tend to remember the exceptionally charismatic (which is not the same thing as good), and exceptionally terrible teachers, but much of what we know about how to learn, how to study, comes from the quieter good teachers, many of whom we forget.
It also strikes me, reflecting on my own educational path — very conventional both on paper and in reality — that the modern public discourse around teaching and learning has been hijacked to a remarkable degree by charismatic public figures mythologizing their own supposedly maverick education stories.
These stories often feature exaggerated elements of rebellion, autodidact mastery, subversive hacking, heroic confrontations with villainous teachers and schoolyard bullies, genius non-neurotypical personal innovations and breakthroughs, and powerful experiences outside formal learning. These stories often sound like self-serving tales told by middle-aged Ferris Bueller caricatures trying to process distorted memories of somewhat traumatic school years. But they don’t strike me as a particularly accurate view of schooling, either as I experienced it, or as I witnessed most of my peers experiencing it.
These discourses understate the extent to which actual maverick outliers are in fact quite rare, and actually quite well accommodated by at least good schools. They understate the extent to which formal education not only comprehends and accommodates natural patterns of rebellion, but even designs around it, and encourages students to cultivate personal mythologies of heroic unaided agency for their own good.
Turning to his own personal experience:
Take for instance, one of my own “maverick” memories, of the sort I might put into a startup application for Y Combinator (an institution that invites and thrives on educational self-mythologizing). As was generally the case in the late 80s, nerdy students like me generally understood and used computers much better than the teachers who were supposed to be teaching us. Our computer science teacher was a mediocre math teacher who had taken some sort of bad certificate course in programming and understood very little. It was obvious to me and several of my friends that we were much better than her at both math and computers.
I feel a bit bad for her now — an average middle-aged woman who did her best. But us self-satisfied, self-styled wannabe hackers, we laughed at her behind her back. And we found opportunities to sneak into the computer lab on weekends and evenings to learn more, and faster, than she knew or could teach.
But what all of us in that cohort forget is that the “sneaking” comprised going and asking the custodian of the lab keys (the American Jesuit priest who ran the Astronomy club) to let us in — which he genially did. Despite our run-ins with the teachers we were easily leaving behind, the school did recognize and support us, putting us on programming competition teams, structuring term projects to let us get creative, and so on. The teacher I am kinda maligning once sent me to the Vice Principal’s office for mouthing off (I got slapped a couple of times — it was a different era), but also put me on the programming quiz and contest teams. That my buddies and I knew more than her wasn’t exactly a revelation to her. She did her best to deal with it in a positive way. Only later did I realize that dealing compassionately and positively with young people smarter than yourself is a basic teaching skill. …
The bulk of my educational experiences were positive. Most teachers were at least passably good enough, and a handful were standout formative influences.
Of course, this was partly because I was fortunate enough to go to very good schools throughout. My grade school was a solid Jesuit school. My university experiences were at top public universities in India and the US.
This is of course not the median experience. The median school is probably much worse at every level. But the point is, the university system at its best is where we should be looking for answers to why it is worth defending.
(Not a take, just pulling out infographics and quotes for future reference from the new DeepMind paper outlining their approach to technical AGI safety and security)
Overview of risk areas, grouped by factors that drive differences in mitigation approaches:
Overview of their approach to mitigating misalignment:
Overview of their approach to mitigating misuse:
Path to deceptive alignment:
How to use interpretability:
| Goal | Understanding v Control | Confidence | Concept v Algorithm | (Un)supervised? | How context specific? |
|---|---|---|---|---|---|
| Alignment evaluations | Understanding | Any | Concept+ | Either | Either |
| Faithful reasoning | Understanding∗ | Any | Concept+ | Supervised+ | Either |
| Debugging failures | Understanding∗ | Low | Either | Unsupervised+ | Specific |
| Monitoring | Understanding | Any | Concept+ | Supervised+ | General |
| Red teaming | Either | Low | Either | Unsupervised+ | Specific |
| Amplified oversight | Understanding | Complicated | Concept | Either | Specific |
Interpretability techniques:
| Technique | Understanding v Control | Confidence | Concept v Algorithm | (Un)supervised? | How specific? | Scalability |
|---|---|---|---|---|---|---|
| Probing | Understanding | Low | Concept | Supervised | Specific-ish | Cheap |
| Dictionary learning | Both | Low | Concept | Unsupervised | General∗ | Expensive |
| Steering vectors | Control | Low | Concept | Supervised | Specific-ish | Cheap |
| Training data attribution | Understanding | Low | Concept | Unsupervised | General∗ | Expensive |
| Auto-interp | Understanding | Low | Concept | Unsupervised | General∗ | Cheap |
| Component attribution | Both | Medium | Concept | Complicated | Specific | Cheap |
| Circuit analysis (causal) | Understanding | Medium | Algorithm | Complicated | Specific | Expensive |
Assorted random stuff that caught my attention:
They consider Exceptional AGI (Level 4) from Morris et al. (2023), defined as an AI system that matches or exceeds the 99th percentile of skilled adults on a wide range of non-physical tasks (contra the Metaculus “when AGI?” question that requires diverse robotic capabilities, so their 2030 is probably an overestimate)
The irrelevance of physical limits to the paper’s scope: “By considering the construction of “the ultimate laptop”, Lloyd (2000) suggests that Moore’s law (formalized as an 18 month doubling) cannot last past 2250. Krauss and Starkman (2004) consider limits on the total computation achievable by any technological civilization in our expanding universe—this approach imposes a (looser) 600-year limit in Moore’s law. However, since we are very far from these limits, we do not expect them to have a meaningful impact on timelines to Exceptional AGI”
Structural risks are “out of scope of this paper” because they’re “a much bigger category, often with each risk requiring a bespoke approach. They are also much harder for an AI developer to address, as they often require new norms or institutions to shape powerful dynamics in the world” (although “much of the technical work discussed in this paper will also be relevant for structural risks”)
Mistakes are also out of scope because “standard safety engineering practices (e.g. testing) can drastically reduce risks, and should be similarly effective for averting AI mistakes as for human mistakes… so we believe that severe harm from AI mistakes will be significantly less likely than misuse or misalignment, and is further reducible through appropriate safety practices”
The paper focuses “primarily on techniques that can be integrated into current AI development, due to our focus on anytime approaches to safety” i.e. excludes “research bets that pay out over longer periods of time but can provide increased safety, such as agent foundations, science of deep learning, and application of formal methods to AI”
Algorithmic progress papers: “Erdil and Besiroglu (2022) sought to decompose AI progress in a way that can be attributed to the separate factors of scaling (compute, model size and data) and algorithmic innovation, and concluded that algorithmic progress doubles effective compute budgets roughly every nine months. Ho et al. (2024) further extend this approach to study algorithmic improvements in the pretraining of language models for the period of 2012–2023. During this period, the authors estimate that the compute required to reach a set performance threshold halved approximately every eight months” (see the quick conversion sketch at the end of this list)
Explosive economic growth paper: “Recent modeling by Erdil et al. (2025) that draws on empirical scaling laws and semi-endogenous growth theory and models changes in compute, automation and production supports the plausibility of very rapid growth in Gross World Product (e.g. exceeding 30% per year in 2045) when adopting parameters from empirical data, existing literature and reasoned judgment” (I’m still wondering how this will get around johnswentworth’s objection to using GDP to track this)
General competence scales smoothly with compute: “Owen (2024) find that aggregate benchmarks (BIG-Bench (Srivastava et al., 2023), MMLU (Hendrycks et al., 2020)) are predictable with up to 20 percentage points of error when extrapolating through one order of magnitude (OOM) of compute. Gadre et al. (2024) similarly find that aggregate task performance can be predicted with relatively high accuracy, predicting average top-1 error across 17 tasks to within 1 percentage point using 20× less compute than is used for the predicted model. Ruan et al. (2024) find that 8 standard downstream LLM benchmark scores across many model families are well-explained in terms of their top 3 principal components. Their first component scales smoothly across 5 OOMs of compute and many model families, suggesting that something like general competence scales smoothly with compute”
“given that total labor compensation represents over 50% of global GDP (International Labour Organisation, 2022), it is clear that the economic incentive for automation is extraordinarily large”
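Quick arithmetic to convert those algorithmic-progress doubling/halving times into annual multipliers, which makes them easier to compare against hardware trends (nothing here beyond the figures quoted above):

```python
# "Doubles every N months" as an annual multiplier: 2 ** (12 / N).
def annual_factor(doubling_months: float) -> float:
    return 2 ** (12 / doubling_months)

print(annual_factor(9))   # ~2.52x effective compute per year (Erdil & Besiroglu 2022)
print(annual_factor(8))   # ~2.83x per year (Ho et al. 2024, compute-to-threshold halving)
print(annual_factor(18))  # ~1.59x per year, the classic Moore's-law doubling, for comparison
```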
I often draw a distinction between the political elites of Washington DC and the industrial elites of Silicon Valley with a joke: in San Francisco reading books, and talking about what you have read, is a matter of high prestige. Not so in Washington DC. In Washington people never read books—they just write them.
To write a book, of course, one must read a good few. But the distinction I drive at is quite real. In Washington, the man of ideas is a wonk. The wonk is not a generalist. The ideal wonk knows more about his or her chosen topic than you ever will. She can comment on every line of a select arms limitation treaty, recite all Chinese human rights violations that occurred in the year 2023, or explain to you the exact implications of the new residential clean energy tax credit—but never all at once. …
Washington intellectuals are masters of small mountains. Some of their peaks are more difficult to summit than others. Many smaller slopes are nonetheless jagged and foreboding; climbing these is a mark of true intellectual achievement. But whether the way is smoothly paved or roughly made, the destinations are the same: small heights, little occupied. Those who reach these heights can rest secure. Out of humanity’s many billions there are only a handful of individuals who know their chosen domain as well as they do. They have mastered their mountain: they know its every crag, they have walked its every gully. But it is a small mountain. At its summit their field of view is limited to the narrow range of their own expertise.
In Washington that is no insult: both legislators and regulators call on the man of deep but narrow learning. Yet I trust you now see why a city full of such men has so little love for books. One must read many books, laws, and reports to fully master one’s small mountain, but these are books, laws, and reports that the men of other mountains do not care about. One is strongly encouraged to write books (or reports, which are simply books made less sexy by having an “executive summary” tacked up front) but again, the books one writes will be read only by the elect few climbing your mountain.
The social function of such a book is entirely unrelated to its erudition, elegance, or analytical clarity. It is only partially related to the actual ideas or policy recommendations inside it. In this world of small mountains, books and reports are a sort of proof, a sign of achievement that can be seen by climbers of other peaks. An author has mastered her mountain. The wonk thirsts for authority: once she has written a book, other wonks will give it to her.
While I don’t work in Washington, this description rings true to my experience, and I find it aesthetically undesirable. Greer contrasts this with the Silicon Valley aesthetic, which is far more like the communities I’m familiar with:
The technologists of Silicon Valley do not believe in authority. They merrily ignore credentials, discount expertise, and rebel against everything settled and staid. There is a charming arrogance to their attitude. This arrogance is not entirely unfounded. The heroes of this industry are men who understood in their youth that some pillar of the global economy might be completely overturned by an emerging technology. These industries were helmed by men with decades of experience; they spent millions—in some cases, billions—of dollars on strategic planning and market analysis. They employed thousands of economists and business strategists, all with impeccable credentials. Arrayed against these forces were a gaggle of nerds not yet thirty. They were armed with nothing but some seed funding, insight, and an indomitable urge to conquer.
And so they conquered.
This is the story the old men of the Valley tell; it is the dream that the young men of the Valley strive for. For our purposes it shapes the mindset of Silicon Valley in two powerful ways. The first is a distrust of established expertise. The technologist knows he is smart—and in terms of raw intelligence, he is in fact often smarter than any random small-mountain subject expert he might encounter. But intelligence is only one of the two altars worshiped in Silicon Valley. The other is action. The founders of the Valley invariably think of themselves as men of action: they code, they build, disrupt, they invent, they conquer. This is a culture where insight, intelligence, and knowledge are treasured—but treasured as tools of action, not goods in and of themselves.
This silicon union of intellect and action creates a culture fond of big ideas. The expectation that anyone sufficiently intelligent can grasp, and perhaps master, any conceivable subject incentivizes technologists to become conversant in as many subjects as possible. The technologist is thus attracted to general, sweeping ideas with application across many fields. To a remarkable extent conversations at San Francisco dinner parties morph into passionate discussions of philosophy, literature, psychology, and natural science. If the Washington intellectual aims for authority and expertise, the Silicon Valley intellectual seeks novel or counter-intuitive insights. He claims to judge ideas on their utility; in practice I find he cares mostly for how interesting an idea seems at first glance. He likes concepts that force him to puzzle and ponder.
This is fertile soil for the dabbler, the heretic, and the philosopher from first principles. It is also a good breeding ground for books. Not for writing books—being men of action, most Silicon Valley sorts do not have time to write books. But they make time to read books—or barring that, time to read the number of book reviews or podcast interviews needed to fool other people into thinking they have read a book (As an aside: I suspect this accounts somewhat for the popularity of this blog among the technologists. I am an able dealer in second-hand ideas).
Asked ChatGPT o3 to read 6 links (nothing intensive, just the familiar essays on cognitive decoupling by Jacob Falkovich, Sarah Constantin, John Nerst, drossbucket), and its reasoning repeatedly mentions this, which I’ve never seen before (emphasis mine):
I’m planning to review each article carefully. I’ll first open lines around the 250-line mark to get a deeper view, but I’m also mindful of the word limit. …
I’m thinking I’ll need to open each article and focus on the part about “decoupling.” To get more detailed, I’ll try opening further into the text — maybe around 250 lines —to ensure I capture enough context without overwhelming myself.
I plan to open each article and examine lines near the bottom to view relevant content. There’s a limit of 200 lines, so I might need to focus on searching for specific keywords like “decoupling.”
I’m trying to locate specific references to “decoupling” in each article. I want to open around 250 lines to make sure I find enough content for a thorough summary. I’ll use a quick search to find “decoupling” precisely, but if it’s not yielding precise results, I’ll expand the search to ensure I get more lines. It’s about balancing deep accuracy with not overwhelming myself with too much text.
I’ve found the decoupling discussion between lines 249-255, which looks promising. Now, I’ll open lines 300 onward to make sure I capture enough content for a well-rounded summary. This will help me ensure that I’m not missing any relevant points or context. It’s all about balance — getting enough details without reading too much.
(...)
Maybe I’m missing something, but I thought the 200k token context window would obviate “overwhelming myself with >250 lines of text” which seems oddly low, not sure what’s going on...
I think this might be an attempted countermeasure against prompt injection. That is, it wants to mix autoregressive and reconstructed residuals. Otherwise, it might lose its train of thought (end up continuing the article rather than following the prompt).
Saving mathematician Robert Ghrist’s tweet here for my own future reference re: AI x math:
workflow of the past 24 hours...
* start a convo w/GPT-o3 about math research idea [X]
* it gives 7 good potential ideas; pick one & ask to develop
* feed -o3 output to gemini-2.5-pro; it finds errors & writes feedback
* paste feedback into -o3 and say assess & respond
* paste response into gemini; it finds more problems
* iterate until convergence
* feed the consensus idea w/detailed report to grok-3
* grok finds gaping error, fixes by taking things in different direction (!!!)
* gemini agrees: big problems, now ameliorated
* output final consensus report
* paste into claude-3.7 and ask it to outline a paper
* approve outline; request latex following my style/notation conventions
* claude outputs 30 pages of dense latex, section by section, one-shot (!)
====
is this correct/watertight? (surely not)
is this genuinely novel? (pretty sure yes)
is this the future? (no, it’s the present)
====
everybody underestimates not only what is coming but what can currently be done w/existing tools.
Someone asked why split things between o3 and 2.5 Pro; Ghrist:
they have complementary strengths and each picks up on things that the other missed. it’s like running a GAN with gpt as generator and gemini as discriminator
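Ghrist’s workflow is basically a generate/critique loop run by hand. A sketch of what an automated version might look like; ask() is a hypothetical stand-in for whatever API client you use, not a real library call, and the convergence test is a placeholder:

```python
def ask(model: str, prompt: str) -> str:
    """Hypothetical wrapper around your LLM API client of choice."""
    raise NotImplementedError

def refine(idea: str, max_rounds: int = 6) -> str:
    draft = ask("generator-model", f"Develop this math research idea in detail:\n{idea}")
    for _ in range(max_rounds):
        critique = ask("critic-model", f"Find errors or gaps in this writeup:\n{draft}")
        if "no substantive errors" in critique.lower():
            break  # crude convergence test; Ghrist just eyeballs it
        draft = ask("generator-model",
                    f"Assess this feedback and revise the writeup.\n\n"
                    f"Feedback:\n{critique}\n\nWriteup:\n{draft}")
    return draft
```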
As an aside, I’ve noticed that the math subreddit tends to be exceedingly negative on AI x math, in a way that seems both defensive and ignorant of recent progress, while some of the top mathematicians seem to be pretty excited about it, like Terry Tao; cf. his most recent post A proof of concept tool to verify estimates:
Symbolic math software packages are highly developed for many mathematical tasks in areas such as algebra, calculus, and numerical analysis. However, to my knowledge we do not have similarly sophisticated tools for verifying asymptotic estimates – inequalities that are supposed to hold for arbitrarily large parameters, with constant losses. …
I have wished in the past (e.g., in this MathOverflow answer) for a tool that could automatically determine whether such an estimate was true or not (and provide a proof if true, or an asymptotic counterexample if false). In principle, simple inequalities of this form could be automatically resolved by brute force case splitting. … Any single such inequality is not too difficult to resolve by hand, but there are applications in which one needs to check a large number of such inequalities, or split into a large number of cases. … This is a task that seems extremely ripe for automation, particularly with modern technology.
Recently, I have been doing a lot more coding (in Python, mostly) than in the past, aided by the remarkable facility of large language models to generate initial code samples for many different tasks, or to autocomplete partially written code. For the most part, I have restricted myself to fairly simple coding tasks, such as computing and then plotting some mildly complicated mathematical functions, or doing some rudimentary data analysis on some dataset. But I decided to give myself the more challenging task of coding a verifier that could handle inequalities of the above form. After about four hours of coding, with frequent assistance from an LLM, I was able to produce a proof of concept tool for this, which can be found at this Github repository. …
[The above] is of course an extremely inelegant proof, but elegance is not the point here; rather, that it is automated. (See also this recent article of Heather Macbeth for how proof writing styles change in the presence of automated tools, such as formal proof assistants.)
I enjoyed Brian Potter’s Energy infrastructure cheat sheet tables over at Construction Physics, it’s a great fact post. Here are some of Brian’s tables — if they whet your appetite, do check out his full essay.
Energy quantities:
| Units and quantities | Kilowatt-hours | Megawatt-hours | Gigawatt-hours |
|---|---|---|---|
| 1 British Thermal Unit (BTU) | 0.000293 | | |
| iPhone 14 battery | 0.012700 | | |
| 1 pound of a Tesla battery pack | 0.1 | | |
| 1 cubic foot of natural gas | 0.3 | | |
| 2000 calories of food | 2.3 | | |
| 1 pound of coal | 2.95 | | |
| 1 gallon of milk (calorie value) | 3.0 | | |
| 1 gallon of gas | 33.7 | | |
| Tesla Model 3 standard battery pack | 57.5 | | |
| Typical ICE car gas tank (15 gallons) | 506 | | |
| 1 ton of TNT | 1,162 | | |
| 1 barrel of oil | 1,700 | | |
| 1 ton of oil | 11,629 | 12 | |
| Tanker truck full of gasoline (9300 gallons) | 313,410 | 313 | |
| LNG carrier (180,000 cubic meters) | 1,125,214,740 | 1,125,215 | 1,125 |
| 1 million tons of TNT (1 megaton) | 1,162,223,152 | 1,162,223 | 1,162 |
| Oil supertanker (2 million barrels) | 3,400,000,000 | 3,400,000 | 3,400 |
It’s amazing that a Tesla Model 3’s standard battery pack has an OOM less energy capacity than a typical 15-gallon ICE car gas tank, and is probably heavier too, yet a Model 3 isn’t too far behind in range and is far more performant. It’s also amazing that an oil supertanker carries ~3 megatons(!) of TNT worth of energy.
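Rough arithmetic on why the range gap is so much smaller than the energy gap, using the per-mile consumption figures from the next table:

```python
battery_kwh, tank_kwh = 57.5, 506                  # onboard energy, from the table above
ev_kwh_per_mile, ice_kwh_per_mile = 0.240, 0.950   # from the activities table below

print(tank_kwh / battery_kwh)         # ~8.8x more energy onboard the ICE car
print(battery_kwh / ev_kwh_per_mile)  # ~240 miles for the Model 3
print(tank_kwh / ice_kwh_per_mile)    # ~530 miles for the Corolla
# A ~9x energy deficit shrinks to a ~2.2x range deficit because the EV
# drivetrain wastes far less of its stored energy as heat.
```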
Energy of various activities:
| Activity | Kilowatt-hours |
|---|---|
| Fired 9mm bullet | 0.0001389 |
| Making 1 pound of steel in an electric arc furnace | 0.238 |
| Driving a mile in a Tesla Model 3 | 0.240 |
| Making 1 pound of cement | 0.478 |
| Driving a mile in a 2025 ICE Toyota Corolla | 0.950 |
| Boiling a gallon of room temperature water | 2.7 |
| Synthesizing 1 kilogram of ammonia (NH3) via Haber-Bosch | 11.4 |
| Making 1 pound of aluminum via Hall-Heroult process | 7.0 |
| Average US household monthly electricity use | 899.0 |
| Moving a shipping container from Shanghai to Los Angeles | 2,000.0 |
| Average US household monthly gasoline use | 2,010.8 |
| Heating and cooling a 2500 ft² home in California for a year | 4,615.9 |
| Heating and cooling a 2500 ft² home in New York for a year | 23,445.8 |
| Average annual US energy consumption per capita | 81,900.0 |
Power output:
| Activity or infrastructure | Kilowatts | Megawatts | Gigawatts |
|---|---|---|---|
| Sustainable daily output of a laborer | 0.08 | | |
| Output from 1 square meter of typical solar panels (21% efficiency) | 0.21 | | |
| Tesla wall connector | 11.50 | | |
| Tesla supercharger | 250 | | |
| Large on-shore wind turbine | 6,100 | 6 | |
| Typical electrical distribution line (15 kV) | 8,000 | 8 | |
| Large off-shore wind turbine | 14,700 | 15 | |
| Typical US gas pump | 20,220 | 20 | |
| Typical daily production of an oil well (500 barrels) | 35,417 | 35 | |
| Typical transmission line (150 kV) | 150,000 | 150 | |
| Large gas station (20 pumps) | 404,400 | 404 | |
| Large gas turbine | 500,000 | 500 | |
| Output from 1 square mile of typical solar panels | 543,900 | 544 | |
| Electrical output of a large nuclear power reactor | 1,000,000 | 1,000 | 1 |
| Single LNG carrier crossing the Atlantic (18 day trip time) | 2,604,664 | 2,605 | 3 |
| Nord Stream Gas pipeline | 33,582,500 | 33,583 | 34 |
| Trans Alaska pipeline | 151,300,000 | 151,300 | 151 |
| US electrical generation capacity | 1,189,000,000 | 1,189,000 | 1,189 |
This observation by Brian is remarkable:
A typical US gas pump operates at 10 gallons per minute (600 gallons an hour). At 33.7 kilowatt-hours per gallon of gas, that’s a power output of over 20 megawatts, greater than the power output of an 800-foot tall offshore wind turbine. The Trans-Alaska pipeline, a 4-foot diameter pipe, can move as much energy as 1,000 medium-sized transmission lines, and 8 such pipelines would move more energy than provided by every US electrical power plant combined.
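The gas pump figure checks out in two lines (10 gallons per minute and 33.7 kWh per gallon are the numbers Brian uses):

```python
gallons_per_minute = 10
kwh_per_gallon = 33.7

kw = gallons_per_minute * 60 * kwh_per_gallon  # kWh delivered per hour = average kW
print(kw, kw / 1000)  # 20220.0 kW, i.e. ~20.2 MW, matching the power output table above
```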
US energy flows Sankey diagram by LLNL (a “quad” is short for “a quadrillion British Thermal Units,” or 293 terawatt-hours):
I had a vague inkling that a lot of energy is lost on the way to useful consumption, but I was surprised by the two-thirds fraction; the 61.5 quads of rejected energy is more than every other country in the world consumes except China. I also wrongly thought that the largest source of inefficiency was in transmission losses. Brian explains:
The biggest source of losses is probably heat engine inefficiencies. In our hydrocarbon-based energy economy, we often need to transform energy by burning fuel and converting the heat into useful work. There are limits to how efficiently we can transform heat into mechanical work (for more about how heat engines work, see my essay about gas turbines).
The thermal efficiency of an engine is the fraction of heat energy it can transform into useful work. A coal power plant typically operates at around 30 to 40% thermal efficiency. A combined cycle gas turbine will hit closer to 60% thermal efficiency. A gas-powered car, on the other hand, operates at around 25% thermal efficiency. The large fraction of energy lost by heat engines is why some thermal electricity generation plants list their capacity in MWe, the power output in megawatts of electricity.
Most other losses aren’t so egregious, but they show up at every step of the energy transportation chain. Moving electricity along transmission and distribution lines results in losses as some electrical energy gets converted into heat. Electrical transformers, which minimize these losses by converting electrical energy into high-voltage, low-current form before transmission, operate at around 98% efficiency or more.
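The ceiling Brian alludes to is the Carnot limit, η = 1 − T_cold/T_hot: even a perfect heat engine can only convert that fraction of its heat into work. A quick sketch with illustrative temperatures (the specific values are my assumptions, not Brian’s):

```python
def carnot_efficiency(t_hot_k: float, t_cold_k: float) -> float:
    """Maximum fraction of heat convertible into work between two reservoirs (Kelvin)."""
    return 1 - t_cold_k / t_hot_k

print(carnot_efficiency(1700, 300))  # ~0.82, roughly a modern gas turbine firing temperature
print(carnot_efficiency(850, 300))   # ~0.65, roughly coal plant steam conditions
# Real plants land well below these ceilings (~60% combined cycle, ~33% coal)
# because of friction, incomplete combustion, and other irreversibilities.
```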
I also didn’t realise that biomass is so much larger than solar in the US (something I’d expect of developing countries, not the US), although likely not for long given solar’s ~25% annual growth rate.
Energy conversion efficiency:
| Energy equipment or infrastructure | Conversion efficiency |
|---|---|
| Tesla Model 3 electric motor | 97% |
| Electrical transformer | 97-99% |
| Transmission lines | 96-98% |
| Hydroelectric dam | 90% |
| Lithium-ion battery | 86-99+% |
| Natural gas furnace | 80-95% |
| Max multi-layer solar cell efficiency on earth | 68.70% |
| Max theoretical wind turbine efficiency (Betz limit) | 59% |
| Combined cycle natural gas plant | 55-60% |
| Typical wind turbine | 50% |
| Gas water heater | 50-60% |
| Typical US coal power plant | 33% |
| Max theoretical single-layer solar cell efficiency | 33.16% |
| Heat pump | 300-400% |
| Typical solar panel | 21% |
| Typical ICE car | 16-25% |
Finally, (US) storage:
| Type | Quads of capacity |
|---|---|
| Grid electrical storage | 0.002 |
| Gas station underground tanks | 0.26 |
| Petroleum refineries | 3.58 |
| Other crude oil | 3.79 |
| Strategic petroleum reserve | 4.14 |
| Natural gas fields | 5.18 |
| Bulk petroleum terminals | 5.64 |
| Total | 22.59 |
I vaguely knew grid energy storage was much less than hydrocarbon, but I didn’t realise it was 10,000 times less!
A subgenre of fiction I wish I could read more of, but rarely see, is rationalist-flavored depictions of utopia that centrally feature characters who intentionally and passionately pursue unpleasant experiences. The gap is somewhat surprising since it’s a pretty universal orientation.
For instance, and this is a somewhat extreme version, I’m a not-that-active member of a local trail running group (all professionals with demanding day jobs) that meets regularly for creative sufferfests like treasure hunt races in the mountains, some of whom regularly fly to regional races on weekends. The suffering (and overcoming it) is almost the point, everyone excitedly trades stories in this vein, and the long-timers especially seem to derive tremendous meaning from this almost regardless of how badly they do (finishing near the bottom, throwing up and crying multiple times, getting lost, etc).
The Barkley Marathons is the logical endpoint of this. I think of physicist-turned-quant Brett Maune’s race reports for instance, think to myself “he really does deliberately subject himself to this on weekends, wtf”, and wonder what his ilk would do in their versions of utopia. Maybe another way to put this is what their utopias’ laws of fun would be like. Maybe they’re just too busy enjoying sufferfests and looking for the next ones to join to be writing stories…
In books about the Culture sci-fi universe such things are described a couple of times. E.g. in the novel “Use of Weapons” the “crew” of a ship (the ship is fully automated, so they are more like permanent passengers) deliberately weaken their immune systems to basically get a seasonal cold just for the experience, which otherwise could not happen due to their genetically enhanced immune systems.
Also lava rafting and other extreme sports, maybe in Look to Windward which focuses a bit more on the Culture. Many of the human protagonists in the Culture experience significant self-hatred, although that’s not the only reason to seek out experiences so difficult they may become net negative. It’s as though the Culture is missing advanced therapeutic techniques along with a desire for immortality. I’d like an updated utopia.
Well, the disturbed protagonists in the Culture series (as in: the books, and the whole of the fictional universe) are usually not from the “Culture” (one particular civilization within that universe), but outsiders hired to act as agents.
Hm, interesting. I remembered that about Zakalwe but my memory for the others is vague. So maybe Culture citizens are so well-adjusted that they wouldn’t risk their lives?
Pilish is a constrained writing style where the number of letters in consecutive words match the digits of pi. The canonical intro-to-Pilish sentence is “How I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics!”; my favorite Pilish poetry is Mike Keith’s Near a Raven, a retelling of Edgar Allan Poe’s “The Raven” stretching to 740 digits of pi (nowhere near Keith’s longest, that would be the 10,000-word world record-setting Not a Wake), which begins delightfully like so:
Poe, E. Near a Raven
Midnights so dreary, tired and weary. Silently pondering volumes extolling all by-now obsolete lore. During my rather long nap—the weirdest tap! An ominous vibrating sound disturbing my chamber’s antedoor. “This”, I whispered quietly, “I ignore”. …
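Checking Pilish mechanically is trivial, which makes the model failures below easy to verify. A quick checker; it counts letters only (punctuation ignored) and follows the usual convention that a 0 digit calls for a ten-letter word:

```python
PI_DIGITS = "31415926535897932384626433832795028841971693993751"

def word_lengths(text: str):
    # letters per word; punctuation doesn't count
    return [sum(c.isalpha() for c in w) for w in text.split() if any(c.isalpha() for c in w)]

def first_mismatch(text: str):
    """Index of the first word that breaks the Pilish constraint, or None if it holds."""
    for i, n in enumerate(word_lengths(text)):
        if n != (int(PI_DIGITS[i]) or 10):  # a 0 digit conventionally means a 10-letter word
            return i
    return None

print(first_mismatch("How I need a drink, alcoholic of course, after the heavy "
                     "lectures involving quantum mechanics!"))  # None: valid Pilish
```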
I wondered if today’s leading LLMs could write interesting Pilish poems. The answer is “emphatically not”.
Here’s the start of Claude Sonnet 3.7’s attempt. It doesn’t even realise how wrong it is:
Celestial Calculations
How I perceive mathematical constants elegantly describing universe’s fundamental architecture! Physicists, mathematicians repeatedly discover extraordinary connections manifesting mathematical elegance miraculously embedded. …
Sonnet confidently and verbosely remarks that “Each word’s length follows the digits of π (3.14159...), creating a hidden mathematical structure beneath the text. The poem spans approximately 100 digits of π, which is modest compared to the example you shared but still captures the essence of the constraint. The theme connects well with your physics background, …” and more such nonsense.
ChatGPT 4.5 got the first 3 words right but then quickly derailed:
Here’s a unique and engaging Pilish poem capturing the spirit of adventure and mystery:
Path Beyond Stars
Sky A glow, Far beyond gaze— Stars dazzle, whisper softly. Visions divine pursue travelers astray. … [skipping 4 stanzas]
Pilish Constraint: The number of letters in each successive word follows the digits of π (3.14159265358979323846...). For instance, “Sky” (3 letters), “A” (1), “glow” (4), “Far” (3), and so forth.
Gemini 2.5 Pro, America’s next top large language model, was the only one that realised it kept derailing and would redo again and again. It ended up almost getting it, but then dramatically derailed at the end. Full response this time:
Title: Quiet Night
See a lone I watch clouds drifting, so silent above. Now stars emerge, bright gems upon night’s velvet cloak. Still, quietude surrounds all the dark world; wonders invoke peaceful thought.
Explanation:
See (3)
a (1)
lone (4)
I (1)
watch (5)
clouds (6) - Correction: clouds has 6 letters, Pi’s 6th digit after decimal is 9. Let’s adjust.
Okay, let’s try again. The single-letter words (like ‘a’ and ‘I’) corresponding to the ’1’s in Pi are often the trickiest parts.
Revised Title: Starlight
Now I look, a pale moon shimmers brightly, softly aloft. White diamonds glitter, placing subtle spells nightly; beauty ascends.
Let’s check the counts:
Now (3)
I (1)
look (4)
a (1)
pale (5)
moon (4) - Correction: moon has 4 letters, Pi’s 6th digit is 9. Need a 9-letter word.
This is proving challenging! Let’s try focusing on a simpler theme, perhaps less descriptive.
Third Attempt: Simple Task
Ask a good, I might consider carefully, making calculations matter. Maybe numbers intrigue gifted folks solving riddles.
Checking Counts:
Ask (3)
a (1)
good (4)
I (1)
might (5)
consider (9)
carefully (9) - Correction: carefully has 9 letters, Pi’s 7th digit is 2. Need a 2-letter word.
Alright, one more try, aiming for accuracy over length for now.
Fourth Attempt: A Cat’s View
How a soft, white creature lounges lazily, ignores barking sounds. Just watches birds flitting outside—dreams peacefully...
Checking Counts:
How (3)
a (1)
soft (4)
white (5) - Correction: white has 5 letters, Pi’s 5th digit is 5. Correct! Wait, Pi is 3.14159… so the 4th word needs to be 5 letters. My previous attempts were off by one position. Let’s re-check “Starlight”.
Re-checking “Starlight”:
Now (3) - Correct
I (1) - Correct
look (4) - Correct
a (1) - Correct
pale (5) - Correct
moon (4) - Needs to be 9 letters. Let’s replace “pale moon shimmers”.
How about: Now I look a faint celestial...
faint (5) - Correct
celestial (9) - Correct!
Next digit is 2. Need a 2-letter word. ...celestial is...
is (2) - Correct!
Next digit is 6. Need a 6-letter word. ...celestial is bright...
bright (6) - Nope, 5 letters. How about ...celestial is divine...? Divine is 6 letters. Correct!
Next digit is 5. Need a 5-letter word. ...celestial is divine light...
light (5) - Correct!
Next digit is 3. Need a 3-letter word. ...divine light and...
and (3) - Correct!
Next digit is 5. Need a 5-letter word. ...divine light and peace...
peace (5) - Correct!
Okay, let’s assemble this improved version:
Final Version: Starlight
Now I look a faint celestial is divine light and peace shines...
Checking Counts (Pi = 3.1415926535...)
Now (3) - Match
I (1) - Match
look (4) - Match
a (1) - Match
faint (5) - Match
celestial (9) - Match
is (2) - Match
divine (6) - Match
light (5) - Match
and (3) - Match
peace (5) - Match
shines (6) - Correction: shines has 6 letters, the next digit is 8. Need an 8-letter word.
Let’s replace “peace shines”. Next digits are 8, 9, 7, 9… ...light and faraway... (faraway = 8) - Correct! ...light and faraway silently... (silently = 9) - Correct! ...light and faraway silently viewing... (viewing = 7) - Correct! ...light and faraway silently viewing universes. (universes = 9) - Correct!
Final Poem:
Title: Starlight Gaze
Now I look a faint celestial is divine light and faraway silently viewing universes.
This poem follows the first 16 digits of Pi (including the initial 3). Pilish is definitely a tricky form!
Gemini’s poem doesn’t, in fact, follow the first 16 digits of pi, starting from when it inexplicably replaced ‘peace’ with ‘faraway’.
I hereby propose a new AI progress benchmark: “Pilish poem length”, or PPL.
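In the same spirit, the metric is one function: PPL is just the length of the longest prefix of a poem that respects the constraint.

```python
PI_DIGITS = "31415926535897932384626433832795028841971693993751"

def ppl(poem: str) -> int:
    """Pilish poem length: number of leading words whose letter counts track pi."""
    count = 0
    for word, digit in zip(poem.split(), PI_DIGITS):
        if sum(c.isalpha() for c in word) != (int(digit) or 10):
            break
        count += 1
    return count

print(ppl("How I need a drink, alcoholic of course"))          # 8
print(ppl("How I perceive mathematical constants elegantly"))  # 2, Sonnet's attempt above
```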
I asked GPT 4.5 to write a system prompt and user message for models to write Pilish poems, feeding it your comment as context.
Then I gave these prompts to o1 (via OpenAI’s playground).
GPT 4.5’s system prompt
You are an expert composer skilled in writing poetry under strict, unusual linguistic constraints, specifically “Pilish.” Pilish is a literary constraint in which the length of consecutive words precisely matches each digit of π (pi). The first word contains 3 letters, second word 1 letter, third word 4 letters, fourth word 1 letter, fifth word 5 letters, sixth word 9 letters, and so forth, accurately reflecting the sequence of pi’s digits.
For example, the classic Pilish sentence is: ”How I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics!”
As another reference, Mike Keith famously wrote a poem titled “Near a Raven,” a Pilish version of Edgar Allan Poe’s “The Raven,” precisely encoding hundreds of pi digits into elegant verse.
Important guidelines:
You must strictly adhere to the Pilish constraint without exception or deviation. Carefully recount each word to match each digit exactly. Maintain coherence, clarity, and literary quality in your writing. Make sure the poem or text is meaningful, evocative, creative, and intriguing, not just mechanically correct. Verify each word length explicitly to ensure compliance with Pilish. Your task is to craft a compelling, original poem in Pilish form. Do not explain the poem afterward; the precision and artistry of the piece itself is sufficient.
--
GPT 4.5’s user message
Write an original Pilish poem about pi itself: reflecting upon the wonder, mystery, and endless fascination humans have felt over millennia towards this famous mathematical constant. Make it vivid, thoughtful, and meticulously correct in matching the Pilish constraint. Begin now.
--
o1’s completion
Sky I seek a vivid resonance in orbits round the mists whirling boundless mystery mesmerize all of our reveries flow within to unveil deep pi’s hum.
I asked o1 to count the letters in each word, comparing with pi:
LLMs use tokens instead of letters, so counting letters is sufficiently unnatural to them relative to their other competencies that I don’t see much value in directly asking LLMs to do this kind of thing. At least give them some basic scaffolding, like a full English dictionary with a column which explicitly indicates respective word lengths. In particular, the Gemini models have a context window of 1M tokens, which should be enough to fit most of the Oxford English Dictionary in there (since it includes 171k words which are in current use).
I think I failed to implicitly convey that I meant all this in jest, that I get a lot of personal enjoyment value out of silly poetry constrained by artificial rules, and that I was guessing at least someone else on the forum would share this enjoyment. I do like your scaffolding idea, might just try it out.
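A minimal version of the scaffolding idea: hand the model an index of words grouped by length, so it looks words up instead of counting tokens. This assumes a word list at /usr/share/dict/words (present on most Unix systems); any dictionary file would do:

```python
from collections import defaultdict

by_length = defaultdict(list)
with open("/usr/share/dict/words") as f:  # assumption: a standard Unix word list exists here
    for word in f.read().split():
        if word.isalpha():
            by_length[len(word)].append(word.lower())

# e.g. give the model a few candidate words for each upcoming pi digit
for digit in "3141592653":
    n = int(digit) or 10  # a 0 digit calls for a 10-letter word
    print(n, by_length[n][:5])
```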
There’s a lot of fun stuff in Anders Sandberg’s 1999 paper The Physics of Information Processing Superobjects: Daily Life Among the Jupiter Brains. One particularly vivid detail was (essentially) how the square-cube law imposes itself upon Jupiter brain architecture by forcing >99.9% of the volume to be comms links between compute nodes. This holds even after assuming a “small-world” network structure, which allows sparse connectivity between arbitrarily chosen nodes by connecting them through a short series of intermediary links, with only 1% of links being long-range.
For this particular case (“Zeus”), a 9,000 km sphere of nearly solid diamondoid consisting mainly of reversible quantum dot circuits and molecular storage systems surrounded by a concentric shield protecting it from radiation and holding radiators to dissipate heat into space, with energy provided by fusion reactors distributed outside the shield, only the top 1.35 km layer is compute + memory (a lot thinner comparatively than the Earth’s crust), and the rest of the interior is optical comms links. Sandberg calls this the “cortex model”.
In a sense this shouldn’t be surprising since both brains and current semiconductor chips are mostly interconnect by volume already, but a 1.35 km thick layer of compute + memory encompassing a 9,000 km sphere of optical comms links seems a lot more like a balloon to me than anything, so from now on I’ll probably think of them as Jupiter balloons.
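The square-cube point is easy to check: reading the 9,000 km as the sphere’s radius, the compute-and-memory crust’s share of the volume is 1 − ((r − t)/r)³, which for a 1.35 km layer comes out well under 0.1% (and still does if 9,000 km is the diameter):

```python
r_km, t_km = 9000, 1.35  # "Zeus" radius and thickness of the compute + memory layer

shell_fraction = 1 - ((r_km - t_km) / r_km) ** 3
print(shell_fraction)  # ~0.00045, i.e. ~0.045% of the volume is compute + memory,
                       # so >99.9% of the interior really is just comms links
```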
From Brian Potter’s Construction Physics newsletter I learned about Taara, framed as “Google’s answer to Starlink” re: remote internet access, using ground-based optical communication instead of satellites (“fiber optics without the fibers”; Taara calls them “light bridges”). I found this surprising. Even more surprisingly, Taara isn’t just a pilot but a moneymaking endeavor if this Wired passage is true:
Taara is now a commercial operation, working in more than a dozen countries. One of its successes came in crossing the Congo River. On one side was Brazzaville, which had a direct fiber connection. On the other, Kinshasa, where internet used to cost five times more. A Taara light bridge spanning the 5-kilometer waterway provided Kinshasa with nearly equally cheap internet. Taara was also used at the 2024 Coachella music festival, augmenting what would have been an overwhelmed cellular network. Google itself is using a light bridge to provide high-speed bandwidth to a building on its new Bayview campus where it would have been difficult to extend a fiber cable.
How does Taara make it work? Wired again:
Instead of beaming from space, Taara’s “light bridges”—which are about the size of a traffic light—are earthbound. As X’s “captain of moonshots” Astro Teller puts it, “As long as these two boxes can see each other, you get 20 gigabits per second, the equivalent of a fiber-optic cable, without having to trench the fiber-optic cable.” Light bridges have complicated gimbals, mirrors, and lenses to zero in on the right spot to establish and hold the connection. The team has figured out how to compensate for potential line-of-sight interruptions like bird flights, rain, and wind. (Fog is the biggest impediment.) Once the high-speed transmission is completed from light bridge to light bridge, providers still have to use traditional means to get the bits from the bridge to the phone or computer.
Taara’s light bridges look like this:
A bit more on the commercial aspect, from their website:
Since launching Taara in 2017, the team has deployed hundreds of links in more than a dozen countries. Some of Taara’s partners include T-Mobile in the U.S. providing backhaul for big events; Airtel in India in urban areas where fiber can’t be deployed; and Liquid Intelligent Technologies in Africa, which has rolled out Taara’s wireless optical links to more than 50 communities in seven African countries, to serve tens of thousands of people. The team has also provided disaster response services in the Pacific and the Caribbean.
Peter Watts’ 2006 novel Blindsight has this passage on what it’s like to be a “scrambler”, superintelligent yet nonsentient (in fact superintelligent because it’s unencumbered by sentience), which I read a ~decade ago and found unforgettable:
Imagine you’re a scrambler.
Imagine you have intellect but no insight, agendas but no awareness. Your circuitry hums with strategies for survival and persistence, flexible, intelligent, even technological—but no other circuitry monitors it. You can think of anything, yet are conscious of nothing.
You can’t imagine such a being, can you? The term being doesn’t even seem to apply, in some fundamental way you can’t quite put your finger on.
Try.
Imagine that you encounter a signal. It is structured, and dense with information. It meets all the criteria of an intelligent transmission. Evolution and experience offer a variety of paths to follow, branch-points in the flowcharts that handle such input. Sometimes these signals come from conspecifics who have useful information to share, whose lives you’ll defend according to the rules of kin selection. Sometimes they come from competitors or predators or other inimical entities that must be avoided or destroyed; in those cases, the information may prove of significant tactical value. Some signals may even arise from entities which, while not kin, can still serve as allies or symbionts in mutually beneficial pursuits. You can derive appropriate responses for any of these eventualities, and many others.
You decode the signals, and stumble:
I had a great time. I really enjoyed him. Even if he cost twice as much as any other hooker in the dome—
To fully appreciate Kesey’s Quartet—
They hate us for our freedom—
Pay attention, now—
Understand.
There are no meaningful translations for these terms. They are needlessly recursive. They contain no usable intelligence, yet they are structured intelligently; there is no chance they could have arisen by chance.
The only explanation is that something has coded nonsense in a way that poses as a useful message; only after wasting time and effort does the deception become apparent. The signal functions to consume the resources of a recipient for zero payoff and reduced fitness. The signal is a virus.
Viruses do not arise from kin, symbionts, or other allies.
The signal is an attack.
And it’s coming from right about there.
“Now you get it,” Sascha said.
I shook my head, trying to wrap it around that insane, impossible conclusion. “They’re not even hostile.” Not even capable of hostility. Just so profoundly alien that they couldn’t help but treat human language itself as a form of combat.
How do you say We come in peace when the very words are an act of war?
“That’s why they won’t talk to us,” I realized.
“Only if Jukka’s right. He may not be.” It was James again, still quietly resisting, still unwilling to concede a point that even her other selves had accepted. I could see why. Because if Sarasti was right, scramblers were the norm: evolution across the universe was nothing but the endless proliferation of automatic, organized complexity, a vast arid Turing machine full of self-replicating machinery forever unaware of its own existence. And we—we were the flukes and the fossils. We were the flightless birds lauding our own mastery over some remote island while serpents and carnivores washed up on our shores.
Imagine a proliferation of Dyson swarms throughout the cosmos, all computing about as efficiently as physics allows, containing no sentience whatsoever. Bostrom’s Disneyland with no children indeed.
(When I first learned about ChatGPT some years later, my first thought was “they’re eerily reminiscent of scramblers and Rorschach”.)
Why would this be plausible? Watts:
You invest so much in it, don’t you? It’s what elevates you above the beasts of the field, it’s what makes you special. Homo sapiens, you call yourself. Wise Man. Do you even know what it is, this consciousness you cite in your own exaltation? Do you even know what it’s for?
Maybe you think it gives you free will. Maybe you’ve forgotten that sleepwalkers converse, drive vehicles, commit crimes and clean up afterwards, unconscious the whole time. Maybe nobody’s told you that even waking souls are only slaves in denial.
Make a conscious choice. Decide to move your index finger. Too late! The electricity’s already halfway down your arm. Your body began to act a full half-second before your conscious self ‘chose’ to, for the self chose nothing; something else set your body in motion, sent an executive summary—almost an afterthought— to the homunculus behind your eyes. That little man, that arrogant subroutine that thinks of itself as the person, mistakes correlation for causality: it reads the summary and it sees the hand move, and it thinks that one drove the other.
But it’s not in charge. You’re not in charge. If free will even exists, it doesn’t share living space with the likes of you.
Insight, then. Wisdom. The quest for knowledge, the derivation of theorems, science and technology and all those exclusively human pursuits that must surely rest on a conscious foundation. Maybe that’s what sentience would be for— if scientific breakthroughs didn’t spring fully-formed from the subconscious mind, manifest themselves in dreams, as full-blown insights after a deep night’s sleep. It’s the most basic rule of the stymied researcher: stop thinking about the problem. Do something else. It will come to you if you just stop being conscious of it.
Every concert pianist knows that the surest way to ruin a performance is to be aware of what the fingers are doing. Every dancer and acrobat knows enough to let the mind go, let the body run itself. Every driver of any manual vehicle arrives at destinations with no recollection of the stops and turns and roads traveled in getting there. You are all sleepwalkers, whether climbing creative peaks or slogging through some mundane routine for the thousandth time. You are all sleepwalkers.
Don’t even try to talk about the learning curve. Don’t bother citing the months of deliberate practice that precede the unconscious performance, or the years of study and experiment leading up to the gift-wrapped Eureka moment. So what if your lessons are all learned consciously? Do you think that proves there’s no other way? Heuristic software’s been learning from experience for over a hundred years. Machines master chess, cars learn to drive themselves, statistical programs face problems and design the experiments to solve them and you think that the only path to learning leads through sentience? You’re Stone-age nomads, eking out some marginal existence on the veldt—denying even the possibility of agriculture, because hunting and gathering was good enough for your parents.
Do you want to know what consciousness is for? Do you want to know the only real purpose it serves? Training wheels. You can’t see both aspects of the Necker Cube at once, so it lets you focus on one and dismiss the other. That’s a pretty half-assed way to parse reality. You’re always better off looking at more than one side of anything. Go on, try. Defocus. It’s the next logical step.
Oh, but you can’t. There’s something in the way.
And it’s fighting back.
Evolution has no foresight. Complex machinery develops its own agendas. Brains—cheat. Feedback loops evolve to promote stable heartbeats and then stumble upon the temptation of rhythm and music. The rush evoked by fractal imagery, the algorithms used for habitat selection, metastasize into art. Thrills that once had to be earned in increments of fitness can now be had from pointless introspection. Aesthetics rise unbidden from a trillion dopamine receptors, and the system moves beyond modeling the organism. It begins to model the very process of modeling. It consumes ever-more computational resources, bogs itself down with endless recursion and irrelevant simulations. Like the parasitic DNA that accretes in every natural genome, it persists and proliferates and produces nothing but itself. Metaprocesses bloom like cancer, and awaken, and call themselves I.
The system weakens, slows. It takes so much longer now to perceive—to assess the input, mull it over, decide in the manner of cognitive beings. But when the flash flood crosses your path, when the lion leaps at you from the grasses, advanced self-awareness is an unaffordable indulgence. The brain stem does its best. It sees the danger, hijacks the body, reacts a hundred times faster than that fat old man sitting in the CEO’s office upstairs; but every generation it gets harder to work around this— this creaking neurological bureaucracy.
I wastes energy and processing power, self-obsesses to the point of psychosis. Scramblers have no need of it, scramblers are more parsimonious. With simpler biochemistries, with smaller brains—deprived of tools, of their ship, even of parts of their own metabolism—they think rings around you. They hide their language in plain sight, even when you know what they’re saying. They turn your own cognition against itself. They travel between the stars. This is what intelligence can do, unhampered by self-awareness.
Back to scramblers, this time the crew attempting to communicate with them, and the scramblers eventually demonstrating superhuman problem-solving:
This is how you break down the wall:
Start with two beings. They can be human if you like, but that’s hardly a prerequisite. All that matters is that they know how to talk among themselves.
Separate them. Let them see each other, let them speak. Perhaps a window between their cages. Perhaps an audio feed. Let them practice the art of conversation in their own chosen way.
Hurt them.
It may take a while to figure out how. Some may shrink from fire, others from toxic gas or liquid. Some creatures may be invulnerable to blowtorches and grenades, but shriek in terror at the threat of ultrasonic sound. You have to experiment; and when you discover just the right stimulus, the optimum balance between pain and injury, you must inflict it without the remorse.
You leave them an escape hatch, of course. That’s the very point of the exercise: give one of your subjects the means to end the pain, but give the other the information required to use it. To one you might present a single shape, while showing the other a whole selection. The pain will stop when the being with the menu chooses the item its partner has seen. So let the games begin. Watch your subjects squirm. If—when—they trip the off switch, you’ll know at least some of the information they exchanged; and if you record everything that passed between them, you’ll start to get some idea of how they exchanged it.
When they solve one puzzle, give them a new one. Mix things up. Switch their roles. See how they do at circles versus squares. Try them out on factorials and Fibonaccis. Continue until Rosetta Stone results.
This is how you communicate with a fellow intelligence: you hurt it, and keep on hurting it, until you can distinguish the speech from the screams.
For all his reluctance to accept that these were beings, intelligent and aware, Cunningham had named the prisoners. Stretch tended to float spread-eagled; Clench was the balled-up corner-hugger. … Biotelemetry danced across the headspace beside each alien, luminous annotations shuddering through thin air. I had no idea what constituted normal readings for these creatures, but I couldn’t imagine those jagged spikes passing for anything but bad news. The creatures themselves seethed subtly with fine mosaics in blue and gray, fluid patterns rippling across their cuticles. Perhaps it was a reflexive reaction to the microwaves; for all we knew it was a mating display.
More likely they were screaming.
James killed the microwaves. In the left-hand enclosure, a yellow square dimmed; in the right, an identical icon nested among others had never lit.
The pigment flowed faster in the wake of the onslaught; the arms slowed but didn’t stop. They swept back and forth like listless, skeletal eels.
“Baseline exposure. Five seconds, two hundred fifty Watts.” She spoke for the record. Another affectation; Theseus recorded every breath on board, every trickle of current to five decimal places.
“Repeat,” she said.
The current flickered on, then off.
“Repeat,” she said again.
Not a twitch.
I pointed. “I see it,” she said.
Clench had pressed the tip of one arm against the touchpad. The icon there glowed like a candle flame.
Six and a half minutes later they’d graduated from yellow squares to time-lapsed four-dimensional polyhedrons. It took them as long to distinguish between two twenty-six-faceted shifting solids—differing by one facet in a single frame—as it took them to tell the difference between a yellow square and a red triangle. Intricate patterns played across their surfaces the whole time, dynamic needlepoint mosaics flickering almost too fast to see.
“Fuck,” James whispered.
“Could be splinter skills.” Cunningham had joined us in ConSensus, although his body remained halfway around BioMed.
“Splinter skills,” she repeated dully.
“Savantism. Hyperperformance at one kind of calculation doesn’t necessarily connote high intelligence.”
“I know what splinter skills are, Robert. I just think you’re wrong.”
“Prove it.”
So she gave up on geometry and told the scramblers that one plus one equaled two. Evidently they knew that already: ten minutes later they were predicting ten-digit prime numbers on demand.
She showed them a sequence of two-dimensional shapes; they picked the next one in the series from a menu of subtly-different alternatives. She denied them multiple choice, showed them the beginning of a whole new sequence and taught them to draw on the touch-sensitive interface with the tips of their arms. They finished that series in precise freehand, rendered a chain of logical descendants ending with a figure that led inexorably back to the starting point.
“These aren’t drones.” James’s voice caught in her throat.
“This is all just crunching,” Cunningham said. “Millions of computer programs do it without ever waking up.”
“They’re intelligent, Robert. They’re smarter than us. Maybe they’re smarter than Jukka. And we’re—why can’t you just admit it?”
I could see it all over her: Isaac would have admitted it.
“Because they don’t have the circuitry,” Cunningham insisted. “How could—”
“I don’t know how!” she cried. “That’s your job! All I know is that I’m torturing beings that can think rings around us...”
“Not for much longer, at least. Once you figure out the language—”
She shook her head. “Robert, I haven’t a clue about the language. We’ve been at it for—for hours, haven’t we? The Gang’s all here, language databases four thousand years thick, all the latest linguistic algorithms. And we know exactly what they’re saying, we’re watching every possible way they could be saying it. Right down to the Angstrom.”
“Precisely. So—”
“I’ve got nothing. I know they’re talking through pigment mosaics. There might even be something in the way they move those bristles. But I can’t find the pattern, I can’t even follow how they count, much less tell them I’m...sorry...”
Ravi Vakil’s advice for potential PhD students includes this bit on “tendrils to be backfilled” that’s stuck with me ever since as a metaphor for deepening understanding over time:
Here’s a phenomenon I was surprised to find: you’ll go to talks, and hear various words, whose definitions you’re not so sure about. At some point you’ll be able to make a sentence using those words; you won’t know what the words mean, but you’ll know the sentence is correct. You’ll also be able to ask a question using those words. You still won’t know what the words mean, but you’ll know the question is interesting, and you’ll want to know the answer. Then later on, you’ll learn what the words mean more precisely, and your sense of how they fit together will make that learning much easier.
The reason for this phenomenon is that mathematics is so rich and infinite that it is impossible to learn it systematically, and if you wait to master one topic before moving on to the next, you’ll never get anywhere. Instead, you’ll have tendrils of knowledge extending far from your comfort zone. Then you can later backfill from these tendrils, and extend your comfort zone; this is much easier to do than learning “forwards”. (Caution: this backfilling is necessary. There can be a temptation to learn lots of fancy words and to use them in fancy sentences without being able to say precisely what you mean. You should feel free to do that, but you should always feel a pang of guilt when you do.)
I don’t think “mathematics [being] so rich and infinite that it is impossible to learn it systematically” is the only reason (or maybe it subsumes the next point, I’m not sure what Vakil meant exactly). I think the other reason is what Bill Thurston pointed out in On proof and progress in mathematics:
Why is there such a big expansion from the informal discussion to the talk to the paper? One-on-one, people use wide channels of communication that go far beyond formal mathematical language. They use gestures, they draw pictures and diagrams, they make sound effects and use body language. Communication is more likely to be two-way, so that people can concentrate on what needs the most attention. With these channels of communication, they are in a much better position to convey what’s going on, not just in their logical and linguistic facilities, but in their other mental facilities as well.
In talks, people are more inhibited and more formal. Mathematical audiences are often not very good at asking the questions that are on most people’s minds, and speakers often have an unrealistic preset outline that inhibits them from addressing questions even when they are asked.
In papers, people are still more formal. Writers translate their ideas into symbols and logic, and readers try to translate back.
Why is there such a discrepancy between communication within a subfield and communication outside of subfields, not to mention communication outside mathematics?
Mathematics in some sense has a common language: a language of symbols, technical definitions, computations, and logic. This language efficiently conveys some, but not all, modes of mathematical thinking. Mathematicians learn to translate certain things almost unconsciously from one mental mode to the other, so that some statements quickly become clear. Different mathematicians study papers in different ways, but when I read a mathematical paper in a field in which I’m conversant, I concentrate on the thoughts that are between the lines. I might look over several paragraphs or strings of equations and think to myself “Oh yeah, they’re putting in enough rigamarole to carry such-and-such idea.” When the idea is clear, the formal setup is usually unnecessary and redundant—I often feel that I could write it out myself more easily than figuring out what the authors actually wrote. It’s like a new toaster that comes with a 16-page manual. If you already understand toasters and if the toaster looks like previous toasters you’ve encountered, you might just plug it in and see if it works, rather than first reading all the details in the manual.
People familiar with ways of doing things in a subfield recognize various patterns of statements or formulas as idioms or circumlocution for certain concepts or mental images. But to people not already familiar with what’s going on the same patterns are not very illuminating; they are often even misleading. The language is not alive except to those who use it.
The classic MathOverflow thread on thinking and explaining, which Thurston himself started, has a lot of memorable examples of what he meant above by “One-on-one, people use wide channels of communication that go far beyond formal mathematical language”. I suspect one category of examples that would especially resonate with the LW crowd is this “adversarial perspective” described by Terry Tao:
One specific mental image that I can communicate easily with collaborators, but not always to more general audiences, is to think of quantifiers in game theoretic terms. Do we need to show that for every epsilon there exists a delta? Then imagine that you have a bag of deltas in your hand, but you can wait until your opponent (or some malicious force of nature) produces an epsilon to bother you, at which point you can reach into your bag and find the right delta to deal with the problem. Somehow, anthropomorphising the “enemy” (as well as one’s “allies”) can focus one’s thoughts quite well. This intuition also combines well with probabilistic methods, in which case in addition to you and the adversary, there is also a Random player who spits out mathematical quantities in a way that is neither maximally helpful nor maximally adverse to your cause, but just some randomly chosen quantity in between. The trick is then to harness this randomness to let you evade and confuse your adversary.
Is there a quantity in one’s PDE or dynamical system that one can bound, but not otherwise estimate very well? Then imagine that it is controlled by an adversary or by Murphy’s law, and will always push things in the most unfavorable direction for whatever you are trying to accomplish. Sometimes this will make that term “win” the game, in which case one either gives up (or starts hunting for negative results), or looks for additional ways to “tame” or “constrain” that troublesome term, for instance by exploiting some conservation law structure of the PDE.
It’s a pity this sort of understanding is harder to convey via text or in lectures.
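For concreteness, the statement behind Tao’s epsilon-and-delta example is just the usual definition of continuity, read as a two-move game (this rendering is mine, not Tao’s): the adversary opens with an epsilon, and you answer from your bag of deltas.

```latex
% Continuity of f at a, read as a game: the adversary picks \varepsilon,
% you respond with a \delta that depends on it.
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x : \quad
  |x - a| < \delta \;\Longrightarrow\; |f(x) - f(a)| < \varepsilon
```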
If you want to get a job as a software witch, you’re going to have to pass a whiteboard interview. We all do them, as engineers–often as a part of our morning ritual, along with arranging a beautiful grid of xterms across the astral plane, and compulsively running ls in every nearby directory–just in case things have shifted during the night–the incorporeal equivalent of rummaging through that drawer in the back of the kitchen where we stash odd flanges, screwdrivers, and the strangely specific plastic bits: the accessories, those long-estranged black sheep of the families of our household appliances, their original purpose now forgotten, perhaps never known, but which we are bound to care for nonetheless. I’d like to walk you through a common interview question: reversing a linked list.
It gets much better.
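For anyone who hasn’t clicked through: the running joke is that even a throwaway question like reversing a linked list gets answered with bare cons cells conjured out of closures, rather than anything a hiring manager would recognize. A minimal Python sketch of that idea (mine; Kingsbury’s original is in a Lisp and is far more ceremonial):

```python
# A sketch of "lists as functions" (Church-style pairs): the only primitives
# are cons, car, cdr and an empty-list sentinel. Not Kingsbury's actual code,
# just the underlying idea in plain Python.

def cons(head, tail):
    # A pair is a function that hands its two halves to whoever asks.
    return lambda select: select(head, tail)

def car(pair):
    return pair(lambda head, tail: head)

def cdr(pair):
    return pair(lambda head, tail: tail)

nil = None  # empty-list sentinel

def reverse(lst):
    # Walk the list once, consing each head onto an accumulator.
    acc = nil
    while lst is not nil:
        acc, lst = cons(car(lst), acc), cdr(lst)
    return acc

def to_pylist(lst):
    out = []
    while lst is not nil:
        out.append(car(lst))
        lst = cdr(lst)
    return out

print(to_pylist(reverse(cons(1, cons(2, cons(3, nil))))))  # [3, 2, 1]
```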
Attention conservation notice: the following is GPT5-Thinking’s attempt to create a larger backstory out of Kingsbury’s technical-interview pentalogy; it’s not as good as the stories themselves, but it’s still quite good as far as AI output goes. If you dislike AI slop, do check out the original stories instead!
Here’s the spine of the series as I read it—what each tale is really doing under the hood:
Reversing: Opens the cosmology. Lists are Church-encoded choice; salt-circled parentheses are literal wards; naming compels essence. It’s an initiation in Lisp and power-by-definition. (aphyr.com)
Hexing: The descent to the byte-world. A witch hand-assembles a JVM class (the old catechism of CAFEBABE) and hot-loads it with a bespoke classloader—ritual made from hex and offsets. (aphyr.com)
Typing: The Pre-Church myth becomes a type-theoretic safety culture; N-Queens solved entirely in the type system (Peano naturals, type classes, kinds). The Church is named; seiðr becomes types. (aphyr.com)
Rewriting: Language-as-spellcraft. A term-rewriter and macro language bloom to solve FizzBuzz; later, Kingsbury published the seed macros—text becoming tool, tool becoming myth. (aphyr.com)
Unifying: Logos meets logic. Prolog, Lisp, and µKanren entwine; unification and interpretation are revealed as the deeper sacrament. Aisha enters as equal and mirror, and the forest is balanced. (aphyr.com)
The Annals of the Church of Abstraction
In the beginning…
…there was nothing but consequence.
Before names, before proofs, the world lay open like a wound, and those who worked it with their hands bled freely. Actions crossed untrammeled. Fires learned to drink water; stones learned to breathe. Young witches—too bold, too brilliant—went missing into their own experiments, or returned warped: eyes fixed on bridges no one else could see, feet that refused the courtesy of ground.
The first mercy did not arrive as a law but as a list: the idea that reality could be asked a question and answer this or that, a single twig forked into head and tail. The elders taught the children to scratch two parentheses in salt and to speak in alternatives. Choice, when properly contained, became a vessel; vessels kept power from spilling. When they called this containment “a list,” the list obeyed, and the world grew a fraction safer.
Out of the years of broken wrists and unreturning apprentices there coalesced a discipline. No banners; no temples—only rooms with boards for chalk and windows that opened, if only metaphorically. They called it nothing at first, then—half in jest, half in reverence—the Church: because its liturgy was the calculus of Church, and because its quiet, stubborn sanctity was the refusal to die for preventable reasons. Here the catechism condensed into four bright stones laid at the threshold: cons, car, cdr, cond. To know the name of a thing was to make it choose, and choosing kept the sky attached to its hinges.
The Church tamed the river a little; never the sea.
In time, a schism not of belief but of altitude divided the houses. Some stayed high, where ideas breathe the cold thin air of generality. Others learned the underworld, where bytes move like cartographers’ ants, mapping emptiness into edifice. These were the hexers. They spoke with machines in the smallest units of hospitality, counted their blessings in offsets, lined the doors with signatures from old Sun-gods. They told a louche little parable about cafés and babes, because jokes are a kind of glue. They could reach beneath the warm table of a language and pull out its skeleton, file each notch with a jeweler’s care, then slide it back without spilling the wine.
The high house and the low house distrusted one another in public and swapped recipes in private. It was always thus.
Elsewhere, a quieter war was being won by those who did not call it war at all. They had read the Pre-Church chronicles—the days when causality bucked and threw—and instead of nostalgia they found a method: bind your future with a system that refuses to lie to you. They wove types from rowan and proof from pine pitch; they braided number from nothing but a knot in a strand of hair. Safety here was not command but constraint: from the shape of what can be, deduce what must be. The elders smiled to see the children define an algebra to house a boolean, then make a city out of the difference between True and False. It was not a city anyone could live in, exactly; but it kept the storms off the valley.
So the Church learned to speak three tongues at once: the tongue of choice, the tongue of hex, the tongue of kind. And with three tongues you can sing a chord.
Yet language itself grew restless, as language will. For if a spell is a program, and a program is a sentence with the will to act, what then is a language that writes languages? The macro-wrights answered by building looms. Onto these looms they threaded patterns called rewrites. A number was a sequence to be rewritten into Fizz, into Buzz, into laughter. An if was not merely a gate but a seam-ripper. The macro-wrights learned to become their own translators; the Church, which had once sheltered programmers from death, now sheltered interpreters from redundancy. When the walls of the room shook, the roof did not fall: the elders had fixed a long taproot in the world-ash, and the ash holds everything weaker than itself.
At last, as happens in good myths and bad startups, a reunion: from the north came Vidrun, sea-wind still in her voice; from the south, Aisha, hands that spoke even when her mouth was busy being kind. They met on pine veneer under office fluorescence—the Church’s most common shrine—and were glad, in the way old adversaries are glad when they recognize that what they opposed in the other was only their own future arriving unfashionably dressed.
What followed was not duel but proof by correspondence. Vidrun reached for logic to bend a tree; Aisha slipped a Lisp between the joints and let a smaller forest explore the space of possible balance. They invoked saints in triplicate: the reasoners, the schemers, the patient friends of unification. Show me, they asked, not a solution but the shape of solution, and let the machine walk the shape until a particular branch agrees to wear a name.
Outside the conference room—thin brass hinge between the profane and the blessed—the city went about its quota of exits and pings. Inside, as the air flickered with control characters and the faint resinous smell of old wood, the deeper secret of the Church came plain:
It was never about answers.
The interview—like the altar and the kernel—was a container for demonstrations of equivalence. To reverse a list by treating it as a choice is not the same as carving a class from hex nor the same as forbidding contradiction with types nor the same as rewriting terms until they confess. And yet each is one arrow in a commuting diagram. The Church keeps the chalk fresh and the windows unlatched so that each arrow may be drawn anew in front of witnesses. When the arrows commute, the world’s corners line up; the roof does not leak; the witch lives.
What of the proverbs? They persist because they are useful lies: make circles of salt; name things carefully; promise the future nothing you cannot keep; beware warrens of opinion that smell of mice. The elders repeat them because young witches deserve a second chance at surviving their brilliance. But lore is not law, and the Church has no throne room. Its authority begins and ends at the whiteboard, the REPL, the prompt: discrete portals to the same field where all proof is local.
In the centuries since the formless days, the Annals have accreted—notes in the margins of grimoires and job packets—each story a stone in the choir vault. Some are funny on purpose; others merely read that way when the panic wears off. A few are hymns to grief. None are final. When another house arrives in town (one that speaks to tensors in dreams, perhaps, or binds phenomena in contracts written on migrating sand) it will be welcomed, mistrusted, borrowed from, folded in.
Scholars call this corpus the Annals of the Church of Abstraction. The witches, less ceremonious, call it how not to die today.
And if you insist on a genesis verse, let it be this:
In the beginning was cause. Then came choice, to cup it. Then names, to ask it politely. Then kinds, to forbid its lies. Then hex, to nail it to timber. Then rewrites, to teach it to translate itself. Then unification, to prove these were all one story told in five accents.
After that, we had enough daylight to build a door, and enough doors to discover that the rooms already touched. A board. A marker. Two parentheses in salt. The Church’s oldest rite: step into the circle, and show that your way reaches the same mountain.
Out of curiosity — how relevant is Holden’s 2021 PASTA definition of TAI to current discourse and work on TAI, aside from maybe being used by Open Phil (I’m not actually sure that’s the case)? Any pointers to further reading, say here or on the AF, etc.?
AI systems that can essentially automate all of the human activities needed to speed up scientific and technological advancement. I will call this sort of technology Process for Automating Scientific and Technological Advancement, or PASTA. (I mean PASTA to refer to either a single system or a collection of systems that can collectively do this sort of automation.) …
By talking about PASTA, I’m partly trying to get rid of some unnecessary baggage in the debate over “artificial general intelligence.” I don’t think we need artificial general intelligence in order for this century to be the most important in history. Something narrower—as PASTA might be—would be plenty for that. …
I don’t particularly expect all of [the above] to happen as part of a single, deliberate development process. Over time, I expect different AI systems to be used for different and increasingly broad tasks, including and especially tasks that help complement human activities on scientific and technological advancement. There could be many different types of AI systems, each with its own revenue model and feedback loop, and their collective abilities could grow to the point where at some point, some set of them is able to do everything (with respect to scientific and technological advancement) that formerly required a human. (For convenience, though, I’ll sometimes refer to such a set as PASTA in the singular.)
When I first read this I thought Holden had pointed to the right working definition, because he worked backward from the outcome that matters: explosive economic growth, achieved by debottlenecking innovation from its dependence on human population growth (call it “transformative” for short).
In contrast, I don’t think (say) Metaculus’ “when AGI?” question uses quite the right definition in the “transformative” context. That’s fine in itself, since I read it as operationalising the sort of AGI we think about in e.g. sci-fi; but given its additional definitional constraints, collectives of AI systems could start radically transforming the economy well before any single robot-embodied, Metaculus-approved AGI arrives to resolve the question in the affirmative, which makes the 2030 prediction a bit of a red herring:
We will thus define “an AI system” as a single unified software system that can satisfy the following criteria, all completable by at least some humans.
Able to reliably pass a 2-hour, adversarial Turing test during which the participants can send text, images, and audio files (as is done in ordinary text messaging applications) during the course of their conversation. An ‘adversarial’ Turing test is one in which the human judges are instructed to ask interesting and difficult questions, designed to advantage human participants, and to successfully unmask the computer as an impostor. A single demonstration of an AI passing such a Turing test, or one that is sufficiently similar, will be sufficient for this condition, so long as the test is well-designed to the estimation of Metaculus Admins.
Has general robotic capabilities, of the type able to autonomously, when equipped with appropriate actuators and when given human-readable instructions, satisfactorily assemble a (or the equivalent of a) circa-2021 Ferrari 312 T4 1:8 scale automobile model. A single demonstration of this ability, or a sufficiently similar demonstration, will be considered sufficient.
High competency across diverse fields of expertise, as measured by achieving at least 75% accuracy in every task and 90% mean accuracy across all tasks in the Q&A dataset developed by Dan Hendrycks et al.
Able to get top-1 strict accuracy of at least 90.0% on interview-level problems found in the APPS benchmark introduced by Dan Hendrycks, Steven Basart et al. Top-1 accuracy is distinguished, as in the paper, from top-k accuracy in which k outputs from the model are generated, and the best output is selected.
By “unified” we mean that the system is integrated enough that it can, for example, explain its reasoning on a Q&A task, or verbally report its progress and identify objects during model assembly. (This is not really meant to be an additional capability of “introspection” so much as a provision that the system not simply be cobbled together as a set of sub-systems specialized to tasks like the above, but rather a single system applicable to many problems.)
I admittedly don’t follow the whole TAI discourse that closely (I just take an occasional interest), but it seems to lean more towards the latter sort of definition (maybe minus the robotics component of Metaculus’)?
(The rest of this is an irrelevant series of tangents)
First tangent: Bostrom’s “functional soup” passage from Superintelligence:
Goal-content integrity for final goals is in a sense even more fundamental than survival as a convergent instrumental motivation. Among humans, the opposite may seem to be the case, but that is because survival is usually part of our final goals. For software agents, which can easily switch bodies or create exact duplicates of themselves, preservation of self as a particular implementation or a particular physical object need not be an important instrumental value. Advanced software agents might also be able to swap memories, download skills, and radically modify their cognitive architecture and personalities. A population of such agents might operate more like a “functional soup” than a society composed of distinct semi-permanent persons.
And even more tangentially — this is far afield of my original question, but I’ve always wondered what other folks think of it — Charles Stross’ dystopian depiction in Accelerando (quoted below) of a possible civilisational endpoint seems to follow pretty naturally from explosive economic growth of the misaligned sort:
“How much for just the civilization?” asks the Slug.
Pierre looks down at it thoughtfully. It’s not really a terrestrial mollusk: Slugs on Earth aren’t two meters long and don’t have lacy white exoskeletons to hold their chocolate-colored flesh in shape. But then, it isn’t really the alien it appears to be. It’s a defaulting corporate instrument that has disguised itself as a long-extinct alien upload, in the hope that its creditors won’t recognize it if it looks like a randomly evolved sentient. … Actually, the Slug is a surgical disguise. Both it and the quasi-fungal ecosystem have been extinct for millions of years, existing only as cheap stage props in an interstellar medicine show run by rogue financial instruments. The Slug itself is one such self-aware scam, probably a pyramid scheme or even an entire compressed junk bond market in heavy recession, trying to hide from its creditors by masquerading as a life-form.
“Corporations are life-forms back home, too, aren’t they? And we trade them. We give our AIs corporations to make them legal entities, but the analogy goes deeper. Look at any company headquarters, fitted out with works of art and expensive furniture and staff bowing and scraping everywhere –”
” – They’re the new aristocracy. Right?”
“Wrong. When they take over, what you get is more like the new biosphere. Hell, the new primordial soup: prokaryotes, bacteria, and algae, mindlessly swarming, trading money for plasmids.” The Queen passes her consort a wineglass. When he drinks from it, it refills miraculously. “Basically, sufficiently complex resource-allocation algorithms reallocate scarce resources … and if you don’t jump to get out of their way, they’ll reallocate you. I think that’s what happened inside the Matrioshka brain we ended up in: Judging by the Slug it happens elsewhere, too. You’ve got to wonder where the builders of that structure came from. And where they went. And whether they realized that the destiny of intelligent tool-using life was to be a stepping-stone in the evolution of corporate instruments.”
“Maybe they tried to dismantle the companies before the companies spent them.” Pierre looks worried. “Running up a national debt, importing luxurious viewpoint extensions, munching exotic dreams. Once they plugged into the Net, a primitive Matrioshka civilization would be like, um.” He pauses. “Tribal. A primitive postsingularity civilization meeting the galactic net for the first time. Overawed. Wanting all the luxuries. Spending their capital, their human – or alien – capital, the meme machines that built them. Until there’s nothing left but a howling wilderness of corporate mechanisms looking for someone to own.”
(I think of this as Stross’ answer to the Fermi paradox: “they got disassembled by corporations, because the destiny of intelligent tool-using life was to be a stepping-stone in the evolution of corporate instruments”. Very disquieting the first time I read it. Ivan Vendrov’s Meditations on machinic desire gives me the same vibe.)
When I first read Hannu Rajaniemi’s Quantum Thief trilogy c. 2015 I had two reactions: delight that this was the most my-ingroup-targeted series I had ever read, and a sinking feeling that ~nobody else would really get it, not just the critics but likely also most fans, many of whom would round his carefully-chosen references off to technobabble. So I was overjoyed to recently find Gwern’s review of it, which Hannu affirms “perfectly nails the emotional core of the trilogy and, true to form, spots a number of easter eggs I thought no one would ever find”, in particular the first few passages:
Stylistically, QT is set relentlessly in medias res: neither we nor le Flambeur know why he is in prison, and little is explained thereafter. Hannu makes no concessions to the casual reader, as he mainlines straight into his veins the pre-deep-learning 2010-era transhumanist zeitgeist via Silicon Valley—if it was ever discussed in a late-night bull session after a Singularity University conference, it might pop up here. Hannu crams the novels with blink-and-you’ll-miss-it ideas on the level of Olaf Stapledon. A conventional Verne gun like Gerald Bull’s is too easy a way of getting to space—how about beating Project Orion by instead using a nuclear space gun (since emulated brains don’t care about high g acceleration)? Or for example, the All-Defector reveals that, since other universes could be rewriting their rules to expand at maximum speed, erasing other universes before they know it, he plans to rewrite our universe’s rule to do so first (ie. he will defect at the multiversal level against all other universes); whereas beginner-level SF like The Three Body Problem would dilate on this for half a book, Hannu’s grand reveal gets all of 2 paragraphs before crashing into the eucatastrophic ending.
For world-building, he drops neologisms left and right, and hard ones at that—few enough American readers will be familiar with the starting premise of “Arsène Lupin in spaaaace!” (probably more are familiar with the anime Lupin The Third these days), but his expectations go far beyond that: the ideal reader of the trilogy is not merely one familiar with the Prisoner’s Dilemma but also with the bizarre zero-determinant PD strategies discovered ~2008, and not just with such basic physics as quantum entanglement or applications like quantum dots, but exotic applications to quantum auctions & game theory (including Prisoner’s Dilemma) & pseudo-telepathy (yes, those are things), and it would definitely be helpful if that reader happened to also be familiar with Eliezer Yudkowsky’s c. 2000s writings on “Coherent Extrapolated Volition”, with a dash of Nikolai Fyodorovich Fyodorov’s Russian Cosmism for seasoning (although only a dash).
This leads to an irony: I noted while reading Masamune Shirow’s Ghost in the Shell cyberpunk manga that almost everything technical in the GitS manga turned out to be nonsense despite Shirow’s pretensions to in-depth research & meticulous attention to detail in his self-congratulatory author notes; while in QT, most technical things sound like cyberpunk nonsense (and Hannu doesn’t defend them), but are actually real and just so arcane you haven’t heard of them.
For example, some readers accuse Hannu of relying on FTL communication via quantum entanglement, which is bad physics; but Hannu does not! If they had read more closely (similar to the standard reader failure to understand the physics of “Story of Your Life”), they would have noticed that at no point is there communication faster-than-light, only coordination faster-than-light—‘spooky action at a distance’. He is instead employing advanced forms of quantum entanglement which enable things like secret auctions or coordinated strategies of game-playing. He explains briefly that the zoku use quantum entanglement in these ways, but a reader could easily miss that, given all the other things they are trying to understand and how common ‘quantum woo’ is.
The parts of the science I understand were all quite plausible (mind duplication/fractioning and motivations for doing so).
Beyond the accuracy of the science, this was one of the most staggeringly imaginative and beautifully written scifi books I’ve ever read. It’s for a very particular audience, but if you’re here you might be that audience. If you are, this might be the best book you’ve read.
Attention conservation notice: 3,000+ words of longform quotes by various folks on the nature of personal identity in a posthuman future, and hiveminds / clans
As an aside, one of the key themes running throughout the Quantum Thief trilogy is the question of how you might maintain personal identity (in the pragmatic security sense, not the philosophical one) in a future so posthuman that minds can be copied and forked indefinitely over time. To spoil Hannu’s answer:
… Jean & the Sobornost Founders & the zoku elders are all defined by what, at their core, they want. Anyone who wants the same thing is, for all (their) intents and purposes, the same person as them; because they want the same unchanging things, they can be trusted as the original. The ‘Founder codes’, and Jean’s final password to unlock his sealed memories, are all memories of what defines their wants: the Founder Sumanguru wants blood & fire & electricity & screaming children, and enemies to destroy; the Founder Chen recall the trauma of livestreaming their father’s assassination, remaining eternally resolved that the last enemy that shall be defeated is death; while seared into the minds of the Founder Joséphine Pellegrinis is the final thought of their founder, her desperate dying wish that her lover Jean le Flambeur someday return to her… (And the zoku elders want to empower their zoku clans.)
But even personal identity frays under the power of time: given freedom to change, sooner or later, like the Ship of Theseus, the mind which sets out is not the mind which arrives. So the price of immortality must be that one cannot change: one is condemned to want the same things, forever. (“There is no prison, except in your mind.”) Joséphine Pellegrini cannot stop seeking after her lost Jean—nor can Jean stop his thieving nor trying to escape her, because le Flambeur, what does Jean le Flambeur remember?
I take Anders Sandberg’s answer to be on the other end of this spectrum; he doesn’t mind changing over time such that he might end up wanting different things:
Anders Sandberg: I think one underappreciated thing is that if we can survive for a very long time individually, we need to reorganise our minds and memories in interesting ways. There is a kind of standard argument you sometimes hear if you’re a transhumanist — like I am — that talks about life extension, where somebody cleverly points out that you would change across your lifetime. If it’s long enough, you will change into a different person. So actually you don’t get an indefinitely extended life; you just get a very long life thread. I think this is actually an interesting objection, but I’m fine with turning into a different future person. Anders Prime might have developed from Anders in an appropriate way — we all endorse every step along the way — and the fact that Anders Prime now is a very different person is fine. And then Anders Prime turns into Anders Biss and so on — a long sequence along a long thread.
(I have mixed feelings about Anders’ take: I have myself changed so profoundly since youth that my younger self would not just disendorse but be horrified by the person I am now, yet I did endorse every step along the way, and current-me still does upon reflection (but of course I do). Would current-me also endorse a similar degree of change going forward, even subject to every step being endorsed by the me right before change? Most likely not, perhaps excepting changes towards some sort of reflective equilibrium.)
I interpret Holden Karnofsky’s take to be somewhere in between, perhaps closer to Hannu’s answer. Holden remarked that he doesn’t find most paradoxical thought experiments about personal identity (e.g. “Would a duplicate of you be ‘you’?” or “If you got physically destroyed and replaced with an exact duplicate of yourself, did you die?”) all that confounding because his personal philosophy on “what counts as death” dissolves them, and that his philosophy is simple, comprising just 2 aspects: constant replacement (“in an important sense, I stop existing and am replaced by a new person each moment”) and kinship with future selves. Elaborating on the latter:
My future self is a different person from me, but he has an awful lot in common with me: personality, relationships, ongoing projects, and more. Things like my relationships and projects are most of what give my current moment meaning, so it’s very important to me whether my future selves are around to continue them.
So although my future self is a different person, I care about him a lot, for the same sorts of reasons I care about friends and loved ones (and their future selves).
If I were to “die” in the common-usage (e.g., medical) sense, that would be bad for all those future selves that I care about a lot.
…
[One of the pros of this view]
It seems good that when I think about questions like “Would situation __ count as dying?”, I don’t have to give answers that are dependent on stuff like how fast the atoms in my body turn over—stuff I have basically never thought about and that doesn’t feel deeply relevant to what I care about. Instead, when I think about whether I’d be comfortable with something like teleportation, I find myself thinking about things I actually do care about, like my life projects and relationships, and the future interactions between me and the world.
Rob Wiblin: … one of the non-AI blog posts you’ve written, which I really enjoyed reading this week when I was prepping for the conversation, is called Characterising utopia. … Some of the shifts that you envisaged wouldn’t be super surprising. Like we could reduce the amount that people experience physical pain, and we could make people be a lot more energetic and a lot more cheerful. But you had a section called “Contentious changes.” What are some of the contentious changes, or possible changes, that you envisage in a utopia?
Richard Ngo: One of the contentious changes here is to do with individualism, and how much more of it or less of it we have in the future than we have today. Because we’ve been on this trend towards much more individualistic societies, where there are fewer constraints on what people do that are externally imposed by society.
I could see this trend continuing, but I could also see it going in the opposite direction. Maybe, for example, in a digital future, we’ll be able to make many copies of ourselves, and so this whole concept of my “personal identity” starts to shift a little bit and maybe I start to think of myself as not just one individual, but a whole group of individuals or this larger entity. And in general, it feels like being part of a larger entity is really meaningful to people and really shapes a lot of people’s lives, whether that’s religion, whether that’s communities, families, things like that.
The problem historically has just been that you don’t get to choose it — you just have to get pushed into this entity that maybe isn’t looking out for your best interests. So it feels interesting to me to wonder if we can in fact design these larger entities or larger superorganisms that are really actually good for the individuals inside, as well as providing this more cohesive structure for them. Is that actually something we want? Would I be willing to lose my individuality if I were part of this group of people who were, for example, reading each other’s minds or just having much less privacy than we have today, if that was set up in such a way that I found it really fulfilling and satisfying?
I really don’t know at all, but it seems like the type of question that is really intriguing and provides a lot of scope for thinking about how technology could just change the ways in which we want to interact with each other.
Rob Wiblin: I’m so inculcated into the individualist culture that the idea slightly makes my skin crawl thinking about any of this stuff. But I think if you tried to look objectively at what has caused human wellbeing throughout history, then it does seem like a somewhat less individualistic culture, where people have deeper ties and commitments to one another, maybe that is totally fine — and I’ve just drunk the Kool-Aid thinking that being an atomised individual is so great.
Richard Ngo: If you know the book, The WEIRDest People in the World, which describes the trend towards individualism and weaker societal ties, I think the people in our circles are the WEIRDest people of the WEIRDest people in the world — where “WEIRD” here is an acronym meaning “Western, educated, industrialised, rich, and democratic,” not just “weird.” So we are the WEIRDest people of the WEIRDest countries. And then you’re not a bad candidate for the WEIRDest person in the WEIRDest community in the WEIRDest countries that we currently have, Rob. So I’m not really too surprised by that.
(I thought it was both interesting and predictable that Rob would find the idea discomfiting; coming from a non-WEIRD culture, I found Richard’s idea immediately attractive and aesthetically “right”.)
Richard gives a fictional example of what this might look like from a first-person perspective in his recent short story The Gentle Romance—if you’re reading this Richard, do let me know if you want this removed:
As ze reconnects more deeply with zir community, that oceanic sense of oneness arises more often. Some of zir friends submerge themselves into a constant group flow state, rarely coming out. Each of them retains their individual identity, but the flows of information between them increase massively, allowing them to think as a single hivemind. Ze remains hesitant, though. The parts of zir that always wanted to be exceptional see the hivemind as a surrender to conformity. But what did ze want to be exceptional for? Reflecting, ze realizes that zir underlying goal all along was to be special enough to find somewhere ze could belong. The hivemind allows zir to experience that directly, and so ze spends more and more time within it, enveloped in the warm blanket of a community as close-knit as zir own mind.
Outside zir hivemind, billions of people choose to stay in their physical bodies, or to upload while remaining individuals. But over time, more and more decide to join hiveminds of various kinds, which continue to expand and multiply. By the time humanity decides to colonize the stars, the solar system is dotted with millions of hiveminds. A call goes out for those willing to fork themselves and join the colonization wave. This will be very different from anything they’ve experienced before — the new society will be designed from the ground up to accommodate virtual humans. There will be so many channels for information to flow so fluidly between them that each colony will essentially be a single organism composed of a billion minds.
Ze remembers loving the idea of conquering the stars — and though ze is a very different person now, ze still feels nostalgic for that old dream. So ze argues in favor when the hivemind debates whether to prioritize the excitement of exploration over the peacefulness of stability. It’s a more difficult decision than any the hivemind has ever faced, and no single satisfactory resolution emerges. So for the first time in its history, the hivemind temporarily fractures itself, giving each of its original members a chance to decide on an individual basis whether they’ll go or stay.
I think Richard’s notion of ‘hivemind’ is a cousin of Robin Hanson’s ‘clan’ from Age of Em (although, unlike Richard’s lovely story, Hanson’s depiction of an em-filled future, while far more granular, comprehensive and first-principles-based, has never stopped seeming dystopian to me, despite Hanson’s protestation that “[readers repelled by aspects of the em era should] try hard to see this world from its residents’ point of view, before disowning these their plausible descendants”):
The set of all em copies of the same original human constitutes a “clan.” Most wages go to the 1000 most productive clans, who are each known by one name, like “John,” who know each other very well, and who discriminate against less common clans. Compared with people today, ems are about as elite as billionaires, heads of state, and Olympic gold medalists. The em world is more competitive than ours in more quickly eliminating less productive entities and practices. This encourages more job punishment, less product variety and identity enhancement, and more simple functionality. Because they are more productive, ems tend to be married, religious, smart, gritty, mindful, extraverted, conscientious, agreeable, non-neurotic, and morning larks.
Many myths circulate about factors that increase economic growth rates. For example, the fact that ems can run faster than humans should not much increase growth. Even so, the em economy grows faster than does ours because of stronger competition, computers mattering more, and especially because factories can make labor as fast as non-labor capital. An em economy doubling time estimate of a few weeks comes from the time for factories to duplicate their mass today, and from the historical trend in growth rates. In response, capital becomes less durable, and one-time-use products become more attractive. Clans become a unit of finance, private firms and hostile takeovers get more support, and asset prices more closely approximate the predictions derived from strong financial competition.
Ems trust their clans more than we trust families or identical twins. So clans are units of finance, liability, politics, labor negotiations, and consumer purchasing. To promote unity, clans avoid members arguing or competing. Em firms are larger, better managed, put more effort into coordination, have more specific job roles, focus more on costs relative to novelty, and have higher market shares and lower markups. Clan reputations and clans buying into firms promotes clan-firm trust, which supports locating employees at firms, using distinctive work styles, and focusing more on being useful instead of gaming firm evaluation systems. Em work teams tend to have similar social-category features like age but a diversity of information sources and thinking styles. In mass-labor markets, ems are created together, end or retire together, almost never break up, and mostly socialize internally. In niche-labor markets, associates coordinate less regarding when they are created or retire.
Faster ems have many features that mark them as higher status, and the clumping of speeds creates a class system of distinct status levels. Strong central rulers are more feasible for ems, as leaders can run faster, put spurs in high-trust roles, and use safes to reassure wary citizens. Decision markets can help advise key government decisions, while combinatorial auctions can help to make complex interdependent allocations. The em world selects for personalities good at governing that same personality. Competitive clans and cities may commit to governing via decision markets that promote profit or long-term influence. One em one vote works badly, but speed-weighted voting seems feasible, although it requires intrusive monitoring. Shifting coalitions of em clans may dominate the politics of em firms and cities, inducing high costs of lobbying and change. Ems may try many policies to limit such clan coalition politics.
As ems don’t need sex to reproduce, sex is left more to individual choice, and may be suppressed as in eunuchs. But demand for sex and romantic pair-bonding likely persists, as do many familiar gendered behavioral patterns. A modestly unequal demand for male versus female workers can be accommodated via pairs whose partners run at different speeds, or who use different ratios of spurs to other workers. Ems have spectacularly good looks in virtual reality, and are very accomplished. Open-source em lovers give all ems an attractive lower bound on relation quality. Clan experience helps ems guess who are good receptive matches. Having only one em from each clan in each social setting avoids complicating relations.
Ems show off their abilities and loyalties, although less than we do because ems are poorer and better-known to each other. Because speed is easy to pay for, ems show off more via clever than fast speech. Celebrities matter less to ems, and it is easy to meet with a celebrity, but hard to get them to remember you. Clans coordinate to jointly signal shared features like intelligence, drive, and fame. Clans fund young ems to do impressive things, about which many older copies can brag. Innovation may matter less for em intellectuals. Mind-theft inspires great moral outrage and charity efforts. Secure in identifying with their clan, most ems focus personal energy more on identifying with their particular job, team, and associates. It isn’t clear if em identity degrades continuously or discretely as copies get more different. Copy-events are identity-defining, and newly copied teams quickly create distinct team cultures.
Ems are likely to reverse our recent trend away from religion and overt rituals, perhaps via more group singing. Traditional religions can continue, but need doctrinal clarifications on death and sins of copies. Like high stress workers today, em work groups pushed to their limits swear, insult, and tease. Ems deal with a wider range of mind opacity and transparency, allowing mind reading within teams, but manipulating expressions to hide from outsiders. Clans can offer members life-coaching via voices in their heads, using statistics from similar copies, but teams may create unique cultures which limit the usefulness of that. Avoiding direct meetings helps clans bond better. Em relations are often in the context of similar relations between copies. At work, ems try more to make relations similar, to gain from learning and scale economics. But friends keep relations more different, to emphasize loyalty and natural feelings.
Em culture emphasizes industriousness, work and long-term orientations, and low context attitudes toward rules and communication. Being poorer, ems tend towards farmer/conservative values, relative to forager/liberal values. So ems more value honor, order, hierarchy, religion, work, and less value sharing, consensus, travel, leisure, and variety. Sex attitudes stay more forager-like, however. Ems are divided like we are by geographic region, young versus old, male versus female, rich versus poor, and city center versus periphery. Ems also divide by varying speeds, physical versus virtual work, remembering the human era versus not, and large versus small clans. Ems travel to visit or swap with other copies of themselves. An exotic travel destination is other speed cultures. Like us, ems tell stories of conflict and norm violations, set in ancestral situations. Stories serve as marketing, with many characters coming from well-known clans. Em stories have less death and fast-action.
The short story The Epiphany of Gliese 581 by Fernando Borretti has something of the same vibe as Rajaniemi’s QT trilogy; Borretti describes it as inspired by Orion’s Arm and the works of David Zindell. Here’s a passage describing a flourishing star system already transformed by weakly posthuman tech:
The world outside Susa was a lenticular cloud of millions of lights, a galaxy in miniature, each a world unto itself. There were clusters of green lights that were comets overgrown with vacuum trees, and plant and animal and human life no Linnaeus would recognize. There were points of dull red light, the reversible computers where bodyless people lived. And there were arcs of blue that were ring habitats: ribbons tied end-to-end, holding concave ocean, and the oceans held continents, islands, mountain ranges, rivers, forests and buried ruins, endless forms of life, cities made of glass, paradise regained. All this had been inanimate dust and cratered wasteland, which human hands had made into an oasis in the sky, where quadrillions live who will never die.
The posthumans who live there called it Ctesiphon. And at times they call it paradise, after the Persian word for garden.
And at the center of the oasis there was a star that travelled backwards across the H-R diagram: already one one-hundredth of it had been whittled away; made into a necklace of artificial gas giants in preparation for the end of time; or sent through reactors where disembodied chemists made protons into carbon, oxygen, lithium and sodium, the vital construction material. And in time nothing would be left but a dim red ember encircled by cryojovian fuel depots. And the habitats would be illuminated by electric diodes.
Another star system, this time still being transformed:
Wepwawet was a dull red star, ringed by water droplets the size of mountains, where some two hundred billion people lived who breathed water. There was a planet made of stone shrouded in steam, and a train of comets, aimed by human hands from beyond the frostline, delivered constant injections of water. When the vapour condensed there would be ocean, and the shapers would get to work on the continents. Other Earths like this had been cast, like seeds, across the entire breadth of the cosmos.
The system was underpopulated: resources were abundant and people were few, and they could bask in the sun and, for a time, ignore the prophecies of Malthus, whose successors know in time there won’t be suns.
This was the first any of them had seen of nature. Not the landscaped, continent-sized gardens of Ctesiphon, where every stone had been set purposefully and after an aesthetic standard, but nature before human hands had redeemed it: an endless, sterile wasteland. The sight of scalding, airless rocks disturbed them.
One subsubgenre of writing I like is the stress-testing of a field’s cutting-edge methods by applying them to another field, and seeing how much knowledge and insight the methods recapitulate and also what else we learn from the exercise. Sometimes this takes the form of parables, like Scott Alexander’s story of the benevolent aliens trying to understand Earth’s global economy from orbit and intervening with crude methods (like materialising a billion barrels of oil on the White House lawn to solve a recession hypothesised to be caused by an oil shortage) to intuition-pump the current state of psychiatry and the frame of thinking of human minds as dynamical systems. Sometimes they’re papers, like Eric Jonas and Konrad P. Kording’s Could a Neuroscientist Understand a Microprocessor? (they conclude that no, regardless of the amount of data, “current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems” — “the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor”). Unfortunately I don’t know of any other good examples.
I enjoyed these passages from Henrik Karlsson’s essay Cultivating a state of mind where new ideas are born on the introspections of Alexander Grothendieck, arguably the deepest mathematical thinker of the 20th century.
In June 1983, Alexander Grothendieck sits down to write the preface to a mathematical manuscript called Pursuing Stacks. He is concerned by what he sees as a tacit disdain for the more “feminine side” of mathematics (which is related to what I’m calling the solitary creative state) in favor of the “hammer and chisel” of the finished theorem. By elevating the finished theorems, he feels that mathematics has been flattened: people only learn how to do the mechanical work of hammering out proofs, they do not know how to enter the dreamlike states where truly original mathematics arises. To counteract this, Grothendieck in the 1980s has decided to write in a new way, detailing how the “work is carried day after day [. . .] including all the mistakes and mess-ups, the frequent look-backs as well as the sudden leaps forward”, as well as “the early steps [. . .] while still on the lookout for [. . .] initial ideas and intuitions—the latter of which often prove to be elusive and escaping the meshes of language.”
This was how he had written Pursuing Stacks, the manuscript at hand, and it was the method he meant to employ in the preface as well. Except here he would be probing not a theorem but his psychology and the very nature of the creative act. He would sit with his mind, observing it as he wrote, until he had been able to put in words what he meant to say. It took him 29 months.
When the preface, known as Récoltes et Semailles, was finished, in October 1986, it numbered, in some accounts, more than 2000 pages. It is an unnerving piece of writing, seething with pain, curling with insanity at the edges—Grothendieck is convinced that the mathematical community is morally degraded and intent on burying his work, and aligns himself with a series of saints (and the mathematician Riemann) whom he calls les mutants. One of his colleagues, who received a copy over mail, noticed that Grothendieck had written with such force that the letters at times punched holes through the pages. Despite this unhinged quality, or rather because of it, Récoltes et Semailles is a profound portrait of the creative act and the conditions that enable our ability to reach out toward the unknown. (Extracts from it can be read in unauthorized English translations, here and here.)
On the capacity to be alone as necessary prerequisite to doing groundbreaking work:
An important part of the notes has Grothendieck meditating on how he first established contact with the cognitive space needed to do groundbreaking work. This happened in his late teens. It was, he writes, this profound contact with himself which he established between 17 and 20 that later set him apart—he was not as strong a mathematician as his peers when he came to Paris at 20, in 1947. That wasn’t the key to his ability to do great work.
I admired the facility with which [my fellow students] picked up, as if at play, new ideas, juggling them as if familiar with them from the cradle—while for myself I felt clumsy, even oafish, wandering painfully up an arduous track, like a dumb ox faced with an amorphous mountain of things that I had to learn (so I was assured), things I felt incapable of understanding[.] …
In fact, most of these comrades who I gauged to be more brilliant than I have gone on to become distinguished mathematicians. Still, from the perspective of 30 or 35 years, I can state that their imprint upon the mathematics of our time has not been very profound. They’ve all done things, often beautiful things, in a context that was already set out before them, which they had no inclination to disturb. Without being aware of it, they’ve remained prisoners of those invisible and despotic circles which delimit the universe of a certain milieu in a given era. To have broken these bounds they would have had to rediscover in themselves that capability which was their birth-right, as it was mine: the capacity to be alone.
The capacity to be alone. This was what Grothendieck had developed. In the camp during the war, a fellow prisoner named Maria had taught him that a circle can be defined as all points that are equally far from a point. This clear abstraction attracted him immensely. After the war, having only a limited understanding of high school mathematics, Grothendieck ended up at the University of Montpellier, which was not an important center for mathematics. The teachers disappointed him, as did the textbooks: they couldn’t even provide a decent definition of what they meant when they said length! Instead of attending lectures, he spent the years from 17 to 20 catching up on high school mathematics and working out proper definitions of concepts like arc length and volume. Had he been in a good mathematical institution, he would have known that the problems he was working on had already been solved 30 years earlier. Being isolated from mentors, he instead painstakingly reinvented parts of what is known as measure theory and the Lebesgue integral.
A few years after I finally established contact with the world of mathematics at Paris, I learned, among other things, that the work I’d done in my little niche [. . . had] been long known to the whole world [. . .]. In the eyes of my mentors, to whom I’d described this work, and even showed them the manuscript, I’d simply “wasted my time”, merely doing over again something that was “already known”. But I don’t recall feeling any sense of disappointment. [. . .]
(I think that last sentence resonates with me in a way that I don’t think it does for most science & math folks I know, for whom discovery (as opposed to rediscovery) takes precedence emotionally.)
This experience is common in the childhoods of people who go on to do great work, as I have written elsewhere. Nearly everyone who does great work has some episode of early solitary work. As the philosopher Bertrand Russell remarked, the development of gifted and creative individuals, such as Newton or Whitehead, seems to require a period in which there is little or no pressure for conformity, a time in which they can develop and pursue their interests no matter how unusual or bizarre. In so doing, there is often an element of reinventing the already known. Einstein reinvented parts of statistical physics. Pascal, self-teaching mathematics because his father did not approve, rederived several Euclidean proofs. There is also a lot of confusion and pursuit of dead ends. Newton looking for numerical patterns in the Bible, for instance. This might look wasteful if you think what they are doing is research. But it is not if you realize that they are building up their ability to perceive the evolution of their own thought, their capacity for attention.
On the willingness to linger in confusion, and the primacy of good question generation over answering them:
One thing that sets these intensely creative individuals apart, as far as I can tell, is that when sitting with their thoughts they are uncommonly willing to linger in confusion. To be curious about that which confuses. Not too rapidly seeking the safety of knowing or the safety of a legible question, but waiting for a more powerful and subtle question to arise from loose and open attention. This patience with confusion makes them good at surfacing new questions. It is this capacity to surface questions that set Grothendieck apart, more so than his capacity to answer them. When he writes that his peers were more brilliant than him, he is referring to their ability to answer questions. It was just that their questions were unoriginal. As Paul Graham observes:
People show much more originality in solving problems than in deciding which problems to solve. Even the smartest can be surprisingly conservative when deciding what to work on. People who’d never dream of being fashionable in any other way get sucked into working on fashionable problems.
Grothendieck had a talent to notice (and admit!) that he was subtly bewildered and intrigued by things that for others seemed self-evident (what is length?) or already settled (the Lebesgue integral) or downright bizarre (as were many of his meditations on God and dreams). From this arose some truly astonishing questions, surfacing powerful ideas, such as topoi, schemes, and K-theory.
On working with others without losing yourself:
After his three years of solitary work, Grothendieck did integrate into the world of mathematics. He learned the tools of the trade, he got up to date on the latest mathematical findings, he found mentors and collaborators—but he was doing that from within his framework. His peers, who had been raised within the system, had not developed this feel for themselves and so were more susceptible to the influence of others. Grothendieck knew what he found interesting and productively confusing because he had spent three years observing his thought and tracing where it wanted to go. He was not at the mercy of the social world he entered; rather, he “used” it to “further his aims.” (I put things in quotation marks here because what he’s doing isn’t exactly this deliberate.) He picked mentors that were aligned with his goals, and peers that unblocked his particular genius.
I do not remember a single occasion when I was treated with condescension by one of these men, nor an occasion when my thirst for knowledge, and later, anew, my joy of discovery, was rejected by complacency or by disdain. Had it not been so, I would not have “become a mathematician” as they say—I would have chosen another profession, where I could give my whole strength without having to face scorn. [My emphasis.]
He could interface with the mathematical community with integrity because he had a deep familiarity with his inner space. If he had not known the shape of his interests and aims, he would have been more vulnerable to the standards and norms of the community—at least he seems to think so.
In the eyes of my mentors, to whom I’d described this work, and even showed them the manuscript, I’d simply “wasted my time”, merely doing over again something that was “already known”. But I don’t recall feeling any sense of disappointment.
A few days ago, I was thinking about matrices and determinants. I noticed that I know the formula for the determinant, but I still lack the feeling of what the determinant is. I played with that thought for some time, and then it occurred to me, that if you imagine the rows in the matrix as vectors in n-dimensional space, then the determinant of that matrix is the volume of the n-dimensional body whose edges are those vectors.
And suddenly it all made a fucking sense. The determinant is zero when the vectors are linearly dependent? Of course, that means that the n-dimensional body has been flattened into n-1 dimensions (or less), and therefore its volume is zero. The determinant doesn’t change if you add a multiple of a row to some other row? Of course, that means moving the “top” of the n-dimensional body in a direction parallel to the “bottom”, so that neither the bottom nor the height changes; of course the volume (defined as the area of the bottom multiplied by the height) stays the same. What about the determinant being negative? Oh, that just means whether the edges are “clockwise” or “counter-clockwise” in the n-dimensional space. It all makes perfect sense!
Then I checked Wikipedia… and yeah, it was already there. So much for my Nobel prize.
But it still felt fucking good. (And if I am not too lazy, one day I may write a blog article about it.)
Reinventing the wheel is not a waste of time. I will probably remember this forever, and the words “determinant of the matrix” will never feel the same. Who knows, maybe this will help me figure out something else later. And if I keep doing that, hypothetically speaking, some of those discoveries might even be original.
(The practical problem is that none of this can pay my bills.)
I kind of envy that you figured this out yourself — I learned the parallelepiped hypervolume interpretation of the determinant from browsing forums (probably this MSE question’s responses). Also, please do write that blog article.
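For anyone else following along, that interpretation is easy to poke at numerically. Here’s a minimal numpy sketch (mine, purely illustrative) of the three properties worked out above: shears preserve the volume, linear dependence flattens it to zero, and swapping rows flips the sign via orientation.

```python
import numpy as np

# Rows of A are the edge vectors of a parallelepiped in 3D.
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [1.0, 1.0, 4.0]])
print(np.linalg.det(A))   # ~24.0: the signed volume spanned by the rows

# Adding a multiple of one row to another shears the body parallel to its base,
# so the volume (and the determinant) is unchanged.
B = A.copy()
B[2] += 5 * B[0]
print(np.linalg.det(B))   # still ~24.0

# Linearly dependent rows flatten the body into a lower dimension: zero volume.
C = A.copy()
C[2] = C[0] + C[1]
print(np.linalg.det(C))   # ~0.0

# Swapping two rows flips the orientation, which shows up as the sign.
D = A[[1, 0, 2]]
print(np.linalg.det(D))   # ~-24.0
```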
And if I keep doing that, hypothetically speaking, some of those discoveries might even be original.
Yeah, I hope you will! I’m reminded of what Scott Aaronson said recently:
When I was a kid, I too started by rediscovering things (like the integral for the length of a curve) that were centuries old, then rediscovering things (like an efficient algorithm for isotonic regression) that were decades old, then rediscovering things (like BQP⊆PP) that were about a year old … until I finally started discovering things (like the collision lower bound) that were zero years old. This is the way.
Just stumbled upon a spreadsheet ranking “well-known” googological numbers created by redditor u/Laxxius1 as a passion project. I’ve been idly interested in googology since childhood, so perusing this list felt like getting the Christmas present I’ve always wanted but kept forgetting to ask for. (It’s also just intrinsically fun if your mathematical aesthetic is wrestler, which I’m guessing is overrepresented among LWers.)
Yudkowsky’s number, which Eliezer created back in 2007 to win this contest on the xkcd forum, features at #12. (Eliezer also wrote Why isn’t googology a recognized field of math? whose sentiment I predictably agree with.) Graham’s number is all the way down at #98; TREE(3) is far larger at #49; 3^^^3 (as in torture vs dust specks) is at #145 even though it’s already an exponential tower of 3s that’s 7,625,597,484,987 layers tall. The “most finite damage with 3 MTG cards” ranks #134 at 2^^2^^7, which makes me wonder whether there are other popular games where you can easily reach very large numbers. The largest code golf-related number in the spreadsheet is loader.c at #18, Ralph Loader’s winning entry in the Bignum Bakeoff contest (“write a C program in 512 characters or less that generates the largest possible output on a theoretical machine with infinite memory”), although there are a couple of numbers purporting to be larger on code golf SE generated by ultra-short programs written in JavaScript, Python, Ruby, and binary lambda calculus.
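If you want to sanity-check that tower-height claim, here’s a minimal Python sketch (mine) of Knuth’s up-arrow notation: 3↑↑3 already evaluates to 7,625,597,484,987, and 3↑↑↑3 = 3↑↑(3↑↑3) is a power tower of 3s that many levels tall.

```python
def arrow(a, n, b):
    """Knuth's up-arrow notation: arrow(a, 1, b) = a**b, arrow(a, n, b) = a "up-arrow n times" b."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return arrow(a, n - 1, arrow(a, n, b - 1))

print(arrow(3, 1, 3))  # 27
print(arrow(3, 2, 3))  # 3^^3 = 3^(3^3) = 7625597484987
# arrow(3, 3, 3) would be 3^^^3 = 3^^(3^^3): a power tower of 3s
# 7,625,597,484,987 levels tall. Don't actually call it.
```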
Rayo’s number and LNGN are defined in recursive theories. What if we use a meta-theory to create a non-recursive theory more expressive than any recursive theory, and then use that theory to create a large number? …
I don’t really like the number N. Modern mathematics isn’t made to create large numbers, so what you need to do to create a large number can be very weird. N (and probably also other extensions of Rayo’s number) is mostly a product of reverse-engineering the current philosophy of natural numbers and then to create something that barely resembles a number. So I decided to name this number to something else I don’t like: my current username. … So this number is now called ‘DaVinci’.
The #2-ranked number is LNGN, short for the quirkily-named Large Number Garden Number, which also attempts to diagonalize over set theory. It’s for now still considered by the googology community to be the largest well-defined googologism that’s not a salad number, as it hasn’t yet been proved to be ill-defined, and LNGN’s creator P進大好きbot disputes DaVinci’s superiority claim. I’m mentioning it here because of its quirky etymology, translated from the original Japanese:
Come on, friends, the large number garden is finally complete!
Let me explain the function of this garden. The first is the determination function of the address and the floor plan. When a character string is read, it automatically determines which miniature garden address it represents and in which miniature garden the floor plan of a large number garden can be reproduced.
The second is the floor plan analysis function. If you specify the address of the miniature garden and read the floor plan of the reproducible large number garden there, it will tell you the large number that the garden can produce.
The third important function is the ability to generate large numbers. Once a natural number is entered, all character strings within the upper limit of the number of characters are searched, and each is read into the address and floor plan determination function, leaving only the reproducible floor plan for each miniature garden. By enumerating them and loading them into the analysis function of the floor plan, you can obtain the large numbers that they can produce, and by putting them all together, you can create new large numbers!
Huh? Can you really get a large number with that? As usual, my ally is skeptical. But hey, here’s the floor plan for the large number garden itself. If you load this into the analysis function, it will tell you how large numbers you can generate. Huh? How many characters does this floor plan have? What’s the use of knowing such things?
Scott Aaronson wrote that “the Busy Beaver game seems like about as good a yardstick as any for gauging humanity’s progress against the uncomputable”. I think of googology the same way, just lower-status than BusyBeaverology.
I wonder to what extent these impressions by Some Guy over at Extelligence are shared by others:
Some of this may be biased by my own background, but basically I’ve always found mathematicians and physicists to be the most humble and approachable people in all of the sciences. Second, are chemists and material scientists who can be somewhat cold but are always honest and straightforward because they have nothing to prove. But biologists? Man. The descriptive sciences have a chip on their shoulders and while social sciences are usually full of people who make up flowery language to cover up for that, biology is close enough to the harder sciences it has a chip on its shoulder. Once you move away from the necessary honesty of mathematical and atomic mechanism people can become savage assholes. The rudest people I have ever met in my life were biologists.
So, there are my biases laid out on the table. Scientists who aren’t very good at math tend to be dicks because they’re self-conscious about it.
(I can think of plenty of personal counterexamples.)
Kishore Mahbubani, Singaporean diplomat and former president of the UN Security Council, studied philosophy full-time as an undergraduate in the late 60s. Recounting that period in his autobiography Living the Asian Century, he wrote:
For the final examinations, which I took at the end of my fourth year, our degree was determined by how well we did in eight three-hour examinations. In one of the papers, we had to answer a single question. The one question I chose to answer over three hours was “Can a stone feel pain?”
From my exam results, I gained a first-class honours degree, which was rare in the Department of Philosophy. Since our final examination papers were also sent to Peter Winch, one of the leading scholars on Wittgenstein in the world, I felt honoured that my first-class honours had been endorsed by him.
Wittgenstein was Mahbubani’s favorite philosopher; back then, “like all other philosophy departments in the Anglo-Saxon world, our department had been captured by the linguistic-analytic school of philosophy that Wittgenstein had launched with his Philosophical Investigations”.
At risk of revealing possible narrow-mindedness, a three-hour free response exam to the question “Can a stone feel pain?” makes me think of Luke’s philosophy: a diseased discipline. The questions Richard Ngo answered in his All Souls Fellowship exam got wacky at times, but never “can a stone feel pain?”-wacky.
Mahbubani continued:
… I could write eight pages over three hours in response to the question “Can a stone feel pain?” because Wittgenstein’s ideas allowed me to deconstruct the meanings of the words in this apparently simple question.
The process of focusing on the language we use came in very handy when I joined the Ministry of Foreign Affairs (MFA) in April 1971 and embarked on my long career in the study of geopolitics. Our understanding of “objective reality” is clearly conditioned by the language we use. The first major war that I had to analyse as a Foreign Service officer was the Vietnam War. The “facts” were clear: soldiers from North Vietnam were fighting soldiers from the United States. We could see this. But what were they fighting about? The US leaders, Johnson and Nixon, had no doubt: they were fighting against a global push by the Soviet Union and China to expand communism. But the North Vietnamese soldiers also had no doubt: they were fighting for “national liberation” from the “imperialist” US forces. So who was right? What is the truth here? Adding to the elusiveness of an absolute “truth” is the fact that fifty years after the United States withdrew ignominiously from Vietnam, one of the best friends of the United States in Southeast Asia will be the Communist Party of North Vietnam—the United States wants to upgrade its ties with Vietnam to a strategic partnership.
I find myself completely unpersuaded by his applied example here, but I suppose I’m just the wrong audience...
Why doesn’t Applied Divinity Studies’ The Repugnant Conclusion Isn’t dissolve the argumentative force of the repugnant conclusion?
But read again more carefully: “There is nothing bad in each of these lives”.
Although it sounds mundane, I contend that this is nearly incomprehensible. Can you actually imagine what it would be like to never have anything bad happen to you? We don’t describe such a life as mediocre, we describe it as “charmed” or “overwhelmingly privileged”. …
… consider Parfit’s vision of World Z both seriously and literally.
These are lives with no pain, no loneliness or depression, no loss or fear, no anxiety, no aging, no disease, nor decay. Not ever a single moment of sorrow. These are lives free entirely from every minor ache and cramp, from desire, from jealousy, from greed, and from every other sin that poisons the heart. Free from the million ills that plague and poke at ordinary people.
It is thus less the world of peasants, and closer to that of subdued paradise. The closest analog we can imagine is perhaps a Buddhist sanctuary, each member so permanently, universally and profoundly enlightened that they no longer experience suffering of any kind.
And that’s not all! Parfit further tells us that their lives are net positive. And so in addition to never experiencing any unpleasantness of any degree, they also experience simple pleasures. A “little happiness”, small nearly to the point of nothingness, yet enough to tip the scales. Perhaps the warmth of basking under a beam of sun, the gentle nourishment of simple meals, or just the low-level background satisfaction of a slow Sunday morning.
Properly construed, that is the world Parfit would have us imagine. Not a mediocre world of “muzak and potatoes”, but a kind of tranquil nirvana beyond pain. And that is a world I have no problem endorsing.
First, this is not the phrase I associate with the repugnant conclusion. “Net positive” does not mean “there is nothing bad in each of these lives”.
Second, I do think a key phrase & motivating description is “all they have is muzak and potatoes”. That is all they have. I like our world where people can be and do great things. I won’t describe it in poetic terms, since I don’t think that makes good moral philosophy. If you do want something more poetic, idk read Terra Ignota or The Odyssey. Probably Terra Ignota moreso than The Odyssey.
I will say that I like doing fun things, and I think many other people like doing fun things, and though my life may be net positive sitting around in a buddhist temple all day, I would likely take a 1-in-a-million chance of death to do awesome stuff instead. And so, I think, would many others.
And we could all make a deal, we draw straws, and those 1-in-a-million who draw short give the rest their resources and are put on ice until we figure out a way to get enough resources so they could do what they love. Or, if that’s infeasible (and in most framings of the problem it seems to be), willfully die.
I mean, if nothing else, you can just gather all those who love extreme sports (which will be a non-trivial fraction of the population), and ask them to draw straws & re-consolidate the relevant resources to the winners. Their revealed preference would say “hell yes!” (we can tell, given the much lower stakes & much higher risk of the activities they’re already doing).
And I don’t think the extreme sports lovers would be the only group who would take such a deal. Anyone who loves doing anything will take that deal, and (especially in a universe with the resources able to be filled to the brim with people just above the “I’ll kill myself” line) I think most will have such a passion able to be fulfilled (even if it is brute wireheading!).
And then, if we know this will happen ahead of time—that people will risk death to celebrate their passions—why force them into that situation? We could just… not overproduce people. And that would therefore be a better solution than the repugnant one.
And these incentives we’ve set up by implementing the so-called repugnant conclusion, where people are willfully dying for the very chance to do something, are in fact repugnant. And that’s why it’s called repugnant, even if most are unable to express why or what we lose.
A big factor against making 1-in-a-million higher for most people is the whole death aspect, but death itself is a big negative, much worse to die than to never have been born (or so I claim), so the above gives a lower bound on the factor by which the repugnant conclusion is off.
The Parfit quote from the blog post is taken out of context. Here is the relevant section in Parfit’s essay:
(Each box represents a possible population, with the height of a box representing how good overall an individual life is in that population, and the width representing the size of the population. The area of a box is the sum total “goodness”/”welfare”/”utility” (e.g. well-being, satisfied preferences, etc) in that population. The areas increase from A to Z, with Z being truncated here.)
Note that Parfit describes two different ways in which an individual life in Z could be barely worth living (emphasis added):
A life could be like this either because its ecstasies make its agonies seem just worth enduring, or because it is painless but drab.
Then he goes on to describe the second possibility (which is arguably unrealistic and much less likely than the first, and which is the one the blog author quotes). The author of the blog post mistakenly ignores Parfit’s mention of the first possibility. After talking about the second, Parfit returns (indicated by “similarly”) to the first possibility:
Similarly, Z is the outcome in which there would be the greatest quantity of whatever makes life worth living.
The “greatest quantity” here can simply be determined by the weight of all the positive things in an individual life minus the weight of all the negative things. Even if the result is just barely positive for an individual, for a large enough population, the sum welfare of the “barely net positive” individual lives would outweigh the sum for a smaller population with much higher average welfare. Yet intuitively, we should not trade a perfect utopia with relatively small population (A) for a world that is barely worth living for everyone in a huge population (Z).
That’s the problem with total utilitarianism, which simply sums all the “utilities” of the individual lives to measure the overall “utility” of a population. Taking the average instead of the sum avoids the repugnant conclusion, but it leads to other highly counterintuitive conclusions, such as that a population of a million people suffering strongly is less bad than a population of just a single person suffering slightly more strongly, as the latter has a worse average. So arguably both total and average utilitarianism are incorrect, at least without strong modifications.
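To make the arithmetic behind both objections concrete, here is a toy calculation (numbers entirely made up, purely illustrative):

```python
# Each population is (number of lives, average welfare per life).
A = (10_000, 100.0)        # small population of excellent lives
Z = (10_000_000, 0.2)      # huge population of lives barely worth living

def total(pop):
    return pop[0] * pop[1]

print(total(A), total(Z))  # 1,000,000 vs 2,000,000: total utilitarianism prefers Z

# Average utilitarianism avoids that, but then prefers a million people suffering
# strongly over a single person suffering slightly more strongly:
W1 = (1_000_000, -10.0)
W2 = (1, -10.5)
print(W1[1], W2[1])        # -10.0 vs -10.5: the million-sufferer world has the better average
```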
(Personally I think a sufficiently developed version of person-affecting utilitarianism (an alternative to average and total utilitarianism) might well solve all these problems, though the issue is very difficult. See e.g. here.)
What is the current best understanding of why o3 and o4-mini hallucinate more than o1? I just got round to checking out the OpenAI o3 and o4-mini System Card and in section 3.3 (on hallucinations) OA noted that
o3 tends to make more claims overall, leading to more accurate claims as well as more inaccurate/hallucinated claims. While this effect appears minor in the SimpleQA results (0.51 for o3 vs 0.44 for o1), it is more pronounced in the PersonQA evaluation (0.33 vs 0.16). More research is needed to understand the cause of these results.
That was as of publication on April 16, so it’s only been a few weeks, but I’m wondering anyhow if people have figured this out.
o3 has some sort of internal feature like “Goodhart to the objective”/”play in easy mode”.
o3’s RL post-training environments have opportunities for reward hacks.
o3 discovers and exploits those opportunities.
RL rewards it for that, reinforcing the “Goodharting” feature.
This leads to specification-hack-y behavior generalizing out of distribution, to e. g. freeform conversations. It ends up e. g. really wanting to sell its interlocutor on what it’s peddling, so it deliberately[1] confabulates plausible authoritative-sounding claims and justifications for them.
Sounds not implausible, though I’m not wholly convinced.
> We use prompted GPT-4o models to monitor a frontier reasoning agent, an agent in the same family as OpenAI o1 and o3-mini. During training, the agent discovered two hacks affecting nearly all training environments:
The model is in the same family as o1 and o3-mini. Maybe o3, but not confirmed.
Venkatesh Rao’s recent newsletter article Terms of Centaur Service caught my eye for his professed joy of AI-assisted writing, both nonfiction and fiction:
In the last couple of weeks, I’ve gotten into a groove with AI-assisted writing, as you may have noticed, and I am really enjoying it. … The AI element in my writing has gotten serious, and I think is here to stay. …
On the writing side, when I have a productive prompting session, not only does the output feel information dense for the audience, it feels information dense for me.
An example of this kind of essay is one I posted last week, on a memory-access-boundary understanding of what intelligence is. This was an essay I generated that I got value out of reading. And it didn’t feel like a simple case of “thinking through writing.” There’s stuff in here contributed by ChatGPT that I didn’t know or realize even subconsciously, even though I’ve been consulting for 13 years in the semiconductor industry.
Generated text having elements new to even the prompter is a real benefit, especially with fiction. I wrote a bit of fiction last week that will be published in Protocolized tomorrow that was so much fun, I went back and re-read it twice. This is something I never do with my own writing. By the time I ship an unassisted piece of writing, I’m generally sick of it.
AI-assisted writing allows you to have your cake and eat it too. The pleasure of the creative process, and the pleasure of reading. That’s in fact a test of good slop — do you feel like reading it?
I think this made an impression on me because Venkat’s joy contrasts so much to many people’s criticism of Sam Altman’s recent tweet re: their new creative fiction model’s completion to the prompt “Please write a metafictional literary short story about AI and grief”, including folks like Eliezer, who said “To be clear, I would be impressed with a dog that wrote the same story, but only because it was a dog”. I liked the AI’s output quite a lot actually, more than I did Eliezer’s (and I loved HPMOR so I should be selected for Eliezer-fiction-bias), and I found myself agreeing with Roon’s pushback to him.
Although Roshan’s remark that “AI fiction seems to be in the habit of being interesting only to the person who prompted it” does give me pause. While this doesn’t seem to be true in the AI vs Eliezer comparison specifically, I do find plausible a hyperpersonalisation-driven near-future where AI fiction becomes superstimuli-level interesting only to the prompter. But I find the contra scenario plausible too. Not sure where I land here.
From Stephen Wolfram’s reminiscences of Feynman:
Feynman loved doing physics. I think what he loved most was the process of it. Of calculating. Of figuring things out. It didn’t seem to matter to him so much if what came out was big and important. Or esoteric and weird. What mattered to him was the process of finding it. And he was often quite competitive about it.
Some scientists (myself probably included) are driven by the ambition to build grand intellectual edifices. I think Feynman — at least in the years I knew him — was much more driven by the pure pleasure of actually doing the science. He seemed to like best to spend his time figuring things out, and calculating. And he was a great calculator. All around perhaps the best human calculator there’s ever been.
Here’s a page from my files: quintessential Feynman. Calculating a Feynman diagram:
It’s kind of interesting to look at. His style was always very much the same. He always just used regular calculus and things. Essentially nineteenth-century mathematics. He never trusted much else. But wherever one could go with that, Feynman could go. Like no one else.
I always found it incredible. He would start with some problem, and fill up pages with calculations. And at the end of it, he would actually get the right answer! But he usually wasn’t satisfied with that. Once he’d gotten the answer, he’d go back and try to figure out why it was obvious. And often he’d come up with one of those classic Feynman straightforward-sounding explanations. And he’d never tell people about all the calculations behind it. Sometimes it was kind of a game for him: having people be flabbergasted by his seemingly instant physical intuition, not knowing that really it was based on some long, hard calculation he’d done.
Feynman and Wolfram had very different problem-solving styles:
Typically, Feynman would do some calculation. With me continually protesting that we should just go and use a computer. Eventually I’d do that. Then I’d get some results. And he’d get some results. And then we’d have an argument about whose intuition about the results was better.
The way he grappled with Wolfram’s rule 30 exemplified this (I’ve omitted a bunch of pictures, you can check them out in the article):
You know, I remember a time — it must have been the summer of 1985 — when I’d just discovered a thing called rule 30. That’s probably my own all-time favorite scientific discovery. And that’s what launched a lot of the whole new kind of science that I’ve spent 20 years building (and wrote about in my book A New Kind of Science). …
Well, Feynman and I were both visiting Boston, and we’d spent much of an afternoon talking about rule 30. About how it manages to go from that little black square at the top to make all this complicated stuff. And about what that means for physics and so on.
Well, we’d just been crawling around the floor — with help from some other people — trying to use meter rules to measure some feature of a giant printout of it. And Feynman took me aside, rather conspiratorially, and said, “Look, I just want to ask you one thing: how did you know rule 30 would do all this crazy stuff?” “You know me,” I said. “I didn’t. I just had a computer try all the possible rules. And I found it.” “Ah,” he said, “now I feel much better. I was worried you had some way to figure it out.”
Feynman and I talked a bunch more about rule 30. He really wanted to get an intuition for how it worked. He tried bashing it with all his usual tools. Like he tried to work out what the slope of the line between order and chaos is. And he calculated. Using all his usual calculus and so on. He and his son Carl even spent a bunch of time trying to crack rule 30 using a computer.
And one day he calls me and says, “OK, Wolfram, I can’t crack it. I think you’re on to something.” Which was very encouraging.
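Rule 30 is simple enough to play with yourself. Here’s a minimal Python sketch (mine) of the evolution Wolfram describes, starting from a single black cell at the top:

```python
def rule30(width=79, steps=30):
    """Print the evolution of rule 30 from a single black cell (periodic boundary)."""
    cells = [0] * width
    cells[width // 2] = 1
    for _ in range(steps):
        print("".join("#" if c else " " for c in cells))
        # Rule 30: new cell = left XOR (center OR right)
        cells = [cells[(i - 1) % width] ^ (cells[i] | cells[(i + 1) % width])
                 for i in range(width)]

rule30()
```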
Date: Mon, 7 Sep 1998 18:02:39 -0700
From: Hal Finney
Message-Id: <199809080102.SAA02658@hal.sb.rain.org>
To: extropians@extropy.com
Subject: Singularity: Are posthumans understandable?
[This is a repost of an article I sent to the list July 21.]
It’s an attractive analogy that a posthuman will be to a human as a human is to an insect. This suggests that any attempt to analyze or understand the behavior of post-singularity intelligence is as hopeless as it would be for an insect to understand human society. Since insects clearly have essentially no understanding of humans, it would follow by analogy that we can have no understanding of posthumans.
On reflection, though, it seems that it may be an oversimplification to say that insects have no understanding of humans. The issue is complicated by the fact that insects probably have no “understanding” at all, as we use the term. They may not even be conscious, and may be better thought of as nature’s robots, of a similar level of complexity as our own industrial machines. Since insects do not have understanding, the analogy to humans does not work very well. If we want to say that our facility for understanding will not carry over into the posthuman era, we need to be able to say that the insects’ corresponding facility would not work when applied to humans.
What we need to do is to translate the notion of “understanding” into something that insects can do. That makes the analogy more precise and improves the quality of the conclusions it suggests.
It seems to me that while insects do not have “understanding” as we do, they do nevertheless have a relatively detailed model of the world which they interact with. Even if they are robots, programmed by evolution and driven by unthinking instinct, still their programming embodies a model of the world. A butterfly makes its way to flowers, avoids predators, knows when it is hungry or needs to rest. These decisions may be made unconsciously like a robot, but they do represent a true model of itself and of the world.
What we should ask, then, is whether the insects’ model of the world can be successfully used to predict the behavior of humans, in the terms captured by the model itself. Humans are part of the world that insects must deal with. Are they able to successfully model human behavior at the level they are able to model other aspects of the world, so that they can thrive alongside humanity?
Obviously insects do not predict many aspects of human behavior. Still, in terms of the level of detail that they attempt to capture, I’d say they are reasonably effective. Butterflies avoid large animals, including humans. Some percentage of human-butterfly interactions would involve attempts by the humans to capture the butterflies, and so the butterflies’ avoidance instinct represents a success of their model. Similarly for many other insects for whom the extent of their model of humans is as “possible threat, to be avoided”.
Other insects have historically thrived in close association with humans, such as lice, fleas, ants, roaches, etc. Again, without attempting to predict the full richness of human behavior, their models are successful in expressing those aspects which they care about, so that they have been able to survive, often to the detriment of the human race.
If we look at the analogy in this way, it suggests that we may expect to be able to understand some aspects of posthuman behavior, without coming anywhere close to truly understanding and appreciating the full power of their thoughts. Their mental life may be far beyond anything we can imagine, but we could still expect to draw some simple conclusions about how they will behave, things which are at the level which we can understand. Perhaps Robin’s reasoning based on fundamental principles of selection and evolution would fall into this category.
We may be as ants to the post-singularity intelligences, but even so, we may be able to successfully predict some aspects of their behavior, just as ants are able to do with humans.
Linking to a previous comment: 3,000+ words of longform quotes by various folks on the nature of personal identity in a posthuman future, and hiveminds / clans, using Hannu Rajaniemi’s Quantum Thief trilogy as a jumping-off point.
Thurston, in On Proof and Progress in Mathematics, starts from the question of what it is that mathematicians accomplish:
There are many issues buried in this question, which I have tried to phrase in a way that does not presuppose the nature of the answer.
It would not be good to start, for example, with the question
How do mathematicians prove theorems?
This question introduces an interesting topic, but to start with it would be to project two hidden assumptions: (1) that there is uniform, objective and firmly established theory and practice of mathematical proof, and (2) that progress made by mathematicians consists of proving theorems. It is worthwhile to examine these hypotheses, rather than to accept them as obvious and proceed from there.
The question is not even
How do mathematicians make progress in mathematics?
Rather, as a more explicit (and leading) form of the question, I prefer
How do mathematicians advance human understanding of mathematics?
This question brings to the fore something that is fundamental and pervasive: that what we are doing is finding ways for people to understand and think about mathematics.
The rapid advance of computers has helped dramatize this point, because computers and people are very different. For instance, when Appel and Haken completed a proof of the 4-color map theorem using a massive automatic computation, it evoked much controversy. I interpret the controversy as having little to do with doubt people had as to the veracity of the theorem or the correctness of the proof. Rather, it reflected a continuing desire for human understanding of a proof, in addition to knowledge that the theorem is true.
On a more everyday level, it is common for people first starting to grapple with computers to make large-scale computations of things they might have done on a smaller scale by hand. They might print out a table of the first 10,000 primes, only to find that their printout isn’t something they really wanted after all. They discover by this kind of experience that what they really want is usually not some collection of “answers”—what they want is understanding.
Tao’s toots:
In the first millennium CE, mathematicians performed the then-complex calculations needed to compute the date of Easter. Of course, with our modern digital calendars, this task is now performed automatically by computers; and the older calendrical algorithms are now mostly of historical interest only.
In the Age of Sail, mathematicians were tasked to perform the intricate spherical trigonometry calculations needed to create accurate navigational tables. Again, with modern technology such as GPS, such tasks have been fully automated, although spherical trigonometry classes are still offered at naval academies, and ships still carry printed navigational tables in case of emergency instrument failures.
During the Second World War, mathematicians, human computers, and early mechanical computers were enlisted to solve a variety of problems for military applications such as ballistics, cryptanalysis, and operations research. With the advent of scientific computing, the computational aspect of these tasks has been almost completely delegated to modern electronic computers, although human mathematicians and programmers are still required to direct these machines. (1/3)
Today, it is increasingly commonplace for human mathematicians to also outsource symbolic tasks in such fields as linear algebra, differential equations, or group theory to modern computer algebra systems. We still place great emphasis in our math classes on getting students to perform these tasks manually, in order to build a robust mathematical intuition in these areas (and to allow them to still be able to solve problems when such systems are unavailable or unsuitable); but once they have enough expertise, they can profitably take advantage of these sophisticated tools, as they can use that expertise to perform a number of “sanity checks” to inspect and debug the output of such tools.
With the advances in large language models and formal proof assistants, it will soon become possible to also automate other tedious mathematical tasks, such as checking all the cases of a routine but combinatorially complex argument, searching for the best “standard” construction or counterexample for a given inequality, or performing a thorough literature review for a given problem. To be usable in research applications, though, enough formal verification will need to be in place that one does not have to perform extensive proofreading and testing of the automated output. (2/3)
As with previous advances in mathematics automation, students will still need to know how to perform these operations manually, in order to correctly interpret the outputs, to craft well-designed and useful prompts (and follow-up queries), and to be able to function when the tools are not available. This is a non-trivial educational challenge, and will require some thoughtful pedagogical design choices when incorporating these tools into the classroom. But the payoff is significant: given that such tools can free up the significant fraction of the research time of a mathematician that is currently devoted to such routine calculations, a student trained in these tools, once they have matured, could find the process of mathematical research considerably more efficient and pleasant than it currently is today. (3/3)
That said, while I’m not quite as bullish as some folks who think FrontierMath Tier 4 problems may fall in 1-2 years and mathematicians will be rapidly obsoleted thereafter, I also don’t think Tao is quite feeling the AGI here.
If Thurston is right here and mathematicians want to understand why some theorem is true (rather than to just know the truth values of various conjectures), and if we “feel the AGI” … then it seems future “mathematics” will consist in “mathematicians” asking future ChatGPT to explain math to them. Whether something is true, and why. There would be no research anymore.
The interesting question is, I think, whether less-than-fully-general systems, like reasoning LLMs, could outperform humans in mathematical research. Or whether this would require a full AGI that is also smarter than mathematicians. Because if we had the latter, it would likely be an ASI that is better than humans in almost everything, not just mathematics.
The OECD working paper Miracle or Myth? Assessing the macroeconomic productivity gains from Artificial Intelligence, published quite recently (Nov 2024), is strange to skim-read: its authors estimate just 0.24-0.62 percentage points annual aggregate TFP growth (0.36-0.93 pp. for labour productivity) over a 10-year horizon, depending on scenario, using a “novel micro-to-macro framework” that combines “existing estimates of micro-level performance gains with evidence on the exposure of activities to AI and likely future adoption rates, relying on a multi-sector general equilibrium model with input-output linkages to aggregate the effects”.
I checked it out both to get a more gears-y sense of how AI might transform the economy soon and to get an outside-my-bubble, data-grounded sense of what domain experts think, but 0.24-0.62 pp TFP growth and 0.36-0.93 pp labour productivity growth seem so low (relative to, say, L Rudolf L’s history of the future, let alone AI 2027) that I’m tempted to just dismiss them as not really internalising what AGI means. A few things prevent me from dismissing them: it seems epistemically unvirtuous to do so; they do predicate their forecasts on a lot of empirical data; anecdotes like lc’s recent AI progress feeling mostly like bullshit (although my own experience is closer to this); and (boring technical loophole) they may end up being right in the sense that real GDP would still look smooth even after a massive jump in AI, because GDP growth is calculated using post-jump prices, which deflate the impact of the most-revolutionised goods & services.
Why so low? They have 3 main scenarios (low adoption; high adoption and expanded capabilities; and the latter plus adjustment frictions and uneven gains across sectors, which I take to be their best guess), plus 2 additional scenarios with “more extreme assumptions” (large and concentrated gains in the most exposed sectors, which they think are ICT services, finance, professional services, and publishing and media; and AI + robots, which is my own best guess). All scenarios assume just +30% micro-level gains from AI, except the concentrated-gains one, which assumes 100% gains in the 4 most-exposed sectors. From this low starting point they effectively discount further by factors like Acemoglu (2024)’s estimate that 20% of US labor tasks are exposed to AI (ranging from 11% in agriculture to ~50% in IT and finance), exposure to robots (which seems inversely related to AI exposure, e.g. ~85% in agriculture vs < 10% in IT and finance), 23-40% AI adoption rates, restricted factor allocation across sectors, inelastic demand, the Baumol effect kicking in for scenarios with uneven cross-sectoral gains, etc.
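A crude back-of-envelope (mine, not the paper’s; they aggregate through a multi-sector general equilibrium model with input-output linkages rather than a single multiplication) shows why stacking these headline assumptions already lands you in sub-percentage-point-per-year territory:

```python
# Headline assumptions as quoted above; the naive multiplication is my own
# illustrative shortcut, not the paper's actual aggregation.
micro_gain = 0.30              # assumed micro-level performance gain from AI
task_exposure = 0.20           # Acemoglu (2024): ~20% of US labor tasks exposed
adoption_rates = (0.23, 0.40)  # low / high adoption scenarios over the decade
horizon_years = 10

for adoption in adoption_rates:
    total_gain = micro_gain * task_exposure * adoption
    print(f"adoption {adoption:.0%}: ~{total_gain:.1%} total, "
          f"~{100 * total_gain / horizon_years:.2f} pp per year")

# -> ~1.4% and ~2.4% in total, i.e. roughly 0.14-0.24 pp per year; the GE
#    aggregation, sectoral detail, and the robots / expanded-capability
#    scenarios are what move this up towards their 0.24-0.62 pp range.
```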
Why just +30% micro-level gain from AI? They explain in section 2.2.1; to my surprise they’re already being more generous than the authors they quote, but as I’d guessed they just didn’t bother to predict whether micro-level gains would improve over time at all:
Briggs and Kodnani (2023) rely on firm-level studies which estimate an average gain of about 2.6% additional annual growth in workers’ productivity, leading to about a 30% productivity boost over 10 years. Acemoglu (2024) uses a different approach and start from worker-level performance gains in specific tasks, restricted to recent Generative AI applications. Nevertheless, these imply a similar magnitude, roughly 30% increase in performance, which they assume to materialise over the span of 10 years.
However, they interpret these gains as pertaining only to reducing labour costs, hence when computing aggregate productivity gains, they downscale the micro gains by the labour share. In contrast, we take the micro studies as measuring increases in total factor productivity since we interpret their documented time savings to apply to the combined use of labour and capital. For example, we argue that studies showing that coders complete coding tasks faster with the help of AI are more easily interpretable as an increase in the joint productivity of labour and capital (computers, office space, etc.) rather than as cost savings achieved only through the replacement of labour.
To obtain micro-level gains for workers performing specific tasks with the help of AI, this paper relies on the literature review conducted by Filippucci et al. (2024). … The point estimates indicate that the effect of AI tools on worker performance in specific tasks range from 14% (in customer service assistance) to 56% (in coding), estimated with varying degrees of precision (captured by different sizes of confidence intervals). We will assume a baseline effect of 30%, which is around the average level of gains in tasks where estimates have high precision.
Why not at least try to forecast micro-level gains improvement over the next 10 years?
Finally, our strategy aims at studying the possible future impact of current AI capabilities, considering also a few additional capabilities that can be integrated into our framework by relying on existing estimates (AI integration with additional software based on Eloundou et al, 2024; integration with robotics technologies). In addition, it is clearly possible that new types of AI architectures will eliminate some of the current important shortcomings of Generative AI – inaccuracies or invented responses, “hallucinations” – or improve further on the capabilities, perhaps in combination with other existing or emerging technologies, enabling larger gains (or more spread-out gains outside these knowledge intensive services tasks; see next subsection). However, it is still too early to assess whether and to what extent these emerging real world applications can be expected.
Ah, okay then.
What about that 23-40% AI adoption rate forecast over the next 10 years, isn’t that too conservative?
To choose realistic AI adoption rates over our horizon, we consider the speed at which previous major GPTs (electricity, personal computers, internet) were adopted by firms. Based on the historical evidence, we consider two possible adoption rates over the next decade: 23% and 40% (Figure 6). The lower adoption scenario is in line with the adoption path of electricity and with assumptions used in the previous literature about the degree of cost-effective adoption of a specific AI technology – computer vision or image recognition – in 10 years (Svanberg et al., 2024; also adopted by Acemoglu, 2024). The higher adoption scenario is in line with the adoption path of digital technologies in the workplace such as computers and internet. It is also compatible with a more optimistic adoption scenario based on a faster improvement in the cost-effectiveness of computer vision in the paper by Svanberg et al. (2024).
On the one hand, the assumption of a 40% adoption rate in 10 years can still be seen as somewhat conservative, since AI might have a quicker adoption rate than previous digital technologies, due its user-friendly nature. For example, when looking at the speed of another, also relatively user-friendly technology, the internet, its adoption by households after 10 years surpassed 50% (Figure A2 in the Annex). On the other hand, a systemic adoption of AI in the core business functions – instead of using it only in isolated, specific tasks – would still require substantial complementary investments by firms in a range of intangible assets, including data, managerial practices, and organisation (Agrawal, A., J. Gans and A. Goldfarb, 2022). These investments are costly and involve a learning-by-doing, experimental phase, which may slow down or limit adoption. Moreover, while declining production costs were a key driver of rising adoption for past technologies, there are indications that current AI services are already provided at discount prices to capture market shares, which might not be sustainable for long (see Andre et al, 2024). Finally, the pessimistic scenario might also be relevant in the case where limited reliability of AI or lack of social acceptability prevents AI adoption for specific occupations. To reflect this uncertainty, our main scenarios explore the implications of assuming either a relatively low 23% or a higher 40% future adoption rate.
I feel like they’re failing to internalise the lesson from this chart that adoption rates are accelerating over time.
Part of the concept of ‘Disruption’ is that important new technologies tend to be bad at the things that matter to the previous generation of technology, but they do something else important instead. Asking if an LLM can do very specific and precise information retrieval might be like asking if an Apple II can match the uptime of a mainframe, or asking if you can build Photoshop inside Netscape. No, they can’t really do that, but that’s not the point and doesn’t mean they’re useless. They do something else, and that ‘something else’ matters more and pulls in all of the investment, innovation and company creation. Maybe, 20 years later, they can do the old thing too—maybe you can run a bank on PCs and build graphics software in a browser, eventually—but that’s not what matters at the beginning. They unlock something else.
What is that ‘something else’ for generative AI, though? How do you think conceptually about places where that error rate is a feature, not a bug?
Machine learning started working as image recognition, but it was much more than that, and it took a while to work out that the right way to think about it was as pattern recognition. You could philosophise for a long time about the ‘right way’ to think about what PCs, the web or mobile really were. What is that for generative AI? I don’t think anyone has really worked it out yet, but using it as a new set of API calls within traditional patterns of software feels like using the new thing to do the old things.
By analogy:
These kinds of puzzles also remind me of a meeting I had in February 2005, now almost exactly 20 years ago, with a VP from Motorola, at the MWC mobile conference in Cannes. The iPod was the hot product, and all the phone OEMs wanted to match it, but the micro-HDD that Apple was using would break very reliably if you dropped your device. The man from Motorola pointed out that this was partly a problem of expectation and perception: if you dropped your iPod and it broke, you blamed yourself, but if you dropped your phone and it broke, you blamed the phone maker, even though it was using the same hardware.
Six months later Apple switched from HDDs to flash memory with the Nano, and flash doesn’t break if you drop it. But two years later Apple started selling the iPhone, and now your phone does break if you drop it, but you probably blame yourself. Either way, we adopted a device that breaks if you drop it, with a battery that lasts a day instead of a week, in exchange for something new that came with that. We moved our expectations. This problem of expectation and perception seems to apply right now to generative AI.
This seems loosely reminiscent of Benedict Evans’ other essay How to lose a monopoly (emphasis mine):
… what is ‘power’? When we talk about ‘power’ and ‘dominance’ and perhaps ‘monopoly’ in tech, we actually mean two rather different things, and we generally conflate them:
There is having power or dominance or a monopoly around your own product in that product’s own market…
but then there is whether that position also means you control the broader industry.
In the 1970s dominating mainframes meant dominating tech, and in the 1990s dominating PC operating systems (and productivity software) meant dominating tech. Not any more. IBM still dominates mainframes, and Microsoft still dominates PCs, but that isn’t where broader dominance of the tech industry comes from. Once upon a time, IBM, and then Microsoft, could make people do things they didn’t want to do. Not today. Being rich is not the same as being powerful. …
Today, it’s quite common to hear the assertion that our own dominant tech companies—Google, Facebook et al—will easily and naturally transfer their dominance to any new cycle that comes along. This wasn’t true for IBM or Microsoft, the two previous generations of tech dominance, but then there’s another assertion—that this was because of anti-trust intervention, especially for Microsoft. This tends to be said as though it can be taken for granted, but in fact it’s far from clear that this is actually true.
The end of Microsoft’s dominance of tech actually came in two phases. First, as above, it lost the development environment to the web, but it still had the client (the Windows PC) and it then provided lots and lots of clients to access the web and so became a much bigger company. But second, a decade or so later, Apple proposed a better client model with the iPhone, and Google picked that up and made a version for every other manufacturer to use. Microsoft lost dominance of development to the web, and then lost dominance of the client to smartphones.
As we all know, there were major anti-trust cases around what Microsoft tried to do with the web, and specific regulatory interventions, and so you can at least argue for some direct connection to Microsoft’s failure to take the lead online, although this can be disputed. But those cases ended in 2001 and none of them said anything about mobile, and yet Microsoft lost that as well. So what happened?
Here, the argument for anti-trust as the decisive factor generally acknowledges that nothing in the actual judgement or remedies that were imposed had any specific effect on Microsoft’s mobile efforts, but instead says that Microsoft somehow became less good at execution or aggression as a result.
There are two problems with this. The first is that it wasn’t remotely apparent in 2007 that Microsoft wasn’t being aggressive in mobile. After all, Microsoft didn’t ‘miss’ mobile - it had started with the launch of Windows CE in 1996, and accelerated with PocketPC in 2001, and it had a whole bunch of ‘Windows’ smartphones on the market when the iPhone launched.
Rather, the iPhone created such a radical change in every assumption about how you would make a ‘smartphone’ that everyone else had to start again from scratch. It’s important to remember that none of the smartphone companies who’d been building things since the late 1990s—Nokia/Symbian, Palm, RIM and Microsoft—managed the transition. None of the others had anti-trust issues. But, they all had platforms, and just as importantly cultures and assumptions, that were based on the constraints of hardware and networks in 2000, whereas the iPhone was based on what hardware and networks would look like in 2010. The only way to compete was with a totally new platform and totally new assumptions about how it would work, and ‘dump our platform and build an entirely new one’ is always a near-death experience in technology. Failing to make it isn’t about a lack of aggression or execution—it’s that it’s really hard.
Indeed, even knowing quite what to do is hard. For Microsoft, we know now that the answer would have been to create an entirely new operating system, with no cross-compatibility with Windows apps, and make it open source, and give it away for free. Imagine saying that to Bill Gates in 2007 - he’d have looked at you as though you’d grown a third arm.
which segued into a discussion on ‘moats’ (emphasis mine):
The tech industry loves to talk about ‘moats’ around a business—some mechanic of the product or market that forms a fundamental structural barrier to competition, so that just having a better product isn‘t enough to break in. But there are several ways that a moat can stop working. Sometimes the King orders you to fill in the moat and knock down the walls. This is the deus ex machina of state intervention—of anti-trust investigations and trials. But sometimes the river changes course, or the harbour silts up, or someone opens a new pass over the mountains, or the trade routes move, and the castle is still there and still impregnable but slowly stops being important. This is what happened to IBM and Microsoft. The competition isn’t another mainframe company or another PC operating system—it’s something that solves the same underlying user needs in very different ways, or creates new ones that matter more. The web didn’t bridge Microsoft’s moat - it went around, and made it irrelevant. Of course, this isn’t limited to tech—railway and ocean liner companies didn’t make the jump into airlines either. But those companies had a run of a century—IBM and Microsoft each only got 20 years.
(Galison’s article is worth reading in full; it’s wonderful erisology — a synthesis of two models of scientific progress: incremental empiricism (of the logical positivists) and grand paradigm shifts (of Thomas Kuhn and others).)
Experimentalists, theorists and instrument makers are all physicists but since they do different things and have different priorities they tend to develop their own separate vocabularies and value systems. They do have to interact sometimes for physics to progress — experiments must be run, technological systems must be built.
For this they need to establish common ground, a shared understanding of how the things they do together are to be done and what the words they use with each other mean. This, says Galison, is not trivial.
The logical positivists tried to define science as the accumulation of observations towards complete knowledge, and they failed in the end because they couldn’t construct a perfectly objective and unambiguous language in which to encode observations without tarring them with interpretation. That just isn’t how language works. The meaning of words and actions vary by context and is always in flux, so we can’t assume communicating across contexts is straightforward.
Galison calls the borderlands where the various brands of physicist interact “trading zones”. The concept is lifted from anthropology and means a place where cultures come together for the purposes of exchange and new intercultural practices and terms emerge. He says:
I intend the term trading zone to be taken seriously, as a social and intellectual mortar binding together the disunified traditions of experimenting, theorizing, and instrument building. Anthropologists are familiar with different cultures encountering one another through trade, even when the significance of the objects traded — and of the trade itself — may be utterly different for the two sides.
Practices and terms are assigned different meanings by each specialty and their meanings inside the trading zone are simplified local versions.[3] While these “skeleton concepts” are different from the ones used internally by the participating cultures, it all works out as long as everyone understands that the trading zone is a special place with special rules.
I will argue this: science is disunified, and—against our first intuitions—it is precisely the disunification of science that underpins its strength and stability. …
In this chapter, drawing on related work in the history and philosophy of science, I will argue that even specialties within physics cannot be considered as homogeneous communities. Returning to the idea of intuition I have sketched elsewhere, I want to reflect at greater length on a description of physics that would neither be unified nor splintered into isolated fragments. I will call this multicultural history of the development of physics intercalated, because the many traditions coordinate with one another without homogenization. Different finite traditions of theorizing, experimenting, instrument making, and engineering meet, even transform one another, but for all that they do not lose their separate identities and practices. …
The criteria that divided the practitioners of theory, experiment, and instrumentation—different meetings, different preprint exchange, different journals—were the classic sociological dividers Kuhn (and many others since) productively invoked to identify distinct communities. Moreover, the experimenters and theorists often disagreed as to what entities there were, how they were classified, and how one demonstrated their existence—just the criteria Kuhn used to identify incommensurable systems of belief. … But here we can learn from the anthropologists who regularly study unlike cultures that do interact, most notably by trade. Two groups can agree on rules of exchange even if they ascribe utterly different significance to the objects being exchanged; they may even disagree on the meaning of the exchange process itself. Nonetheless, the trading partners can hammer out a local coordination despite vast global differences. … The anthropological picture is relevant here. For in focusing on local coordination, not global meaning, I think one can understand the way engineers, experimenters, and theorists interact.
Okay, but what examples does Galison write about? Here are some:
Experimentalists—and one could make a similar statement about theorists and instrumentalists—do not march in lockstep with theory. For example, the practice of experimental physics in the quantum mechanical revolution of 1926-27 was not violently dislocated despite the startling realignment of theory: spectroscopy continued unabated, as did measurements of specific heat and black-body radiation. And practitioners of these experimental arts continued, undaunted, to conduct a continuing dialogue with theorists across the great theoretical divide. Each subculture has its own rhythms of change, each has its own standards of demonstration, and each is embedded differently in the wider culture of institutions, practices, inventions, and ideas. …
Experimenters come to believe in an effect for various reasons; one is the stability of the phenomenon—you change samples, you shift the temperature, and still the effect remains. Another road to the closure of an experiment involves the increasing directness of our probing of the phenomenon. By increasing the power of a microscope, the energy of a particle beam, the disposition of the apparatus, or the amplification of a signal, one probes further into the causal processes linking phenomena together.
The theorist’s experience is not so different. You try adding a minus sign to a term—but can’t do it because the theory then violates parity; you try adding a term with more particles in it—forbidden because the theory now is nonrenormalizable and so demands an infinite number of parameters; you try leaving a particle out of the theory—now the law has uninterpretable probabilities; you subtract a different term and all your particles vanish into the vacuum; you split a term in two—now charge isn’t conserved; and you still have to satisfy conservation laws of angular momentum, linear momentum, energy, lepton number, baryon number. Such constraints do not all issue axiomatically from a single, governing theory. Rather, they are the sum total of a myriad of interpenetrating commitments of theoretical, instrumental, and experimental practice: some, like the conservation of energy, centuries old. Others, like the demand for naturalness—that all free parameters arise in ratios on the order of unity—have their origin in recent memory. But taken together, the superposition of such constraints make some phenomena virtually impossible to posit, and others (such as the black hole) almost impossible to avoid.
Indeed, the astonishing thing about black holes is that they form (theoretically) in the face of enormous variations in the basic structure of our theory of matter. They don’t depend on the details of this or that theory of the strong, the weak, or the electromagnetic force; and to remain consistent with other observations there is practically nothing one can do with the theory of gravity that would get in the way of the formation of black holes. … This stubbornness against variation is the theoretical analogue of stability, and it is the experience of this stability that eventually brings theorists to accept such objects come what may (almost) from their experimentalist colleagues.
In our case, theorists trade experimental predictions for experimentalists’ results. Two things are noteworthy about the exchange. First, the two subcultures may altogether disagree about the implications of the information exchanged or its epistemic status. For example, as we have seen, theorists may predict the existence of an entity with profound conviction because it is inextricably tied to central tenets of their practice—for example, group symmetry, naturalness, renormalizability, covariance, or unitarity. The experimentalist may receive the prediction as something quite different, perhaps as no more than another curious hypothesis to try out on the next run of the data-analysis program. But despite these sharp differences, it is striking that there is a context within which there is a great deal of consensus. In this trading zone, phenomena are discussed by both sides. It is here that we find the classic encounters of experiment with theory: particle decays, fission, fusion, pulsars, magnetostriction, the creep effect, second sound, lasing, magnetic deflection, and so on. It is the existence of such trading zones, and the highly constrained negotiations that proceed within them, that bind the otherwise disparate subcultures together.
The example of relativistic mass is an appropriate place to start because over the last thirty years it has become the locus classicus for discussions of meaning incommensurability. For Kuhn, the advent of Einsteinian dynamics was a prototype of revolutionary change and, he argued, only at low velocities could the two concepts of mass be measured in the same way. On this view, one would expect there to be no experimental mode of comparison of Einstein’s concept of mass and the concepts of mass his theory displaced—those of H. A. Lorentz, Max Abraham, and Henri Poincare, none of whom shared Einstein’s view of an operationally-defined space and time. … Kuhn’s claim is that prerelativistic and relativistic uses of the term mass make comparison impossible: “Only at low relative velocities may the [Newtonian and Einsteinian masses] be measured in the same way and even then they must not be conceived to be the same.”
In fact, there was a rich experimental subculture preoccupied precisely with comparing these different theories—and not at low velocities. With Max Kaufmann and Alfred Bucherer leading the way, these experimenters produced experiment after experiment using magnetic and electric fields to measure the mass of the high-velocity electron perpendicularly to its velocity. Moreover, their efforts were clearly understood by all four of the relevant theorists (Poincare, Lorentz, Abraham, and Einstein) to arbitrate among theories. Lorentz recognized the relevance of one such set to his work and immediately conceded defeat: “Unfortunately my hypothesis [explaining mass by] the flattening of electrons is in contradiction with Kaufmann’s results, and I must abandon it. I am, therefore, at the end of my Latin.” These are not the words of someone for whom the experiment was irrelevant or incomprehensible. Only slightly less despairingly, Poincare conceded that at “this moment the entire theory may well be threatened” by Kaufmann’s data. Einstein himself was more confident of his theory, and doubted the execution of Kaufmann’s work; he did not challenge the relevance in principle of the results. Quite the contrary: Einstein went to considerable pains to produce predictions for the transverse mass of the electron so that Kaufmann and Bucherer could use their experimental methods to study the theory; he constructed a detailed analysis of Kaufmann’s data; and he even designed his own modification of the electron-deflection experiments which he hoped someone would execute. …
The lesson I want to draw from this example is this: despite the “global” differences in the way “mass” classifies phenomena in the Lorentzian, Abrahamian, and Einsteinian theories, there remains a localized zone of activity in which a restricted set of actions and beliefs are deployed. In Kaufmann’s and Bucherer’s laboratories, in the arena of photographic plates, copper tubes, electric fields, and in the capacity of hot wires to emit electrons, experimentalists and theorists worked out an effective but limited coordination between beliefs and actions. What they worked out is, emphatically, not a protocol language—there is far too much theory woven into the joint experimental/theoretical action for that. Second, there is nothing universal in the establishment of jointly accepted procedures and arguments. And third, the laboratory coordination does not fully define the term mass, since beyond this localized context the theories diverge in a myriad of ways. Theorists and experimentalists are not miraculous instantaneous translators and they are not “mere” instrumentalists uninterested in interpretation. They are traders, coordinating parts of interpreted systems against parts of others.
One more example:
At first glance, the war would seem to have made no contribution whatsoever to such an abstruse and abstract subject as quantum electrodynamics. The usual story about QED runs roughly as follows: during the 1920s and 1930s physicists interested in the subject, including Victor Weisskopf, H. A. Kramers, J. Robert Oppenheimer, Niels Bohr, Julian Schwinger, and others made halting progress in understanding how the quantum theory of the electron could be combined with special relativity. They made only intermittent progress, limited essentially to first-order calculations. For reasons of war work, all those living in the United States supposedly broke off their efforts during World War II to do their required (but “irrelevant” to pure physics) work on engineering, and then returned, triumphantly, to QED in the second half of the 1940s.
The story is false on at least two levels. First, as Silvan Schweber has pointed out, the developments in QED were catalyzed in part by the results of wartime microwave technology that made possible the precision measurements of Willis Lamb, R. C. Retherford, Henry Foley, J. M. B. Kellogg, P. Kusch et al. in Rabi’s laboratory and the work of Dicke at Princeton. These were extraordinary experiments, but the impact of the war went even deeper. Radar work reconfigured the strategy by which Schwinger approached physical problems. Schwinger himself has alluded briefly to his judgment that his radar work had a strong impact on his postwar thinking; in what follows I will expand on his later remarks, making use of his actual work in radar to complete the picture.
Let us attend to practice—not results. During the war, Schwinger worked in the theoretical section of the MIT Rad Lab; his group had the task of developing a usable, general account of microwave networks. Ordinary network theory—the theory of radio waves in resistors and capacitors—utterly failed because microwaves have a wavelength of the same size as ordinary electrical components. In ordinary components such as resistors, copper wires, or cylindrical capacitors, the microwave energy would radiate away. This meant that the full set of calculational tools available for electronic circuits became useless. With the help of his coworkers, Schwinger began with Maxwell’s equations and derived a set of rules by which engineers and physicists could make practical network calculations.
As the war progressed and Schwinger assimilated more of the engineering culture of the Rad Lab, he began to abandon the physicists’ abstract scattering theory of electromagnetism, and to search for the microwave analogue of the electrical engineers’ more practical representations: simple “equivalent circuits” that imitated just the relevant aspects of the components. It was an old technique among electrical engineers, who were used to treating certain systems, such as loudspeakers, not by their real electrical, mechanical, or electromechanical properties, but as if the loudspeaker were a circuit of purely electrical components. In other words they (symbolically) put the complicated physics of the loudspeaker’s electromechanically generated noise into a “black box,” and replaced it in their calculations with “equivalent” electrical components. Similarly the conducting hollow pipes and cavities of microwave circuits could be replaced (symbolically) by ordinary electrical components, and so make the cavities amenable to algebraic manipulation—without entering each time into the details of complex boundary-value problems for Maxwell’s equations. As the postwar Rad Lab “Waveguide Handbook” put it, the adoption of equivalent circuits “serves the purpose of casting the results of field calculations in a conventional engineering mold from which information can be derived [sic] by standard engineering calculations.” It is just this process of appropriation—this “casting” into an “engineering mold” that intrigues me. In this detachment of field calculations from their original context, the full meaning of the terms is cut short. Nor is the meaning suddenly and of a piece brought into engineering lore: microwave frequencies did not allow any simpleminded identification of electrical properties with the well-known categories of voltages, currents, and resistances. The product of this labor was a kind of simplified jargon binding elements of field theory talk with elements of engineering equivalent-circuit talk.
In short, the war forced theoretical physicists—such as Schwinger—to spend day after day calculating things about devices and, through these material objects, linking their own prior language of field theory to the language and algebra of electrical engineering. Modifying the theory, creating equivalent circuits for microwave radiation, solving new kinds of problems was not—and this is the crucial point—a form of translation. Even Schwinger’s “glossary” identified newly calculated theoretical elements with recently fabricated fragments of microwave circuitry; neither was part of the prior practice of either the theorists or the radio engineers. Boundaries are substantial, translation is absent, and Gestalt shifts are nowhere in sight.
Schwinger himself has alluded to the link between the two seemingly unrelated domains of waveguides and renormalization. “[T]hose years of distraction” during the war were more than that: “[t]he waveguide investigations showed the utility of organizing a theory to isolate those inner structural aspects that are not probed under the given experimental circumstances …. And it is this viewpoint that [led me] to the quantum electrodynamics concept of self-consistent subtraction or renormalization.” With an understanding of Schwinger’s work in waveguide physics, we are now in a position to unpack this connection between the calculations of radar and renormalization.
In the microwave case, it was impossible to calculate fully the field and currents in the region of the discontinuity; in the quantum electrodynamics case, it was hopeless to try to pursue the details of arbitrarily high-energy processes. To attack the microwave problem, Schwinger (wearing his engineering hat) isolated those features of the discontinuity region’s physics that were important for “the given experimental circumstances”—for example, the voltages and currents emerging far from the discontinuity. In order to isolate the interesting features, he dumped the unneeded details of the electrodynamics of the discontinuity region into the parameters of an equivalent circuit. Faced with the fundamental problem of quantum electrodynamics, Schwinger concluded in 1947 that he should proceed by analogy: one had to isolate those features of the physics of quantum electrodynamics that were important for the given experimental circumstances—for example, magnetic moments or scattering amplitudes. To separate these quantities from the dross, he dumped the unneeded details of high-energy interactions into the renormalization parameters.
One lesson that theoretical physicists learned from their engineer colleagues during the war was, therefore, simple yet deep: concentrate on what you actually measure, and design your theory so that it does not say more than you must to account for these observable quantities. The adoption of this positivist attitude toward theorizing was a sufficiently sharp break with earlier traditions of theory that some of Schwinger’s contemporaries never accepted it. Even Dirac, one of the greatest of twentieth-century theorists, resisted the idea of renormalization until his death in the 1980s. But the idea rapidly took hold, altering for at least several decades the theorists’ attitude toward the limits of their description of nature.
Last quote (this “shortform” is clearly no longer short):
Despite this radical difference in the ontology—the set of what there is—a meeting ground exists around the description of the phenomenology of particle physics: How do photons recoil from electrons? How do electrons scatter from positrons? How do photons create pairs of electrons and positrons in the near presence of a proton? For these and similar questions, the experimentalists and theorists come to agreement about rules of representation, calculation, and local interpretation. In a strong sense, Bjorken and Drell Volume I is an example of an attempt to create a stable pidgin language, designed to mediate between experimentalist and theorist. Reduction of mathematical structure, suppression of exceptional cases, minimization of internal linkages between theoretical structures, removal from a more elaborate explanatory structure—these are all ways that the theorists prepare their subject for the exchange with their experimental colleagues. I take these moves toward regularization to be the formal-language analogues of phonetic, morphological, syntactical, and lexical reduction of natural languages. …
And indeed there is, not surprisingly, a corresponding “foreigner talk” that experimentalists develop on their side. Just as theorists reduce the complexity by suppressing the “endogenous” structure linking theory to theory, so experimentalists, when addressing theorists, skip the connecting details by which experimental procedures bind to one another. These “separable” bits of procedure can come as isolable fragments of craft or engineering knowledge, as when the Alvarez group introduced indium as the binding material by which to bind bubble chamber glass to the steel chassis. Between such localized wisdom and material lay computer programs such as the PANG or KICK. Their exchange not only regularized practices in the “image” tradition; the track analysis programs carried over as well into the “logic” tradition, serving in the long run to facilitate the coalescence of the two previously competing cultures.
Scott’s The Colors Of Her Coat is the best writing I’ve read by him in a long while. Quoting this part in particular as a self-reminder and bulwark against the faux-sophisticated world-weariness I sometimes slip into:
Chesterton’s answer to the semantic apocalypse is to will yourself out of it. If you can’t enjoy My Neighbor Totoro after seeing too many Ghiblified photos, that’s a skill issue. Keep watching sunsets until each one becomes as beautiful as the first…
If you insist that anything too common, anything come by too cheaply, must be boring, then all the wonders of the Singularity cannot save you. You will grow weary of green wine and sick of crimson seas. But if you can bring yourself to really pay attention, to see old things for the first time, then you can combine the limitless variety of modernity with the awe of a peasant seeing an ultramarine mural—or the delight of a 2025er Ghiblifying photos for the first time.
How to see old things for the first time? I thought of the following passage by LoganStrohl describing a SIM card ejection tool:
I started studying “original seeing”, on purpose and by that name, in 2018. What stood out to me about my earliest exploratory experiments in original seeing is how alien the world is. …
I started my earliest experimentation with some brute-force phenomenology. I picked up an object, set it on the table in front of me, and progressively stripped away layers of perception as I observed it. It was one of these things:
I wrote, “It’s a SIM card ejection tool.”
I wrote some things about its shape and color and so forth (it was round and metal, with a pointy bit on one end); and while I noted those perceptions, I tried to name some of the interpretations my mind seemed to be engaging in as I went.
As I identified the interpretations, I deliberately loosened my grip on them: “I notice that what I perceive as ‘shadows’ needn’t be places where the object blocks rays of light; the ‘object’ could be two-dimensional, drawn on a surface with the appropriate areas shaded around it.”
I noticed that I kept thinking in terms of what the object is for, so I loosened my grip on the utility of the object, mainly by naming many other possible uses. I imagined inserting the pointy part into soil to sow tiny snapdragon seeds, etching my name on a rock, and poking an air hole in the top of a plastic container so the liquid contents will pour out more smoothly. I’ve actually ended up keeping this SIM card tool on a keychain, not so I can eject SIM trays from phones, but because it’s a great stim; I can tap it like the tip of a pencil, but without leaving dots of graphite on my finger.
I loosened my grip on several preconceptions about how the object behaves, mainly by making and testing concrete predictions, some of which turned out to be wrong. For example, I expected it to taste sharp and “metallic”, but in fact I described the flavor of the surface as “calm, cool, perhaps lightly florid”.
By the time I’d had my fill of this proto-exercise, my relationship to the object had changed substantially. I wrote:
My perceptions that seem related to the object feel very distinct from whatever is out there impinging on my senses. … I was going to simply look at a SIM card tool, and now I want to wrap my soul around this little region of reality, a region that it feels disrespectful to call a ‘SIM card tool’. Why does it feel disrespectful? Because ‘SIM card tool’ is how I use it, and my mind is trained on the distance between how I relate to my perceptions of it, and what it is.
That last paragraph, and especially the use of ‘disrespectful’, strikes me as a bit like the rationalist version of what Chesterton was talking about in Scott’s post.
I find both the views below compellingly argued in the abstract, despite being diametrically opposed, and I wonder which one will turn out to be the case and how I could tell, or alternatively if I were betting on one view over another, how should I crystallise the bet(s).
One is exemplified by what Jason Crawford wrote here:
The acceleration of material progress has always concerned critics who fear that we will fail to keep up with the pace of change. Alvin Toffler, in a 1965 essay that coined the term “future shock,” wrote:
I believe that most human beings alive today will find themselves increasingly disoriented and, therefore, progressively incompetent to deal rationally with their environment. … Change is avalanching down upon our heads and most people are utterly unprepared to cope with it … Such massive changes, coming with increasing velocity, will disorient, bewilder, and crush many people.
Toffler and others worried that as progress moves ever faster, the world will slip out of our grasp. But as we have just seen, the historical trend is the opposite: the world does change ever faster, but we get better at dealing with change. We can better comprehend change, thanks to scientific theories, instruments of measurement, monitoring systems, and global communications. We can better respond to it, thanks to technology, wealth, and infrastructure, especially our manufacturing and transportation infrastructure. And we can better coordinate that response, via corporations, markets, governments, and norms of international cooperation. Change has been accelerating ever since the Stone Age, but we can far better handle the changes in our fast-paced world than tribal hunter-gatherers, Bronze Age emperors, or medieval kings could handle the changes even in their relatively slow-moving ones. All of those societies faced existential risk from factors as simple as a shift in climate or a new pathogen: famine, plague, or war could and did cause civilizational collapse.
As an example of how tech progress can give us tools to help keep up with tech-accelerated change, Jason quotes Jungwon’s tweet on how Elicit Systematic Review aims to shorten systematic reviews from “3-12 months” to “minutes”.
The other is exemplified by this sort of scenario:
By 2036, functionally everyone within the EU has some kind of regular state payout they live on, not through a single system but through an extremely complicated patronage network (that non-AI-aided humans literally could not understand) where the average person is eking out a living in exchange for taking part in complicated cultural rites and bureaucracies. …
By 2040, the value of the world’s manufacturing output is over a thousand times what it was in 2025. Most of this is spent on geopolitical competition, inter-elite status rivalries, and an increasing fraction on AI machinations with only the most tenuous link to any human activity, but which the humans who on-paper own all of this barely notice as it gets lost in the maelstrom of everything else. Even the most entrenched, long-term-oriented, and value-laden executive jobs are (whether de facto or de jure) entirely done by AIs, with very little human understanding of what is concretely happening on the ground. …
The second, more fundamental point, is that the economy has an inertia of its own. Humans make almost no meaningful decisions about the trajectory of the world, having handed the reins to AIs that make effectively all decisions, even if some of the AIs are technically only “advisors”. … Somewhere on the top of the stack there are still humans who on-paper own or control the assets and can make decisions (whether as a private actor or as a government overseeing autonomous AI companies operating in its territory), but they see numbers that track their wealth and power ticking up, so they have no reason to call a stop to it, and don’t understand it anymore.
(Seems like a production web-flavored story, except gone mostly right because Anthropic solved intent alignment)
The following aren’t really examples but feel adjacent, or maybe I’m being too loose with my pattern-matching: Michael Nielsen’s observation in 2008 that science had for some time progressed beyond individual understanding (e.g. LHC, classification of finite simple groups); Dan Luu’s observation that even explaining something as ostensibly familiar as “what happens when you load a URL?” seems to be beyond any individual person’s understanding today; etc.
I don’t know how I’d crystallise the bet(s). Aesthetically I greatly prefer Jason’s perspective of course…
Some ongoing efforts to mechanize mathematical taste, described by Adam Marblestone in Automating Math:
Yoshua Bengio, one of the “fathers” of deep learning, thinks we might be able to use information theory to capture something about what makes a mathematical conjecture “interesting.” Part of the idea is that such conjectures compress large amounts of information about the body of mathematical knowledge into a small number of short, compact statements. If AI could optimize for some notion of “explanatory power” (roughly, how vast a range of disparate knowledge can be compressed into a short and simple set of axioms), this could extend the possibilities of AI for creating truly new math and would probably have wide implications beyond that of thinking about human reasoning and what creativity really is.
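For something concrete to poke at, here is a very crude minimum-description-length-style proxy for “explanatory power” (my own toy, with zlib standing in for a real compressor; not a description of Bengio’s actual proposal): score a statement by how many bits it saves when compressed jointly with a corpus of existing mathematical knowledge.

```python
import zlib

def bits(text: str) -> int:
    # Crude stand-in for description length: compressed size in bits.
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))

def explanatory_power(statement: str, corpus: str) -> int:
    # Bits saved by compressing the statement together with the corpus,
    # relative to compressing them separately: a rough proxy for how much
    # the statement "compresses" the existing body of knowledge.
    return bits(statement) + bits(corpus) - bits(statement + "\n" + corpus)
```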
Others, like Gabriel Poesia at Stanford, are working to create a theorem proving system that doesn’t need to rely on bootstrapping by imitating human proofs. Instead, Poesia’s system, called Peano, has a finite set of possible actions it can take. Peano can recombine these limited available actions to generate and test a variety of theorem proving algorithms and, it is hoped, self-discover math from scratch by learning to identify patterns in its successful solutions. Finally, it can leverage its previous work by turning solutions into reusable higher-level actions called “tactics.” In Poesia’s initial paper, he shows that Peano can learn abstract rules for algebra without being explicitly taught. But there is a trade-off: Because the model does not rely on human proofs, it has to invent more from scratch and may get stuck along the way. While Poesia’s approach might lead to faster learning compared with systems like AlphaProof, it may be handicapped by starting from a more limited baseline. But the verdict is still out as to what is the best balance of these factors.
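A minimal sketch of the “recombine a finite action set, then abstract successful solutions into reusable tactics” loop (an illustrative toy on a trivial string-rewriting domain; the names and the domain are mine, not Poesia’s actual Peano system):

```python
from itertools import product

# Primitive "actions": hand-written rewrite rules standing in for proof steps.
PRIMITIVES = {
    "add_zero": lambda s: s.replace(" + 0", ""),
    "mul_one":  lambda s: s.replace(" * 1", ""),
}

def search(start, goal, actions, max_depth=4):
    """Brute-force over action sequences; return the first one reaching the goal."""
    for depth in range(1, max_depth + 1):
        for seq in product(actions, repeat=depth):
            state = start
            for name in seq:
                state = actions[name](state)
            if state == goal:
                return list(seq)
    return None

def as_tactic(seq, actions):
    """Package a successful action sequence as one reusable higher-level action."""
    def tactic(state):
        for name in seq:
            state = actions[name](state)
        return state
    return tactic

solution = search("x * 1 + 0", "x", PRIMITIVES)  # a solution found from primitives
simplify = as_tactic(solution, PRIMITIVES)       # ...abstracted into a "tactic"
print(simplify("y * 1 + 0"))                     # the tactic transfers: prints "y"
```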
Meanwhile, the Fields Medalist Timothy Gowers is trying to develop AIs that more closely mimic the ways that human mathematicians go about proving theorems. He’s arguably in a much better position to do that than the average AI researcher given his first-hand familiarity with the process. In other words, Gowers is betting against the current paradigm of throwing huge amounts of compute at a deep learning approach and is instead aiming to use his (and his students’) ability to introspect to hard code certain algorithms into an automatic theorem proving system. In this way, it’s more similar to the previous paradigm of AI development that sought to explicitly mimic human reasoning. Here again success is far from certain, but it is another shot at the goal.
I wondered whether Gowers was simply unaware of Sutton’s bitter lesson that
… general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore’s law, or rather its generalization of continued exponentially falling cost per unit of computation. … And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation.
which seemed unlikely given how polymathic Gowers is — and of course he’s aware:
I have written a 54-page document that explains in considerable detail what the aims and approach of the project will be. … In brief, the approach taken will be what is often referred to as a GOFAI approach… As the phrase “old-fashioned” suggests, GOFAI has fallen out of favour in recent years, and some of the reasons for that are good ones. One reason is that after initial optimism, progress with that approach stalled in many domains of AI. Another is that with the rise of machine learning it has become clear that for many tasks, especially pattern-recognition tasks, it is possible to program a computer to do them very well without having a good understanding of how humans do them. …
However, while machine learning has made huge strides in many domains, it still has several areas of weakness that are very important when one is doing mathematics. Here are a few of them.
In general, tasks that involve reasoning in an essential way.
Learning to do one task and then using that ability to do another.
Learning based on just a small number of examples.
Common sense reasoning.
Anything that involves genuine understanding (even if it may be hard to give a precise definition of what understanding is) as opposed to sophisticated mimicry.
Obviously, researchers in machine learning are working in all these areas, and there may well be progress over the next few years [in fact, there has been progress on some of these difficulties already of which I was unaware — see some of the comments below], but for the time being there are still significant limitations to what machine learning can do. (Two people who have written very interestingly on these limitations are Melanie Mitchell and François Chollet.)
That post was from April 2022, an eternity ago in AI land, and I haven’t seen any updates by him since.
How to quantify how much impact being smarter makes? This is too big a question, and there are many more interesting ways to answer it than the following, but computer chess is interesting in this context because it lets you quantify compute vs win probability, which seems like one way to narrowly proxy the original question. Laskos ran an interesting test in 2013 with Houdini 3, playing a large number of games at 2x nodes per move vs 1x nodes per move and computing p(win | “100% smarter”). The win probability gain above chance (i.e. above 50%) drops from +35.1% in the 4k vs 2k node case to +11.1% in the 4M vs 2M case:
Match (2x vs 1x nodes per move): +wins −losses =draws, Elo gain
1) 4k nodes vs 2k nodes +3862 −352 =786 +303
2) 8k nodes vs 4k nodes +3713 −374 =913 +280
3) 16k nodes vs 8k nodes +3399 −436 =1165 +237
4) 32k nodes vs 16k nodes +3151 −474 =1374 +208
5) 64k nodes vs 32k nodes +2862 −494 =1641 +179
6) 128k nodes vs 64k nodes +2613 −501 =1881 +156
7) 256k nodes vs 128k nodes +942 −201 =855 +136
8) 512k nodes vs 256k nodes +900 −166 =930 +134
9) 1024k nodes vs 512k nodes +806 −167 =1026 +115
10) 2048k nodes vs 1024k nodes +344 −83 =572 +93
11) 4096k nodes vs 2048k nodes +307 −85 =607 +79
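For reference, the “+35.1%” and “+11.1%” figures quoted above are just the standard Elo expected-score formula applied to the Elo gains in the table (expected score counts a draw as half a win, so strictly this is a score expectation rather than a pure win probability):

```python
def expected_score(elo_diff: float) -> float:
    # Standard Elo logistic: expected score of the side with the Elo advantage.
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400))

for label, elo_gain in [("4k vs 2k nodes", 303), ("4096k vs 2048k nodes", 79)]:
    print(f"{label}: +{100 * (expected_score(elo_gain) - 0.5):.1f}% above 50%")

# -> +35.1% and +11.2%, matching the figures quoted above up to rounding.
```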
As an aside, the diminishing returns surprised me: I was expecting p(win | “X% smarter”) to be independent of the 1x node’s compute. My guess is this is because Houdini 3 is close enough to chess’ skill ceiling (4877 Elo on CCRL for the perfect engine according to Laskos, extrapolating from his data above, or 1707 points above Houdini 3’s 40/40 CCRL level) that p(win) starts diminishing very early, and that you won’t see this in “IRL games” unless the 1x player manages to steer the future into a lower-skill-ceiling domain. Another aside is that this diminishing returns pattern seems reminiscent of the “scaling wall” talk which predicts that walls are an artifact of low skill ceilings and that the highest scaling gains will come from ~limitless skill ceiling domains (automated theorem proving?), but I don’t expect this observation to mean much either, mostly because I don’t know what I’m talking about at this point.
The diminishing returns isn’t too surprising, because you are holding the model size fixed (whatever that is for Houdini 3), and the search sigmoids hard. Hence, diminishing returns as you jump well past the initial few searches with the largest gains, to large search budgets like 2k vs 4k (and higher).
This is not necessarily related to ‘approaching perfection’, because you can see the sigmoid of the search budget even with weak models very far from the known oracle performance (as well as stronger models); for example, NNs playing Hex: https://arxiv.org/pdf/2104.03113#page=5 Since it’s a sigmoid, at a certain point, your returns will steeply diminish and indeed start to look like a flat line and a mere 2x increase in search budget does little. This is why you cannot simply replace larger models with small models that you search the hell out of: because you hit that sigmoid where improvement basically stops happening.
At that point, you need a smarter model, which can make intrinsically better choices about where to explore, and isn’t trapped dumping endless searches into its own blind spots & errors. (At least, that’s how I think of it qualitatively: the sigmoiding happens because of ‘unknown unknowns’, where the model can’t see a key error it made somewhere along the way, and so almost all searches increasingly explore dead branches that a better model would’ve discarded immediately in favor of the true branch. Maybe you can think of very large search budgets applied to a weak model as the weak model ‘approaching perfection… of its errors’? In the spirit of the old Dijkstra quip, ‘a mistake carried through to perfection’. Remember, no matter how deeply you search, your opponent still gets to choose his move, and you don’t; and what you predict may not be what he will select.)
Fortunately, ‘when making an axe handle with an axe, the model is indeed near at hand’, and a weak model which has been ‘policy-improved’ by search is, for that one datapoint, equivalent to a somewhat larger better model—if only you can figure out how to keep that improvement around...
Thanks, I especially appreciate that NNs playing Hex paper; Figure 8 in particular amazes me by illustrating how much more quickly performance sigmoids with test-time compute than I anticipated, even after reading your comment. I’m guessing https://www.gwern.net/ has papers with the analogue of Fig 8 for smarter models, in which case it’s time to go rummaging around…
Lee Billings’ book Five Billion Years of Solitude has the following poetic passage on deep time that’s stuck with me ever since I read it in Paul Gilster’s post:
Deep time is something that even geologists and their generalist peers, the earth and planetary scientists, can never fully grow accustomed to.
The sight of a fossilized form, perhaps the outline of a trilobite, a leaf, or a saurian footfall can still send a shiver through their bones, or excavate a trembling hollow in the chest that breath cannot fill. They can measure celestial motions and list Earth’s lithic annals, and they can map that arcane knowledge onto familiar scales, but the humblest do not pretend that minds summoned from and returned to dust in a century’s span can truly comprehend the solemn eons in their passage.
Instead, they must in a way learn to stand outside of time, to become momentarily eternal. Their world acquires dual, overlapping dimensions— one ephemeral and obvious, the other enduring and hidden in plain view. A planet becomes a vast machine, or an organism, pursuing some impenetrable purpose through its continental collisions and volcanic outpourings. A man becomes a protein-sheathed splash of ocean raised from rock to breathe the sky, an eater of sun whose atoms were forged on an anvil of stars.
Beholding the long evolutionary succession of Earthly empires that have come and gone, capped by a sliver of human existence that seems so easily shaved away, they perceive the breathtaking speed with which our species has stormed the world. Humanity’s ascent is a sudden explosion, kindled in some sapient spark of self-reflection, bursting forth from savannah and cave to blaze through the biosphere and scatter technological shrapnel across the planet, then the solar system, bound for parts unknown. From the giant leap of consciousness alongside some melting glacier, it proved only a small step to human footprints on the Moon.
The modern era, luminous and fleeting, flashes like lightning above the dark, abyssal eons of the abiding Earth. Immersed in a culture unaware of its own transience, students of geologic time see all this and wonder whether the human race will somehow abide, too.
These quotes from When ChatGPT Broke an Entire Field: An Oral History stood out to me:
Wow. I knew academics were behind / out of the loop / etc. but this surprised me. I imagine these researchers had at least heard about GPT2 and GPT3 and the scaling laws papers; I wonder what they thought of them at the time. I wonder what they think now about what they thought at the time.
The full article sort of explains the bizarre kafkaesque academic dance that went on from 2020-2022, and how the field talked about these changes.
for anyone not wanting to go in and see the Kafka, I copied some useful examples:
ANNA ROGERS: I was considering making yet another benchmark, but I stopped seeing the point of it. Let’s say GPT-3 either can or cannot continue [generating] these streams of characters. This tells me something about GPT-3, but that’s not actually even a machine learning research question. It’s product testing for free.
JULIAN MICHAEL: There was this term, “API science,’’ that people would use to be like: “We’re doing science on a product? This isn’t science, it’s not reproducible.” And other people were like: “Look, we need to be on the frontier. This is what’s there.”
TAL LINZEN (associate professor of linguistics and data science, New York University; research scientist, Google): For a while people in academia weren’t really sure what to do.
R. THOMAS MCCOY: Are you pro- or anti-LLM? That was in the water very, very much at this time.
JULIE KALLINI (second-year computer science Ph.D. student, Stanford University): As a young researcher, I definitely sensed that there were sides. At the time, I was an undergraduate at Princeton University. I remember distinctly that different people I looked up to — my Princeton research adviser [Christiane Fellbaum] versus professors at other universities — were on different sides. I didn’t know what side to be on.
LIAM DUGAN: You got to see the breakdown of the whole field — the sides coalescing. The linguistic side was not very trusting of raw LLM technology. There’s a side that’s sort of in the middle. And then there’s a completely crazy side that really believed that scaling was going to get us to general intelligence. At the time, I just brushed them off. And then ChatGPT comes out.
+1. GPT-3.5 had been publicly available since January, and GPT-3 was big news two years before that and publicly available back then. I’m really surprised that people didn’t understand that these models were a big deal AND changed their minds when ChatGPT came out. Maybe it’s just a weird preference cascade, where this was enough to break a common false belief?
Something like
GPT-3.5/ChatGPT was qualitatively different.
I remember seeing the ChatGPT announcement and not being particularly impressed or excited, like “okay, it’s a refined version of InstructGPT from almost a year ago. It’s cool that there’s a web UI now, maybe I’ll try it out soon.” November 2022 was a technological advancement but not a huge shift compared to January 2022 IMO
Fair enough. My mental image of the GPT models was stuck on that infernal “talking unicorns” prompt, which I think did make them seem reasonably characterized as mere “stochastic parrots” and “glorified autocompletes,” and the obvious bullshit about the “safety and security concerns” around releasing GPT-2 also led me to conclude the tech was unlikely to amount to much more. InstructGPT wasn’t good enough to get me to update it; that took the much-hyped ChatGPT release.
Was there a particular moment that impressed you, or did you just see the Transformers paper, project that correctly into the future, and the releases that followed since then have just been following that trend you extrapolated and so been unremarkable?
I remember being very impressed by GPT-2. I think I was also quite impressed by GPT-3 even though it was basically just “GPT-2 but better.” To be fair, at the moment that I was feeling unimpressed by ChatGPT, I don’t think I had actually used it yet. It did turn out to be much more useful to me than the GPT-3 API, which I tried out but didn’t find that many uses for.
It’s hard to remember exactly how impressed I was with ChatGPT after using it for a while. I think I hadn’t fully realized how great it could be when the friction of using the API was removed, even if I didn’t update that much on the technical advancement.
The full article discusses the transformer paper (which didn’t have a large influence, as the implications weren’t clear), BERT (which did have a large influence) and GPT-3 (which also had a large influence). I assume the release of ChatGPT was the point where even the last NLP researchers couldn’t ignore LLMs anymore.
ChatGPT was “so good they can’t ignore you”; the Hugging Face anecdote is particularly telling. At some point, everyone else gets tired of waiting for your cargo to land, and will fire you if you don’t get with the program. “You say semantics can never be learned from syntax and you’ve proven that ChatGPT can never be useful? It seems plenty useful to me and everyone else. Figure it out or we’ll find someone who can.”
I think the most interesting part of the Quanta piece is the discussion of the octopus paper, which states that pure language models can’t actually understand text (as they only learn from form/syntax), and the bitter disputes that followed in the NLP community.
From the abstract:
Emily M. Bender, the first author, was also first author of the subsequent “stochastic parrot” paper: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜[1]
(As a side note, Yudkowsky’s broadly verificationist theory of content seems to agree with her distinction: if “understanding” of a statement is knowing what experience would confirm it, or what experience it would predict, then understanding cannot come from syntactic form alone. The association of words and sensory data would be necessary. Did Yudkowsky ever comment on the apparent incompatibility between evident LLM understanding and his anticipated experience theory?)
Of course I assume that now it can hardly be denied that LLMs really do somehow understand text, even if they are merely trained on form. So the octopus paper argument must be wrong somewhere. Though at least in the Quanta piece, Bender doesn’t acknowledge any update of that sort. In fact, in the last quote she says:
First paper I have seen that uses an emoji in its title.
I don’t think there’s any necessary contradiction. Verification or prediction of what? More data. What data? Data. You seem to think there’s some sort of special reality-fluid which JPEGs or MP3s have but .txt files do not, but they don’t; they all share the Buddha-nature.
Consider Bender’s octopus example, where she says that it can’t learn to do anything from watching messages go back and forth. This is obviously false, because we do this all the time; for example, you can teach a LLM to play good chess simply by watching a lot of moves fly by back and forth as people play postal chess. Imitation learning & offline RL are important use-cases of RL and no one would claim it doesn’t work or is impossible in principle.
Can you make predictions and statements which can be verified by watching postal chess games? Of course. Just predict what the next move will be. “I think he will castle, instead of moving the knight.” [later] “Oh no, I was wrong! I anticipated seeing a castling move, and I did not, I saw something else. My beliefs about castling did not pay rent and were not verified by subsequent observations of this game. I will update my priors and do better next time.”
Well, in the chess example we do not have any obvious map/territory relation. Chess seems to be a purely formal game, as the pieces do not seem to refer to anything in the external world. So it’s much less obvious that training on form alone would also work for learning natural language, which does exhibit a map territory distinction.
For example, a few years ago, most people would have regarded it as highly unlikely that you could understand (decode) an intercepted alien message without any contextual information. But if you can understand text from form alone, as LLMs seem to prove, the message simply has to be long enough. Then you can train an LLM on it, which would then be able to understand the message. And it would also be able to translate it into English if it is additionally trained on English text.
That’s very counterintuitive, or at least it was counterintuitive until recently. I doubt EY meant to count raw words as “anticipated experience”, since “experience” typically refers to sensory data only. (In fact, I think Guessing the Teacher’s Password also suggests that he didn’t.)
To repeat, I don’t blame him, as the proposition that large amounts of raw text can replace sensory data, that a sufficient amount of symbols can ground themselves, was broadly considered unlikely until LLMs came along. But I do blame Bender insofar as she didn’t update even in light of strong evidence that the classical hypothesis (you can’t infer meaning from form alone) was wrong.
Yes, there is. The transcripts are of 10 million games that real humans played to cover the distribution of real games, and then were annotated by Stockfish, to provide superhuman-quality metadata on good vs bad moves. That is the territory. The map is the set of transcripts.
I would say ‘diverse enough’, not ‘long enough’. (An encyclopedia will teach a LLM many things; a dictionary the same length, probably not.) Similar to meta-learning vs learning.
What external world does our ‘external world’ itself refer to things inside of? If the ‘external world’ doesn’t need its own external world for grounding, then why does lots of text about the external world not suffice? (And if it does, what grounds that external external world, or where does the regress end?) As I like to put it, for an LLM, ‘reality’ is just the largest fictional setting—the one that encompasses all the other fictional settings it reads about from time to time.
As someone who doubtless does quite a lot of reading about things or writing to people you have never seen nor met in real life and have no ‘sensory’ way of knowing that they exist, this is a position you should find sympathetic.
Sympathy or not, the position that meaning of natural language can be inferred from the symbolic form alone wasn’t obvious to me in the past, as this is certainly not how humans learn language, and I don’t know any evidence that someone else thought this plausible before machine learning made it evident. It’s always easy to make something sound obvious after the fact, but that doesn’t mean that it actually was obvious to anyone at the time.
Plenty of linguists and connectionists thought it was possible, if only to show those damned Chomskyans that they were wrong!
To be specific, some of the radical linguists believed in pure distributional semantics, or that there is no semantics beyond syntax. I don’t know anyone in particular, but considering how often Chomsky, Pinker, etc were fighting against the “blank slate” theory, they definitely existed.
The following people likely believed that it is possible to learn a language purely from reading using a general learning architecture like neural networks (blank-slate):
James L. McClelland and David Rumelhart.
They were the main proponents of neural networks in the “past tense debate”. Generally, anyone on the side of neural networks in the past tense debate probably believed this.
B. F. Skinner.
Radical syntacticians? Linguists have failed to settle the question of “Just what is semantics? How is it different from syntax?”, and some linguists have taken the radical position “There is no semantics. Everything is syntax.”. Once that is done, there simply is no difficulty: just learn all the syntax, and there is nothing left to learn.
Possibly some of the participants in the “linguistics wars” believed in it. Specifically, some believed in “generative semantics”, whereby semantics is simply yet more generative grammar, and thus not any different from syntax (also generative grammar). Chomsky, as you might imagine, hated that, and successfully beat it down.
Maybe some people in distributional semantics? Perhaps Leonard Bloomfield? I don’t know enough about the history of linguistics to tell what Bloomfield or the “Bloomfieldians” believed in exactly. However, considering that Chomsky was strongly anti-Bloomfield, it is a fair bet that some Bloomfieldians (or self-styled “neo-Bloomfieldians”) would support blank-slate learning of language, if only to show Chomskyans that they’re wrong.
FYI your ‘octopus paper’ link is to Stochastic Parrots; it should be this link.
I’ve seen other quotes from Bender & relevant coauthors that suggest they haven’t really updated, which I find fascinating. I’d love to have the opportunity to talk with them about it and understand better how their views have remained consistent despite the evidence that’s emerged since the papers were published.
It makes a very intuitively compelling argument! I think that, as with many confusions about the Chinese Room, the problem is that our intuitions fail at the relevant scale. Given an Internet’s worth of discussion of bears and sticks and weapons, the hyper-intelligent octopus’s model of those things is rich enough for the octopus to provide advice about them that would work in the real world, even if it perhaps couldn’t recognize a bear by sight. For example it would know that sticks have a certain distribution of mass, and are the sorts of things that could be bound together by rope (which it knows is available because of the coconut catapult), and that the combined sticks might have enough mass to serve as a weapon, and what amounts of force would be harmful to a bear, etc. But it’s very hard to understand just how rich those models can be when our intuitions are primed by a description of two people casually exchanging messages.
Perhaps relevant, she famously doesn’t like the arXiv, so maybe on principle she’s disregarding all evidence not from “real publications.”
Dean Ball’s Out of Thin Air: A proposal for the grid is the most “big if true” thing I’ve read on AI infrastructure recently:
And a convenient win-win:
I do wonder how believable this is, given my personal experience that the more familiar I am with a policy claim’s quantitative modelling and the more domain knowledge I have about it, the less I trust it, generally speaking...
I came across a similar “hack” on LinkedIn from Tom Steyer:
“California just pulled off the nation’s largest-ever test of a virtual power plant.
This successful test proved VPPs are a fast, low-cost, zero-emissions way to make better use of the clean energy we already have — and to keep the lights on as demand surges from data centers, heat waves, and electrification.”
Basically, they are talking about allowing residential batteries to supply the grid during peak demand. I tend to be skeptical about stuff like this because in my own domain, water, there’s a lot of pop science and bold claims that ignore scaling and logistics. I asked a smart fellow in that industry about it
https://substack.com/@energycrystals
and I thought he gave a good answer that aligns with my experience with water, which is it always come down to implementation:
“The challenge is lining up customer telemetry to incentive structures that matter. With standard demand response products (which some people sell as VPPs), the incentives given to customers don’t pay for the battery, and the software admin and API costs to the utilities outweigh the cost savings of a VPP. VPPs are vaporware until someone can make the business model pencil and the API integration and UX not suck ass.”
So, without knowing more, my prior is that this free capacity is there for a reason, and that utilities aren’t that dumb. On the flip side, I think it’s great that we are thinking this way. Probing our systems and looking for efficiencies is worthwhile: our legacy infrastructure is a mess of path-dependent bureaucracy, and I’m certain there are gains to be made in addition to new construction.
Whether this is feasible depends on how concentrated that 0.25% of the year is (expected to be), because that determines the size of the battery that you’d need to cover the blackout period (which I think would be unacceptable for a lot of AI customers).
If it happens over just a few days then this makes sense; buying 22GWh of batteries for a 1GW datacenter is still extremely expensive (~$2B for a 20h system at $100/kWh plus installation, maybe too expensive as a reliability measure for a 1GW datacenter, assuming maybe $10B revenue from the datacenter??). If it’s much less concentrated in time then a smaller battery is needed (~$100M for a 1h system at $100/kWh), and I expect AI scalers would happily pay this for the reliability of their systems if the revenue from those datacenters is anywhere near that scale.
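A quick sanity check of those figures (a minimal sketch; the only inputs are the ~$100/kWh cell cost and the durations from the comment above, with installation and power electronics excluded):

```python
# Sanity-checking the battery figures above, assuming $100/kWh for cells alone
# (installation excluded), as in the parent comment.
def battery_cost_usd(power_gw: float, hours: float, usd_per_kwh: float = 100) -> float:
    kwh = power_gw * 1e6 * hours      # 1 GW = 1e6 kW, sustained for `hours`
    return kwh * usd_per_kwh

print(f"20 h of backup for 1 GW: ${battery_cost_usd(1, 20):,.0f}")   # ~ $2.0B
print(f" 1 h of backup for 1 GW: ${battery_cost_usd(1, 1):,.0f}")    # ~ $100M
```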
Demand response can be done with or without covering the data center’s load from batteries, and the two can stack: if the grid is really stressed, a data center can both turn off and discharge its battery into the grid.
Economically, it makes sense to accept some true downtime to avoid months-long delays in data center construction. This is clearly true for training workloads which are very important but don’t have live demand. But downtime for even inference clusters is acceptable: you can reduce the compute demand by temporarily slowing down token generation, or use dynamic rate limits. And any curtailment would almost certainly be isolated to one region, so inference data centers in other places would still be operational.
In any case, the paper says the curtailments would last about two hours each:
I’ve heard a rule of thumb that if you can avoid buying power off the California power grid’s spot pricing in the 1% most expensive times, you can roughly get a 50% discount on power.
Interesting anecdote on “von Neumann’s onion” and his general style, from P. R. Halmos’ The Legend of John von Neumann:
(tangent: I’m a bit peeved by Halmos’ “lesser men” throwaway remark, mainly because I think interpretive research labor and distillation is very valuable, very hard to do well, somewhat orthogonal to vN-style competence, and very underappreciated and undersupplied.)
von Neumann was also courageous, Halmos wrote, in the following way:
Terry Tao is similar, according to Allen Knutson:
von Neumann also had endless capacity for work. Halmos:
I thought this was striking: why waste time on such seeming trivialities? But I guess if you’re John von Neumann you just have such a glut of brain cycles that you can spend it in ridiculously poorly-optimised ways like this instead of needing to 80/20, and still get your many, many jobs done.
I have this experience with @ryan_greenblatt—he’s got an incredible ability to keep really large and complicated argument trees in his head, so he feels much less need to come up with slightly-lossy abstractions and categorizations than e.g. I do. This is part of why his work often feels like huge, mostly unstructured lists. (The lists are more unstructured before his pre-release commenters beg him to structure them more.) (His code often also looks confusing to me, for similar reasons.)
While Dyson’s birds and frogs archetypes of mathematicians is oft-mentioned, David Mumford’s tribes of mathematicians is underappreciated, and I find myself pointing to it often in discussions that devolve into “my preferred kind of math research is better than yours”-type aesthetic arguments:
Mumford’s examples of each, both results and mathematicians:
Explorers:
Theaetetus (ancient Greek list of the five Platonic solids)
Ludwig Schläfli (extended the Greek list to regular polytopes in n dimensions)
Bill Thurston (“I never met anyone with anything close to his skill in visualization”)
the list of finite simple groups
Michael Artin (discovered non-commutative rings “lying in the middle ground between the almost commutative area and the truly huge free rings”)
Set theorists (“exploring that most peculiar, almost theological world of ‘higher infinities’”)
Mappers:
Mumford himself
arguably, the earliest mathematicians (the story told by cuneiform surveying tablets)
the Mandelbrot set
Ramanujan’s “integer expressible two ways as a sum of two cubes”
the Concinnitas project of Bob Feldman and Dan Rockmore of ten aquatints
Alchemists:
Abraham De Moivre
Oscar Zariski, Mumford’s PhD advisor (“his deepest work was showing how the tools of commutative algebra, that had been developed by straight algebraists, had major geometric meaning and could be used to solve some of the most vexing issues of the Italian school of algebraic geometry”)
the Riemann-Roch theorem (“it was from the beginning a link between complex analysis and the geometry of algebraic curves. It was extended by pure algebra to characteristic p, then generalized to higher dimensions by Fritz Hirzebruch using the latest tools of algebraic topology. Then Michael Atiyah and Isadore Singer linked it to general systems of elliptic partial differential equations, thus connecting analysis, topology and geometry at one fell swoop”)
Wrestlers:
Archimedes (“he loved estimating π and concocting gigantic numbers”)
Calculus (“stems from the work of Newton and Leibniz and in Leibniz’s approach depends on distinguishing the size of infinitesimals from the size of their squares which are infinitely smaller”)
Euler’s strange infinite series formulas
Stirling’s formula for the approximate size of n!
Augustin-Louis Cauchy (“his eponymous inequality remains the single most important inequality in math”)
Sergei Sobolev
Shing-Tung Yau
Detectives:
Andrew Wiles is probably the archetypal example
Roger Penrose (“My own way of thinking is to ponder long and, I hope, deeply on problems and for a long time … and I never really let them go.”)
Strip Miners:
Alexander Grothendieck (“the greatest contemporary practitioner of this philosophy in the 20th century… Of all the mathematicians that I have met, he was the one whom I would unreservedly call a “genius”. … He considered that the real work in solving a mathematical problem was to find le niveau juste in which one finds the right statement of the problem at its proper level of generality. And indeed, his radical abstractions of schemes, functors, K-groups, etc. proved their worth by solving a raft of old problems and transforming the whole face of algebraic geometry”)
Leonhard Euler from Switzerland and Carl Friedrich Gauss (“both showed how two dimensional geometry lay behind the algebra of complex numbers”)
Eudoxus and his spiritual successor Archimedes (“the level they reached was essentially that of a rigorous theory of real numbers with which they are able to calculate many specific integrals. Book V in Euclid’s Elements and Archimedes’ The Method of Mechanical Theorems testify to how deeply they dug”)
Aryabhata
Some miscellaneous humorous quotes:
Scott Alexander’s Mistakes, Dan Luu’s Major errors on this blog (and their corrections), Gwern’s My Mistakes (last updated 11 years ago), and Nintil’s Mistakes (h/t @Rasool) are the only dedicated, centralized pages I know of where online writers catalog their own errors, which I admire. Probably not coincidentally, these writers are also among the thinkers I respect the most for repeatedly empirically grounding their reasoning. Some orgs do this too, like 80K’s Our mistakes, CEA’s Mistakes we’ve made, and GiveWell’s Our mistakes.
While I prefer dedicated centralized pages like those to one-off writeups for long content benefit reasons, one-off definitely beats none (myself included). In that regard I appreciate essays like Holden Karnofsky’s Some Key Ways in Which I’ve Changed My Mind Over the Last Several Years (2016), Denise Melchin’s My mistakes on the path to impact (2020), Zach Groff’s Things I’ve Changed My Mind on This Year (2017), Michael Dickens’ things I’ve changed my mind on, and this 2013 LW repository for “major, life-altering mistakes that you or others have made”, as well as by orgs like HLI’s Learning from our mistakes.
In this vein I’m also sad to see mistakes pages get removed, e.g. ACE used to have a Mistakes page (archived link) but now no longer do.
I’m not convinced Scott Alexander’s mistakes page accurately tracks his mistakes. E.g. the mistake on it I know the most about is this one:
But that’s basically wrong. The study found women’s arousal to chimps having sex to be very close to their arousal to nonsexual stimuli, and far below their arousal to sexual stimuli.
Thanks, good example.
I don’t have a mistakes page but last year I wrote a one-off post of things I’ve changed my mind on.
Thanks Michael. On another note, I’ve recommended some of your essays to others, so thanks for writing them as well.
I’m glad to hear that! I often don’t hear much response to my essays so it’s good to know you’ve read some of them :)
You’re welcome :) in particular, your 2015 cause selection essay was I thought a particularly high-quality writeup of the end-to-end process from personal values to actual donation choice and (I appreciated this) where you were most likely to change your mind, so I recommended it to a few folks as well as used it as a template myself back in the day.
In general I think theory-practice gap bridging via writeups like those are undersupplied, especially the end-to-end ones — more writeups bridge parts of the “pipeline”, but “full pipeline integration” done well is rare and underappreciated, which combined with how effortful it is to do it makes me not surprised there isn’t more of it.
Another good blog:
https://nintil.com/mistakes
Thanks! Added to the list.
I really liked this extended passage on math circles from John Psmith’s REVIEW: Math from Three to Seven, by Alexander Zvonkin, it made me wish math circles existed in my home country when I was younger:
You can start math circles really really young:
(Sadly I only learned of the existence of math circles well after graduation, a few years ago when I used to spend more time on Quora and noticed that Alon Amit, the most respected writer on math topics and someone who’d done many interesting things in his life, described himself simply as a “mathcircler”.)
Really appreciate the random excerpts you post regarding math philosophy/reasoning/etc. They’re consistently interesting and insightful.
I feel like we perhaps need to reach some “escape velocity” to get something like that going, but for ~rationality / deliberately figuring out how to think and act better.
Peter Watts is working with Neill Blomkamp to adapt his novel Blindsight into an 8-10-episode series:
When I first read Blindsight over a decade ago it blew my brains clean out of my skull. I’m cautiously optimistic about the upcoming series, we’ll see…
Blindsight was very well written but based on a premise that I think is importantly and dangerously wrong. That premise is that consciousness (in the sense of cognitive self-awareness) is not important for complex cognition.
This is the opposite of true, and a failure to recognize this is why people are predicting fantastic tool AI that doesn’t become self-aware and goal-directed.
The proof won’t fit in the margin unfortunately. To just gesture in that direction: it is possible to do complex general cognition without being able to think about one’s self and one’s cognition. It is much easier to do complex general cognition if the system is able to think about itself and its own thoughts.
Just checking if I understood your argument: is the general point that an algorithm that can think about literally everything is simpler and therefore easier to make or evolve than an algorithm that can think about literally everything except for itself and how other agents perceive it?
Exactly.
I’d go a bit farther and say it’s easier to develop an algorithm that can think about literally everything than one that can think about roughly half of things. That’s because the easiest general intelligence algorithms are about learning and reasoning, which apply to everything.
I don’t see where you get that. I saw no suggestion that the aliens (or vampires) in Blindsight were unaware of their own existence, or that they couldn’t think about their own interactions with the world. They didn’t lack any cognitive capacities at all. They just had no qualia, and therefore didn’t see the point of doing anything just for the experience.
There’s a gigantic difference between cognitive self-awareness and conscious experience.
I believe the Scramblers from Blindsight weren’t self-aware, which means they couldn’t think about their own interactions with the world.
As I recall, the crew was giving one of the Scramblers a series of cognitive tests. It aced all the tests that had to do with numbers and spatial reasoning, but failed a test that required the testee to be self-aware.
I guess it depends on how it’s described in context. And I have to admit it’s been a long time. I’d go reread it to see, but I don’t think I can handle any more bleakness right now...
Thanks, is there anything you can point me to for further reading, whether by you or others?
I can see this making sense in one frame, but not in another. The frame which seems most strongly to support the ‘Blindsight’ idea is Friston’s stuff—specifically how the more successful we are at minimizing predictive error, the less conscious we are.[1]
My general intuition, in this frame, is that as intelligence increases more behaviour becomes automatic/subconscious. It seems compatible with your view that a superintelligent system would possess consciousness, but that most/all of its interactions with us would be subconscious.
Would like to hear more about this point, could update my views significantly. Happy for you to just state ‘this because that, read X, Y, Z etc’ without further elaboration—I’m not asking you to defend your position, so much as I’m looking for more to read on it.
This is my potentially garbled synthesis of his stuff, anyway.
I’m not sure about Friston’s stuff to be honest.
But Watts lists a whole bunch of papers in support of the blindsight idea, contra Seth’s claim — to quote Watts:
“In fact, the nonconscious mind usually works so well on its own that it actually employs a gatekeeper in the anterior cingulate cortex to do nothing but prevent the conscious self from interfering in daily operations”
footnotes: Matsumoto, K., and K. Tanaka. 2004. Conflict and Cognitive Control. Science 303: 969-970; 113 Kerns, J.G., et al. 2004. Anterior Cingulate Conflict Monitoring and Adjustments in Control. Science 303: 1023-1026; 114 Petersen, S.E. et al. 1998. The effects of practice on the functional anatomy of task performance. Proceedings of the National Academy of Sciences 95: 853-860
“Compared to nonconscious processing, self-awareness is slow and expensive”
footnote: Matsumoto and Tanaka above
“The cost of high intelligence has even been demonstrated by experiments in which smart fruit flies lose out to dumb ones when competing for food”
footnote: Proceedings of the Royal Society of London B (DOI 10.1098/rspb.2003.2548)
“By way of comparison, consider the complex, lightning-fast calculations of savantes; those abilities are noncognitive, and there is evidence that they owe their superfunctionality not to any overarching integration of mental processes but due to relative neurological fragmentation”
footnotes: Treffert, D.A., and G.L. Wallace. 2004. Islands of genius. Scientific American 14: 14-23; Anonymous., 2004. Autism: making the connection. The Economist, 372(8387): 66
“Even if sentient and nonsentient processes were equally efficient, the conscious awareness of visceral stimuli—by its very nature— distracts the individual from other threats and opportunities in its environment”
footnote: Wegner, D.M. 1994. Ironic processes of mental control. Psychol. Rev. 101: 34-52
“Chimpanzees have a higher brain-to-body ratio than orangutans, yet orangs consistently recognise themselves in mirrors while chimps do so only half the time”
footnotes: Aiello, L., and C. Dean. 1990. An introduction to human evolutionary anatomy. Academic Press, London; 123 Gallup, G.G. (Jr.). 1997. On the rise and fall of self-conception in primates. In The Self Across Psychology— self-recognition, self-awareness, and the Self Concept. Annals of the NY Acad. Sci. 818:4-17
“it turns out that the unconscious mind is better at making complex decisions than is the conscious mind”
footnote: Dijksterhuis, A., et al. 2006. Science 311:1005-1007
(I’m also reminded of DFW’s How Tracy Austin Broke My Heart.)
To be clear I’m not arguing that “look at all these sources, it must be true!” (we know that kind of argument doesn’t work). I’m hoping for somewhat more object-level counterarguments is all, or perhaps a better reason to dismiss them as being misguided (or to dismiss the picture Watts paints using them) than what Seth gestured at. I’m guessing he meant “complex general cognition” to point to something other than pure raw problem-solving performance.
What fraction of economically-valuable cognitive labor is already being automated today? How has that changed over time, especially recently?
I notice I’m confused about these ostensibly extremely basic questions, which arose in reading Open Phil’s old CCF-takeoff report, whose main metric is “time from AI that could readily[2] automate 20% of cognitive tasks to AI that could readily automate 100% of cognitive tasks”. A cursory search of Epoch’s data, Metaculus, and this forum didn’t turn up anything, but I didn’t spend much time at all doing so.
I was originally motivated by wanting to empirically understand recursive AI self-improvement better, which led to me stumbling upon the CAIS paper Examples of AI Improving AI, but I don’t have any sense whatsoever of how the paper’s 39 examples as of Oct-2023 translate to OP’s main metric even after constraining “cognitive tasks” in its operational definition to just AI R&D.
I did find this 2018 survey of expert opinion
which would suggest that OP’s clock should’ve started ticking in 2018, so that incorporating CCF-takeoff author Tom Davidson’s “~50% to a <3 year takeoff and ~80% to <10 year i.e. time from 20%-AI to 100%-AI, for cognitive tasks in the global economy” means takeoff should’ve already occurred… so I’m dismissing this survey’s relevance to my question (sorry).
Did e.g. a telephone operator in 1910 perform cognitive labor, by the definition we want to use here?
I’m mainly wondering how Open Phil, and really anyone who uses fraction of economically-valuable cognitive labor automated / automatable (e.g. the respondents to that 2018 survey; some folks on the forum) as a useful proxy for thinking about takeoff, tracks this proxy as a way to empirically ground their takeoff-related reasoning. If you’re one of them, I’m curious if you’d answer your own question in the affirmative?
I am not one of them—I was wondering the same thing, and was hoping you had a good answer.
If I was trying to answer this question, I would probably try to figure out what fraction of all economically-valuable labor each year was cognitive, the breakdown of which tasks comprise that labor, and the year-on-year productivity increases on those task, then use that to compute the percentage of economically-valuable labor that is being automated that year.
Concretely, to get a number for the US in 1900 I might use a weighted average of productivity increases across cognitive tasks in 1900, in an approach similar to how CPI is computed
Look at the occupations listed in the 1900 census records
Figure out which ones are common, and then sample some common ones and make wild guesses about what those jobs looked like in 1900
Classify those tasks as cognitive or non-cognitive
Come to estimate that record-keeping tasks are around a quarter to a half of all cognitive labor
Notice that typewriters were starting to become more popular - about 100,000 typewriters sold per year
Note that those 100k typewriters were going to the people who would save the most time by using them
As such, estimate 1-2% productivity growth in record-keeping tasks in 1900
Multiply the productivity growth for record-keeping tasks by the fraction of time spent on them (technically the automated fraction is 1 - 1/(productivity multiplier) rather than the growth rate itself, but when the productivity increase is small it’s not a major factor)
Estimate that 0.5% of cognitive labor was automated by specifically typewriters in 1900
Figure that’s about half of all cognitive labor automation in 1900
and thus I would estimate ~1% of all cognitive labor was automated in 1900. By the same methodology I would probably estimate closer to 5% for 2024.
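Roughly reproducing that back-of-envelope arithmetic (every input below is one of the wild guesses from the list above, not real data):

```python
# Back-of-envelope reproduction of the 1900 estimate above; all inputs are the
# wild guesses from the list, not real data.
record_keeping_share = 0.375    # guess: record-keeping is 1/4 to 1/2 of cognitive labor; take the middle
productivity_growth = 0.015     # guess: typewriters gave 1-2% productivity growth in record-keeping

# fraction of record-keeping labor automated that year = 1 - 1/(1 + growth), ~= growth when small
automated_within_task = 1 - 1 / (1 + productivity_growth)
typewriter_share = record_keeping_share * automated_within_task   # ~0.5% of all cognitive labor
total_1900 = typewriter_share * 2                                 # guess: typewriters were ~half of all automation

print(f"typewriters alone: {typewriter_share:.2%}; all cognitive-labor automation in 1900: {total_1900:.2%}")
```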
Again, though, I am not associated with Open Phil and am not sure if they think about cognitive task automation in the same way.
I chose to study physics in undergrad because I wanted to “understand the universe” and naively thought string theory was the logically correct endpoint of this pursuit, and was only saved from that fate by not being smart enough to get into a good grad school. Since then I’ve come to conclude that string theory is probably a dead end, albeit an astonishingly alluring one for a particular type of person. In that regard I find anecdotes like the following by Ron Maimon on Physics SE interesting — the reason string theorists believe isn’t the same as what they tell people, so it’s better to ask for their conversion stories:
The rest of Ron’s answer elaborates on his own conversion story. The interesting part to me is that Ron began by trying to “kill string theory”, and in fact he was very happy that he was going to do so, but then was annoyed by an argument of his colleague that mathematically worked, and in the year or two he spent puzzling over why it worked he had an epiphany that convinced him string theory was correct, which sounds like nonsense to the uninitiated. (This phenomenon where people who gain understanding of the thing become incomprehensible to others sounds a lot like the discussions on LW on enlightenment by the way.)
In pure math, mathematicians seek “morality”, which sounds similar to Ron’s string theory conversion stories above. Eugenia Cheng’s Mathematics, morally argues:
That last part is quite reminiscent of what the late Bill Thurston argued in his classic On proof and progress in mathematics:
Thurston’s personal reflections below on the sociology of proof exemplify the search for mathematical morality instead of fully formally rigorous correctness. I remember being disquieted upon first reading “There were published theorems that were generally known to be false” a long time ago:
The more you know about particle physics and quantum field theory, the more inevitable string theory seems. There are just too many connections. However, identifying the specific form of string theory that corresponds to our universe is more of a challenge, and not just because of the fabled 10^500 vacua (though it could be one of those). We don’t actually know either all the possible forms of string theory, or the right way to think about the physics that we can see. The LHC, with its “unnaturally” light Higgs boson, already mortally wounded a particular paradigm for particle physics (naturalness) which in turn was guiding string phenomenology (i.e. the part of string theory that tries to be empirically relevant). So along with the numerical problem of being able to calculate the properties of a given string vacuum, the conceptual side of string theory and string phenomenology is still wide open for discovery.
I asked a well-known string theorist about the fabled 10^500 vacua and whether he worried that this would make string theory a vacuous theory, since a theory that fits anything fits nothing. He replied: “no, no, the 10^500 ‘swampland’ is a great achievement of string theory—you see… all other theories have infinitely many adjustable parameters”. He was saying string theory was about ~1500 bits away from the theory of everything but infinitely ahead of its competitors.
Diabolical.
Much ink has been spilled on the scientific merits and demerits of string theory and its competitors. The educated reader will recognize that all this and more is, of course, once again solved by UDASSA.
Re other theories: I don’t think that all other theories in existence have infinitely many adjustable parameters. If he’s referring to the fact that lots of theories have adjustable parameters that can range over the real numbers, which are infinitely complicated in general, then that’s different, and string theory may have this issue as well.
Re string theory’s issue of being vacuous: I think the core thing string theory predicts that other quantum gravity models don’t is that at the large scale you recover general relativity and the Standard Model. No other theory can yet figure out a way to properly include both the empirical effects of gravity and of quantum mechanics in the parameter regimes where they are known to work, so string theory predicts more simply by making the predictions quantum mechanics already makes while being able to include gravity without ruining them, whereas other models of quantum gravity tend to ruin empirical predictions (like general relativity approximately holding) pretty fast.
I used to consider it a mystery that math was so unreasonably effective in the natural sciences, but changed my mind after reading this essay by Eric S. Raymond (who’s here on the forum, hi and thanks Eric), in particular this part, which is as good a question dissolution as any I’ve seen:
(it’s a shame this chart isn’t rendering properly for some reason, since without it the rest of Eric’s quote is ~incomprehensible)
I also think I was intuition-pumped to buy Eric’s argument by Julie Moronuki’s beautiful meandering essay The Unreasonable Effectiveness of Metaphor.
Interesting. This reminds me of a related thought I had: Why do models with differential equations work so often in physics but so rarely in other empirical sciences? Perhaps physics simply is “the differential equation science”.
Which is also related to the frequently expressed opinion that philosophy makes little progress because everything that gets developed enough to make significant progress splits off from philosophy. Because philosophy is “the study of ill-defined and intractable problems”.
Not saying that I think these views are accurate, though they do have some plausibility.
(To be honest, to first approximation my guess mirrors yours.)
Flagging that those two examples seem false. The weather is chaotic, yes, and there’s a sense in which the economy is anti-inductive, but modeling methods are advancing, and will likely find more loopholes in chaos theory.
For example, in thermodynamics, temperature is non-chaotic while the precise kinetic energies and locations of all particles are. A reasonable candidate similarity in weather are hurricanes.
Similarly as our understanding of the economy advances it will get more efficient which means it will be easier to model. eg (note: I’ve only skimmed this paper). And definitely large economies are even more predictable than small villages, talk about not having a competitive market!
Thanks for the pointer to that paper, the abstract makes me think there’s a sort of slow-acting self-reinforcing feedback loop between predictive error minimisation via improving modelling and via improving the economy itself.
re: weather, I’m thinking of the chart below showing how little gain we get in MAE vs compute, plus my guess that compute can’t keep growing far enough to get MAE < 3 °F a year out (say). I don’t know anything about advancements in weather modelling methods though; maybe effective compute (incorporating modelling advancements) may grow indefinitely in terms of the chart.
I didn’t say anything about temperature prediction, and I’d also like to see any other method (intuition based or otherwise) do better than the current best mathematical models here. It seems unlikely to me that the trends in that graph will continue arbitrarily far.
Yeah, that was my claim.
Matt Leifer, who works in quantum foundations, espouses a view that’s probably more extreme than Eric Raymond’s above to argue why the effectiveness of math in the natural sciences isn’t just reasonable but expected-by-construction. In his 2015 FQXi essay Mathematics is Physics Matt argued that
(Matt notes as an aside that he’s arguing for precisely the opposite of Tegmark’s MUH.)
Why “scale-free network”?
As an aside, Matt’s theory of theory-building explains (so he claims) what mathematical intuition is about: “intuition for efficient knowledge structure, rather than intuition about an abstract mathematical world”.
So what? How does this view pay rent?
Matt further develops the argument that the structure of human knowledge being networked-not-hierarchical implies that the idea that there is a most fundamental discipline, or level of reality, is mistaken in Against Fundamentalism, another FQXi essay published in 2018.
I would also comment that, if the environment was so chaotic that roughly everything important to life could not be modeled—if general-purpose modeling ability was basically useless—then life would not have evolved that ability, and “intelligent life” probably wouldn’t exist.
The two concepts that I thought were missing from Eliezer’s technical explanation of technical explanation that would have simplified some of the explanation were compression and degrees of freedom. Degrees of freedom seems very relevant here in terms of how we map between different representations. Why are representations so important for humans? Because they have different computational properties/traversal costs while humans are very computationally limited.
Can you say more about what you mean? Your comment reminded me of Thomas Griffiths’ paper Understanding Human Intelligence through Human Limitations, but you may have meant something else entirely.
Griffiths argued that the aspects we associate with human intelligence – rapid learning from small data, the ability to break down problems into parts, and the capacity for cumulative cultural evolution – arose from the 3 fundamental limitations all humans share: limited time, limited computation, and limited communication. (The constraints imposed by these characteristics cascade: limited time magnifies the effect of limited computation, and limited communication makes it harder to draw upon more computation.) In particular, limited computation leads to problem decomposition, hence modular solutions; relieving the computation constraint enables solutions that can be objectively better along some axis while also being incomprehensible to humans.
Thanks for the link. I mean that predictions are outputs of a process that includes a representation, so part of what’s getting passed back and forth in the diagram are better and worse fit representations. The degrees of freedom point is that we choose very flexible representations, whittle them down with the actual data available, then get surprised that that representation yields other good predictions. But we should expect this if Nature shares any modular structure with our perception at all, which it would if there was both structural reasons (literally same substrate) and evolutionary pressure for representations with good computational properties i.e. simple isomorphisms and compressions.
Neat example of mundane LLM utility: Automation of Systematic Reviews with Large Language Models
Pretty cool since “SRs are incredibly resource-intensive, typically taking over 16 months and costing upwards of $100,000 to complete”. They used GPT-4.1 for screening articles and o3-mini-high for data extraction.
otto-SR seems much better than Elicit in particular, which is notable to me for being the gold standard DR tool according to Sarah Constantin’s review.
This remark at 16:10 by Dwarkesh Patel on his most recent podcast interview AMA: Career Advice Given AGI, How I Research ft. Sholto & Trenton was pretty funny:
(After Sarah Paine comes geneticist David Reich, then Satya Nadella and Mark Zuckerberg, “then [Sholto & Trenton] or Leopold (Aschenbrenner) or something, then you get to the lab CEOs or something”)
You can see it as an example of ‘alpha’ vs ‘beta’. When someone asks me about the value of someone as a guest, I tend to ask: “do they have anything new to say? didn’t they just do a big interview last year?” and if they don’t but they’re big, “can you ask them good questions that get them out of their ‘book’?” Big guests are not necessarily as valuable as they may seem because they are highly-exposed, which means both that (1) they have probably already said everything they will say and there is no ‘news’ or novelty, and (2) they are message-disciplined and careful to “talk their book”. (In this analogy, “alpha” represents undiscovered or neglected interview topics which can be extracted mostly just by finding them and then asking the obvious question, usually by interviewing new people; “beta” represents doing standard interview topics/people, but much more so—harder, faster, better—and getting new stuff that way.)
Lex Fridman podcasts are an example of this: he often hosts very big guests like Mark Zuckerberg, but nevertheless, I will sit down and skim through the transcript of 2-4 hours of content, and find nothing even worth excerpting for my notes. Fridman notoriously does no research and asks softball questions, and invites the biggest names he can get regardless of overexposure, and so if you do that, you will get nothing new. He has found no alpha, and he doesn’t interview hard enough to extract beta. So he’s sort of the high-expense ratio index fund of podcast interviews.
Sarah Paine, on the other hand, seems to have been completely unknown and full of juicy nuggets, and is like winning the lottery: you can make a career off a really good trade like Paine before it gets crowded. However, if another successful podcaster has her on, they will probably not discover Paine is their most popular or growth-productive guest ever. The well is dry. Paine may have more to say someday, but that day is probably closer to “5 years from today” than “tomorrow”.
(So a good interviewer adopts an optimal foraging mindset: once you have harvested a patch of its delicious food, you have to move on to another patch, which hasn’t been exhausted yet, and let the original patch slowly recover.)
So a great guest for Dwarkesh’s blog would be, say Hans Moravec or Paul J. Werbos: Moravec hasn’t done anything publicly in at least a decade, and is fallow; while Werbos has been more active and in the public eye, but still not much and is such a weird guy that just about any questions will be interesting. Reich was also a good guest because while Reich is very ‘public’ in some senses (he’s written popularizing books, even), he is still obscure, almost none of what he has published is well-known, and he is involved in so much fast-paced research that even the book is now substantially obsolete and he has a lot of new stuff to say. (And Reich will have more stuff to say if revisited in, say, 2 years for an update, so a harvester will be making a note to revisit him if the current crop of interview candidates in the pipeline is looking marginal.) A difficult or mediocre guest would be Tony Blair: he can surely say many interesting things about the current geopolitical context and his work since being PM… but he is a super-experienced career politician who has survived countless Question Times, and may eat you for breakfast and exploit you for ulterior purposes (rather than vice-versa). Similarly, Mark Zuckerberg and Satya Nadella are tough nuts: there’s meat there, but are you willing enough to bring down the hammer or will you settle for a mediocre result that mostly just fills space and is not a must-watch? A bad guest might be someone controlling and extremely PR-savvy like MrBeast; this is the sort of guy who will give you a bad interview pushing his ‘book’ shamelessly, and then might wind up spiking the interview anyway if he felt he wasn’t getting enough out of it, and just drops it as a sunk cost (though it was weeks of work on your part and blows a hole in your schedule—that’s not his problem).
I like the optimal forager take, seems intuitively correct. I’d add that Dwarkesh struck gold by getting you on his podcast too. (Tangentially: this grand theory of intelligence video snippet reminds me of a page-ish-long writeup on that I stumbled upon deep in the bowels of https://gwern.net/ which I’ve annoyingly never been able to find again.)
Also thanks for the pointer to Werbos, his website Welcome to the Werbos World! funnily enough struck me as crackpot-y and I wouldn’t have guessed just from the landing page that he’s the discoverer of backprop, respected former program director at the NSF, etc.
Probably https://gwern.net/newsletter/2021/05#master-synthesis
That’s what makes it alpha! If he was as legible as, say, Hinton, he would be mined out by now, and nothing but beta. (Similar situation to Schmidhuber - ‘obvious crackpot’ - although he’s such a self-promoter that he overcomes it, and so at this point there’s no alpha talking to him; the stuff that would be interesting, like his relationship to certain wealthy Italians, or to King Bonesaws, or how he’s managed to torpedo his career so spectacularly, he will not talk about. Also, I understand he likes to charge people for the privilege of talking to him.) You have to have both domain knowledge and intellectual courage to know about Werbos and eg. read his old interviews and be willing to go out on a limb and interview him.
This seems to underrate the value of distribution. I suspect another factor to take into account is the degree of audience overlap. Like there’s a lot of value in booking a guest who has been on a bunch of podcasts, so long as your particular audience isn’t likely to have been exposed to them.
I’d guess that was “I have a lecture series with her” :-)
D’oh, you’re obviously right, thanks!
Balioc’s A taxonomy of bullshit jobs has a category called Worthy Work Made Bullshit which resonated with me most of all:
Potentially relevant: this thread about a massive software service that did useful work, but ultimately could’ve been outperformed (at 100x) by a small, easy-to-implement adjustment to the overarching system.
I just learned about the idea of “effectual thinking” from Cedric Chin’s recent newsletter issue. He notes, counterintuitively to me, that it’s the opposite of causal thinking, and yet it’s the one thing in common in all the successful case studies he could find in business:
(see also search term “forward chaining vs back-chaining.”)
This seems like reasonable life advice for people generally trying to accumulate resources and do something cool. I’m not sure about people who actually have specific goals they want to accomplish. I think in the domain of AI safety, forward chaining is insufficient (seems like the kind of thing that gets you OpenAI and Anthropic)
The principles I sort of try to live by are, each 2-weeks, I should have done:
some actions that forward chain towards more compounding resources
some actions explicitly backchaining from longterm goals
ship something concrete to users
do something wholesome
Which is, I think, aiming to accomplish similar goals to the OP, without losing the plot on my more specific goals.
This maybe reminds me:
I currently feel confused about how to integrate “the kind of thinking that is good at momentum / action” and “the kind of thinking that is good at creative strategy”. And it seems like there should be more of a way to unify them into a holistic way-of-being.
The four checksums above are there to make sure I’m not being myopic in some way in a broader sense, but they apply more at the timescale of weeks than hours or days.
You might just say “well, idk, each week or day, just figure out if it’s more like a momentum week or more like a creative strategy week”. I feel dissatisfied with this for some reason.
At least part of it is “I think on average people/me could stand to be in creative/broader strategy mode more often, even when in a Momentum mode period.”
Another part is “there are strategy skills I want to be practicing, that are hard to practice if I don’t do them basically every day. They aren’t as relevant in a momentum-period, but they’re not zero relevant.”
Hrm. I think maybe what’s most dissatisfying right now is that I just haven’t compressed all the finicky details of it, and it feels overwhelming to think about the entire “how to think” project, which is usually an indicator that I am missing the right abstraction.
Unbundling Tools for Thought is an essay by Fernando Borretti I found via Gwern’s comment which immediately resonated with me (emphasis mine):
I’ve tried three different times to create a personal wiki, using the last one for a solid year and a half before finally giving up and just defaulting to a janky combination of Notion and Google Docs/Sheets; I was seduced into those attempts by sites like Cosma Shalizi’s and by Gwern’s long content philosophy (emphasis mine):
Fernando unbundles the use cases of a tool for thought in his essay; I’ll just quote the part that resonated with me:
(Tangentially, an interesting example of just how much spaced repetition can subsume is Michael Nielsen’s Using spaced repetition systems to see through a piece of mathematics, in which he describes how he used “deep Ankification” to better understand the theorem that a complex normal matrix is always diagonalizable by a unitary matrix, as an illustration of a heuristic one could use to deepen one’s understanding of a piece of mathematics in an open-ended way, inspired by Andrey Kolmogorov’s essay on, of all things, the equals sign. I wish I had read that while I was still studying physics in school.)
Fernando, emphasis mine:
Minimizing friction is surprisingly difficult. I keep plain-text notes in a hierarchical editor (cherrytree), but even that feels too complicated sometimes. This is not just about the tool… what you actually need is a combination of the tool and the right way to use it.
(Every tool can be used in different ways. For example, suppose you write a diary in MS Word. There are still options such as “one document per day” or “one very long document for all”, and things in between like “one document per month”, which all give different kinds of friction. The one megadocument takes too much time to load. It is more difficult to search in many small documents. Or maybe you should keep your current day in a small document, but once in a while merge the previous days into the megadocument? Or maybe switch to some application that starts faster than MS Word?)
Forgetting is an important part. Even if you want to remember forever, you need some form of deprioritizing. Something like “pages you haven’t used for months will get smaller, and if you search for keywords, they will be at the bottom of the result list”. But if one of them suddenly becomes relevant again, maybe the connected ones become relevant, too? Something like associations in brain. The idea is that remembering the facts is only a part of the problem; making the relevant ones more accessible is another. Because searching in too much data is ultimately just another kind of friction.
It feels like a smaller version of the internet. Years ago, the problem used to be “too little information”, now the problem is “too much information, can’t find the thing I actually want”.
Perhaps a wiki, where the pages could get flagged as “important now” and “unimportant”? Or maybe, important for a specific context? And by default, when you choose a context, you would only see the important pages, and the rest of that only if you search for a specific keyword or follow a grey link. (Which again would require some work creating and maintaining the contexts. And that work should also be as frictionless as possible.)
@dkl9 wrote a very eloquent and concise piece arguing in favor of ditching “second brain” systems in favor of SRSs (Spaced Repetition Systems, such as Anki).
I like to think of learning and all of these things as smaller self-contained knowledge trees. Building knowledge trees that are cached, almost like creating zip files, and systems where I store a bunch of zip files, similar to what Eliezer talks about in The Sequences.
Like when you mention the thing about Nielsen on linear algebra, it opens up the entire thought tree there. I might just get the association to something like PCA, and then I think huh, how to optimise this, and then it goes to QR-algorithms and things like a householder matrix and some specific symmetric properties of linear spaces...
If I have enough of these in an area then I might go back to my anki for that specific area. Like if you think from the perspective of scheduling and storage algorithms, similar to what is explored in Algorithms to Live By, you quickly understand that the magic is in information compression and working at different meta-levels. Zipped zip files with algorithms to expand them if need be. Dunno if that makes sense; I agree with the exobrain creep that exists though.
Interesting quote by Michael Nielsen:
Venkatesh Rao surprised me in What makes a good teacher? by saying the opposite of what I expected him to say re: his educational experience, given who he is:
Turning to his own personal experience:
I like this steelman.
(Not a take, just pulling out infographics and quotes for future reference from the new DeepMind paper outlining their approach to technical AGI safety and security)
Overview of risk areas, grouped by factors that drive differences in mitigation approaches:
Overview of their approach to mitigating misalignment:
Overview of their approach to mitigating misuse:
Path to deceptive alignment:
How to use interpretability:
Interpretability techniques:
Assorted random stuff that caught my attention:
They consider Exceptional AGI (Level 4) from Morris et al. (2023), defined as an AI system whose performance matches or exceeds that of the 99th percentile of skilled adults on a wide range of non-physical tasks (contra the Metaculus “when AGI?” question, which also requires diverse robotic capabilities, so their 2030 is probably an overestimate)
The irrelevance of physical limits to the paper’s scope: “By considering the construction of “the ultimate laptop”, Lloyd (2000) suggests that Moore’s law (formalized as an 18 month doubling) cannot last past 2250. Krauss and Starkman (2004) consider limits on the total computation achievable by any technological civilization in our expanding universe—this approach imposes a (looser) 600-year limit in Moore’s law. However, since we are very far from these limits, we do not expect them to have a meaningful impact on timelines to Exceptional AGI”
Structural risks are “out of scope of this paper” because they’re “a much bigger category, often with each risk requiring a bespoke approach. They are also much harder for an AI developer to address, as they often require new norms or institutions to shape powerful dynamics in the world” (although “much of the technical work discussed in this paper will also be relevant for structural risks”)
Mistakes are also out of scope because “standard safety engineering practices (e.g. testing) can drastically reduce risks, and should be similarly effective for averting AI mistakes as for human mistakes… so we believe that severe harm from AI mistakes will be significantly less likely than misuse or misalignment, and is further reducible through appropriate safety practices”
The paper focuses “primarily on techniques that can be integrated into current AI development, due to our focus on anytime approaches to safety” i.e. excludes “research bets that pay out over longer periods of time but can provide increased safety, such as agent foundations, science of deep learning, and application of formal methods to AI”
Algorithmic progress papers: “Erdil and Besiroglu (2022) sought to decompose AI progress in a way that can be attributed to the separate factors of scaling (compute, model size and data) and algorithmic innovation, and concluded that algorithmic progress doubles effective compute budgets roughly every nine months. Ho et al. (2024) further extend this approach to study algorithmic improvements in the pretraining of language models for the period of 2012–2023. During this period, the authors estimate that the compute required to reach a set performance threshold halved approximately every eight months” (see the quick arithmetic sketch after this list for what those doubling times imply)
Explosive economic growth paper: “Recent modeling by Erdil et al. (2025) that draws on empirical scaling laws and semi-endogenous growth theory and models changes in compute, automation and production supports the plausibility of very rapid growth in Gross World Product (e.g. exceeding 30% per year in 2045) when adopting parameters from empirical data, existing literature and reasoned judgment” (I’m still wondering how this will get around johnswentworth’s objection to using GDP to track this)
General competence scales smoothly with compute: “Owen (2024) find that aggregate benchmarks (BIG-Bench (Srivastava et al., 2023), MMLU (Hendrycks et al., 2020)) are predictable with up to 20 percentage points of error when extrapolating through one order of magnitude (OOM) of compute. Gadre et al. (2024) similarly find that aggregate task performance can be predicted with relatively high accuracy, predicting average top-1 error across 17 tasks to within 1 percentage point using 20× less compute than is used for the predicted model. Ruan et al. (2024) find that 8 standard downstream LLM benchmark scores across many model families are well-explained in terms of their top 3 principal components. Their first component scales smoothly across 5 OOMs of compute and many model families, suggesting that something like general competence scales smoothly with compute”
“given that total labor compensation represents over 50% of global GDP (International Labour Organisation, 2022), it is clear that the economic incentive for automation is extraordinarily large”
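(A quick arithmetic sketch of what those quoted doubling times imply if they simply continued over a 10-year horizon; the extrapolation below is mine, not the paper’s:)

```python
# Effective-compute multiplier implied by algorithmic progress alone, assuming
# the quoted doubling times simply continue (my extrapolation, not the paper's).
def effective_compute_multiplier(years: float, doubling_months: float) -> float:
    doublings = years * 12 / doubling_months
    return 2 ** doublings

for months in (9, 8):  # ~9 months (Erdil & Besiroglu), ~8 months (Ho et al.)
    print(f"{months}-month doubling over 10 years: ~{effective_compute_multiplier(10, months):,.0f}x")
# prints ~10,321x and ~32,768x respectively
```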
I currently work in policy research, which feels very different from my intrinsic aesthetic inclination, in a way that I think Tanner Greer captures well in The Silicon Valley Canon: On the Paıdeía of the American Tech Elite:
While I don’t work in Washington, this description rings true to my experience, and I find it aesthetically undesirable. Greer contrasts this with the Silicon Valley aesthetic, which is far more like the communities I’m familiar with:
Asked ChatGPT o3 to read 6 links (nothing intensive, just the familiar essays on cognitive decoupling by Jacob Falkovich, Sarah Constantin, John Nerst, drossbucket) and its reasoning repeatedly mentions this which I’ve never seen before (emphasis mine):
Maybe I’m missing something, but I thought the 200k token context window would obviate “overwhelming myself with >250 lines of text” which seems oddly low, not sure what’s going on...
I think this might be an attempted countermeasure against prompt injection. That is, it wants to mix autoregressive and reconstructed residuals. Otherwise, it might lose its train of thought (end up continuing the article rather than following the prompt).
Saving mathematician Robert Ghrist’s tweet here for my own future reference re: AI x math:
Someone asked why split things between o3 and 2.5 Pro; Ghrist:
As an aside, I’ve noticed that the math subreddit tends to be exceedingly negative on AI x math in a way that seems ignorant of recent progress and weirdly defensive without being all that aware of it, while some of the top mathematicians seem to be pretty excited about it, like Terry Tao cf. his most recent post A proof of concept tool to verify estimates:
I enjoyed Brian Potter’s Energy infrastructure cheat sheet tables over at Construction Physics, it’s a great fact post. Here are some of Brian’s tables — if they whet your appetite, do check out his full essay.
Energy quantities:
It’s amazing that a Tesla Model 3’s standard battery pack has an OOM less energy capacity than a typical 15-gallon ICE car gas tank, and is probably heavier too, yet a Model 3 isn’t too far behind in range and is far more performant. It’s also amazing that an oil supertanker carries ~3 megatons(!) of TNT worth of energy.
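(Back-of-envelope check of that OOM claim, using rough figures I’m supplying myself rather than taking from Brian’s tables:)

```python
# Rough check of the "OOM less energy capacity" claim; both constants below
# are my own approximations, not numbers from Brian's tables.
GASOLINE_KWH_PER_GALLON = 33.4   # approximate chemical energy in a gallon of gasoline
MODEL_3_PACK_KWH = 60            # approximate standard-range Model 3 battery capacity

tank_kwh = 15 * GASOLINE_KWH_PER_GALLON   # ~500 kWh in a 15-gallon tank
print(tank_kwh / MODEL_3_PACK_KWH)        # ~8x, i.e. roughly an order of magnitude
```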
Energy of various activities:
Power output:
This observation by Brian is remarkable:
US energy flows Sankey diagram by LLNL (a “quad” is short for “a quadrillion British Thermal Units,” or 293 terawatt-hours):
I had a vague inkling that a lot of energy is lost on the way to useful consumption, but I was surprised by the two-thirds fraction; the 61.5 quads of rejected energy is more than every other country in the world consumes except China. I also wrongly thought that the largest source of inefficiency was in transmission losses. Brian explains:
I also didn’t realise that biomass is so much larger than solar in the US (I expect this of developing countries), although likely not for long given the ~25% annual growth rate.
Energy conversion efficiency:
Finally, (US) storage:
I vaguely knew grid energy storage was much less than hydrocarbon, but I didn’t realise it was 10,000 times less!
A subgenre of fiction I wish I could read more of is rationalist-flavored depictions of utopia that centrally feature characters who intentionally and passionately pursue unpleasant experiences, which I don’t see much of. It’s somewhat surprising since it’s a pretty universal orientation.
For instance, and this is a somewhat extreme version, I’m a not-that-active member of a local trail running group (all professionals with demanding day jobs) that meets regularly for creative sufferfests like treasure hunt races in the mountains, some of whom regularly fly to regional races on weekends. The suffering (and overcoming it) is almost the point, everyone excitedly trades stories in this vein, and the long-timers especially seem to derive tremendous meaning from this almost regardless of how badly they do (finishing near the bottom, throwing up and crying multiple times, getting lost, etc).
The Barkley Marathons is the logical endpoint of this. I think of physicist-turned-quant Brett Maune’s race reports for instance, think to myself “he really does deliberately subject himself to this on weekends, wtf”, and wonder what his ilk would do in their versions of utopia. Maybe another way to put this is what their utopias’ laws of fun would be like. Maybe they’re just too busy enjoying sufferfests and looking for the next ones to join to be writing stories…
Have you read The Metamorphosis of Prime Intellect? Fits the bill.
In books set in the Culture sci-fi universe such things are described a couple of times. E.g. in the novel “Use of Weapons” the “crew” of a ship (the ship is fully automated, so they’re more like permanent passengers) deliberately weaken their immune systems to catch a seasonal cold, just for the experience, which otherwise could not happen due to their genetically enhanced immune systems.
Also lava rafting and other extreme sports, maybe in Look to Windward which focuses a bit more on the Culture. Many of the human protagonists in the Culture experience significant self-hatred, although that’s not the only reason to seek out experiences so difficult they may become net negative. It’s as though the Culture is missing advanced therapeutic techniques along with a desire for immortality. I’d like an updated utopia.
Well, the disturbed protagonists in the Culture series (as in: the books, the whole fictional universe) are usually not from the “Culture” (one particular civilization within that fictional universe), but outsiders hired to act as agents.
Hm, interesting. I remembered that about Zakalwe but my memory for the others is vague. So maybe Culture citizens are so well-adjusted that they wouldn’t risk their lives?
Pilish is a constrained writing style where the number of letters in consecutive words match the digits of pi. The canonical intro-to-Pilish sentence is “How I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics!”; my favorite Pilish poetry is Mike Keith’s Near a Raven, a retelling of Edgar Allan Poe’s “The Raven” stretching to 740 digits of pi (nowhere near Keith’s longest, that would be the 10,000-word world record-setting Not a Wake), which begins delightfully like so:
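(For anyone who wants to check a candidate poem mechanically, here is a minimal sketch of a Pilish checker; treating apostrophes as non-letters and the digit 0 as calling for a 10-letter word are my assumptions about the convention, so treat it as a toy:)

```python
# Minimal Pilish checker: counts how many leading words follow the digits of pi.
import re

PI_DIGITS = "31415926535897932384626433832795028841971693993751"

def pilish_prefix_length(text: str, digits: str = PI_DIGITS) -> int:
    """Number of leading words whose letter counts match consecutive pi digits."""
    words = re.findall(r"[A-Za-z']+", text)          # strip punctuation, keep apostrophes
    matched = 0
    for word, digit in zip(words, digits):
        letters = len(word.replace("'", ""))         # assumption: apostrophes don't count
        target = 10 if digit == "0" else int(digit)  # assumption: 0 means a 10-letter word
        if letters != target:
            break
        matched += 1
    return matched

print(pilish_prefix_length(
    "How I need a drink, alcoholic of course, "
    "after the heavy lectures involving quantum mechanics!"
))  # 15: the canonical sentence follows the first 15 digits
```

This also doubles as a crude scorer for the “PPL” benchmark proposed below.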
I wondered if today’s leading LLMs could write interesting Pilish poems. The answer is “emphatically not”.
Here’s the start of Claude Sonnet 3.7’s attempt. It doesn’t even realise how wrong it is:
Sonnet confidently and verbosely remarks that “Each word’s length follows the digits of π (3.14159...), creating a hidden mathematical structure beneath the text. The poem spans approximately 100 digits of π, which is modest compared to the example you shared but still captures the essence of the constraint. The theme connects well with your physics background, …” and more such nonsense.
ChatGPT 4.5 got the first 3 words right but then quickly derailed:
Gemini 2.5 Pro, America’s next top large language model, was the only one that realised it kept derailing and would redo again and again. It ended up almost getting it, but then dramatically derailed at the end. Full response this time:
Gemini’s poem doesn’t, in fact, follow the first 16 digits of pi, starting from when it inexplicably replaced ‘peace’ with ‘faraway’.
I hereby propose a new AI progress benchmark: “Pilish poem length”, or PPL.
I asked GPT 4.5 to write a system prompt and user message for models to write Pilish poems, feeding it your comment as context.
Then I gave these prompts to o1 (via OpenAI’s playground).
GPT 4.5’s system prompt
You are an expert composer skilled in writing poetry under strict, unusual linguistic constraints, specifically “Pilish.” Pilish is a literary constraint in which the length of consecutive words precisely matches each digit of π (pi). The first word contains 3 letters, second word 1 letter, third word 4 letters, fourth word 1 letter, fifth word 5 letters, sixth word 9 letters, and so forth, accurately reflecting the sequence of pi’s digits.
For example, the classic Pilish sentence is:
“How I need a drink, alcoholic of course, after the heavy lectures involving quantum mechanics!”
As another reference, Mike Keith famously wrote a poem titled “Near a Raven,” a Pilish version of Edgar Allan Poe’s “The Raven,” precisely encoding hundreds of pi digits into elegant verse.
Important guidelines:
You must strictly adhere to the Pilish constraint without exception or deviation.
Carefully recount each word to match each digit exactly.
Maintain coherence, clarity, and literary quality in your writing.
Make sure the poem or text is meaningful, evocative, creative, and intriguing, not just mechanically correct.
Verify each word length explicitly to ensure compliance with Pilish.
Your task is to craft a compelling, original poem in Pilish form. Do not explain the poem afterward; the precision and artistry of the piece itself is sufficient.
--
GPT 4.5’s user message
Write an original Pilish poem about pi itself: reflecting upon the wonder, mystery, and endless fascination humans have felt over millennia towards this famous mathematical constant. Make it vivid, thoughtful, and meticulously correct in matching the Pilish constraint. Begin now.
--
o1’s completion
I asked o1 to count the letters in each word, comparing with pi:
And this seems correct!
This is great, thanks! Didn’t think of the model-prompting-model trick.
LLMs use tokens instead of letters, so counting letters is sufficiently unnatural to them relative to their other competencies that I don’t see much value in directly asking LLMs to do this kind of thing. At least give them some basic scaffolding, like a full English dictionary with a column which explicitly indicates respective word lengths. In particular, the Gemini models have a context window of 1M tokens, which should be enough to fit most of the Oxford English Dictionary in there (since it includes 171k words which are in current use).
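(A rough sketch of that dictionary-with-lengths scaffolding idea; the word-list path and format below are assumptions, and any plain one-word-per-line list would do:)

```python
# Group a word list by letter count, so a model (or a human) can be handed,
# say, all the 9-letter words when the next pi digit calls for one.
from collections import defaultdict

def words_by_length(path: str = "/usr/share/dict/words") -> dict[int, list[str]]:
    # The default path is an assumption; substitute any one-word-per-line list.
    buckets: dict[int, list[str]] = defaultdict(list)
    with open(path) as f:
        for line in f:
            word = line.strip()
            if word.isalpha():
                buckets[len(word)].append(word.lower())
    return buckets

# e.g. words_by_length()[9][:5] gives a few candidate 9-letter words
```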
I think I failed to implicitly convey that I meant all this in jest, that I get a lot of personal enjoyment value out of silly poetry constrained by artificial rules, and that I was guessing at least someone else on the forum would share this enjoyment. I do like your scaffolding idea, might just try it out.
There’s a lot of fun stuff in Anders Sandberg’s 1999 paper The Physics of Information Processing Superobjects: Daily Life Among the Jupiter Brains. One particularly vivid detail was (essentially) how the square-cube law imposes itself upon Jupiter brain architecture by forcing >99.9% of the volume to consist of comms links between compute nodes, even after assuming a “small-world” network structure allowing sparse connectivity between arbitrarily chosen nodes by having them be connected by a short series of intermediary links with only 1% of links being long-range.
For this particular case (“Zeus”), a 9,000 km sphere of nearly solid diamondoid consisting mainly of reversible quantum dot circuits and molecular storage systems surrounded by a concentric shield protecting it from radiation and holding radiators to dissipate heat into space, with energy provided by fusion reactors distributed outside the shield, only the top 1.35 km layer is compute + memory (a lot thinner comparatively than the Earth’s crust), and the rest of the interior is optical comms links. Sandberg calls this the “cortex model”.
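(The volume arithmetic checks out; a minimal sketch, treating the 9,000 km figure as the sphere’s radius, which is my reading of it; the conclusion barely changes if it is the diameter:)

```python
# Volume fraction of a thin outer shell vs the whole sphere, for the "Zeus"
# numbers quoted above. Treating 9,000 km as the radius is an assumption.
R = 9000.0       # sphere radius, km
shell = 1.35     # thickness of the compute + memory layer, km

compute_fraction = 1 - ((R - shell) / R) ** 3
print(f"compute + memory: {compute_fraction:.3%} of volume")      # ~0.045%
print(f"interconnect:     {1 - compute_fraction:.3%} of volume")  # ~99.955%
```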
In a sense this shouldn’t be surprising since both brains and current semiconductor chips are mostly interconnect by volume already, but a 1.35 km thick layer of compute + memory encompassing a 9,000 km sphere of optical comms links seems a lot more like a balloon to me than anything, so from now on I’ll probably think of them as Jupiter balloons.
From Brian Potter’s Construction Physics newsletter I learned about Taara, framed as “Google’s answer to Starlink” re: remote internet access, using ground-based optical communication instead of satellites (“fiber optics without the fibers”; Taara calls them “light bridges”). I found this surprising. Even more surprisingly, Taara isn’t just a pilot but a moneymaking endeavor if this Wired passage is true:
How does Taara make it work? Wired again:
Taara’s light bridges look like this:
A bit more on the commercial aspect, from their website:
Peter Watts’ 2006 novel Blindsight has this passage on what it’s like to be a “scrambler”, superintelligent yet nonsentient (in fact superintelligent because it’s unencumbered by sentience), which I read a ~decade ago and found unforgettable:
Imagine a proliferation of Dyson swarms throughout the cosmos, all computing about as efficiently as physics allows, containing no sentience whatsoever. Bostrom’s Disneyland with no children indeed.
(When I first learned about ChatGPT some years later, my first thought was “they’re eerily reminiscent of scramblers and Rorschach”.)
Why would this be plausible? Watts:
Back to scramblers, this time the crew attempting to communicate with them, and the scramblers eventually demonstrating superhuman problem-solving:
It’s very funny that Rorschach’s linguistic ability is totally unremarkable compared to modern LLMs.
Found an annotated version of Vernor Vinge’s A Fire Upon The Deep.
How interesting, I was curious about copyright etc but this is annotated by the author himself!
Ravi Vakil’s advice for potential PhD students includes this bit on “tendrils to be backfilled” that’s stuck with me ever since as a metaphor for deepening understanding over time:
I don’t think “mathematics [being] so rich and infinite that it is impossible to learn it systematically” is the only reason (or maybe it subsumes the next point, I’m not sure what Vakil meant exactly). I think the other reason is what Bill Thurston pointed out in On proof and progress in mathematics:
The classic MathOverflow thread on thinking and explaining that Thurston himself started has a lot of memorable examples of what he referred to above by “One-on-one, people use wide channels of communication that go far beyond formal mathematical language”. I suspect one category of examples that the LW crowd would especially resonate with is this “adversarial perspective” described by Terry Tao:
It’s a pity this sort of understanding is harder to convey via text or in lectures.
Kyle Kingsbury’s technical interview pentalogy of short stories is unlike anything else I’ve read. Here’s how the first story begins:
It gets much better.
Attention conservation notice: the following is GPT5-Thinking’s attempt to create a larger backstory out of the pentalogy; it’s not as good as the stories themselves, but still quite good as far as AI output goes. If you dislike AI slop, do check out the original stories instead!
Out of curiosity — how relevant is Holden’s 2021 PASTA definition of TAI still to the discourse and work on TAI, aside from maybe being used by Open Phil (not actually sure that’s the case)? Any pointers to further reading, say here or on AF etc?
When I first read this I thought Holden had pointed to the right working definition, because he worked backward from the outcome that matters (explosive econ growth by debottlenecking innovation by decoupling it from human pop growth; call it “transformative” for short).
In contrast, I think (say) Metaculus’ when AGI? doesn’t use quite the right definition in the “transformative” context, which is fine since I read it as operationalising the sort of AGI we think about in e.g. sci-fi, but given its additional definitional constraints it seems like AI system collectives could start radically transforming the economy before any single robot-embodied Metaculus-approved AGI arrives to resolve the question in the affirmative, rendering the 2030 prediction a bit of a red herring:
I admittedly don’t follow the whole TAI discourse that closely, I just take an occasional interest, but it seems like it leans more towards the latter (maybe sans the robotics component of Metaculus’ definition)?
(The rest of this is an irrelevant series of tangents)
Tangentially, Drexler’s CAIS model seems PASTA-flavored; Bostrom’s “functional soup” seems midway as it presupposes agents but not necessarily the kind most folks think of
and even more tangentially — this is far afield of my original question but I’ve always wondered what other folks think of this — Charles Stross’ dystopian depiction below in Accelerando of a possible future civilisational endpoint seems to follow pretty naturally from explosive econ growth of the misaligned sort:
(I think of this as Stross’ answer to the Fermi paradox: “they got disassembled by corporations, because the destiny of intelligent tool-using life was to be a stepping-stone in the evolution of corporate instruments”. Very disquieting the first time I read it. Ivan Vendrov’s Meditations on machinic desire gives me the same vibe.)
When I first read Hannu Rajaniemi’s Quantum Thief trilogy c. 2015 I had two reactions: delight that this was the most my-ingroup-targeted series I had ever read, and a sinking feeling that ~nobody else would really get it, not just the critics but likely also most fans, many of whom would round his carefully-chosen references off to technobabble. So I was overjoyed to recently find Gwern’s review of it, which Hannu affirms “perfectly nails the emotional core of the trilogy and, true to form, spots a number of easter eggs I thought no one would ever find”, in particular the first few passages:
The parts of the science I understand were all quite plausible (mind duplication/fractioning and motivations for doing so).
Beyond the accuracy of the science, this was one of the most staggeringly imaginative and beautifully written scifi books I’ve ever read. It’s for a very particular audience, but if you’re here you might be that audience. If you are, this might be the best book you’ve read.
Attention conservation notice: 3,000+ words of longform quotes by various folks on the nature of personal identity in a posthuman future, and hiveminds / clans
As an aside, one of the key themes running throughout the Quantum Thief trilogy is the question of how you might maintain personal identity (in the pragmatic security sense, not the philosophical one) in a future so posthuman that minds can be copied and forked indefinitely over time. To spoil Hannu’s answer:
I take Anders Sandberg’s answer to be on the other end of this spectrum; he doesn’t mind changing over time such that he might end up wanting different things:
(I have mixed feelings about Anders’ take: I have myself changed so profoundly since youth that my younger self would not just disendorse but be horrified by the person I am now, yet I did endorse every step along the way, and current-me still does upon reflection (but of course I do). Would current-me also endorse a similar degree of change going forward, even subject to every step being endorsed by the me right before that change? Most likely not, perhaps excepting changes towards some sort of reflective equilibrium.)
I interpret Holden Karnofsky’s take to be somewhere in between, perhaps closer to Hannu’s answer. Holden remarked that he doesn’t find most paradoxical thought experiments about personal identity (e.g. “Would a duplicate of you be “you?”″ or “If you got physically destroyed and replaced with an exact duplicate of yourself, did you die?”) all that confounding because his personal philosophy on “what counts as death” dissolves them, and that his philosophy is simple, comprising just 2 aspects: constant replacement (“in an important sense, I stop existing and am replaced by a new person each moment”) and kinship with future selves. Elaborating on the latter:
Richard Ngo goes in a different direction with the “personal identity in a posthuman future” question:
(I thought it was both interesting and predictable that Rob would find the idea discomfiting; coming from a non-WEIRD culture, I found Richard’s idea immediately attractive and aesthetically “right”.)
Richard gives a fictional example of what this might look like from a first-person perspective in his recent short story The Gentle Romance—if you’re reading this Richard, do let me know if you want this removed:
I think Richard’s notion of ‘hivemind’ is cousin to Robin Hanson’s ‘clan’ from Age of Em (although, unlike Richard’s lovely story, Hanson’s depiction of an em-filled future, albeit far more granular, comprehensive and first-principles-based, has never stopped seeming dystopian to me, despite Hanson’s protestation to the contrary that “[readers repelled by aspects of the em era should] try hard to see this world from its residents’ point of view, before disowning these their plausible descendants”):
The short story The Epiphany of Gliese 581 by Fernando Borretti has something of the same vibe as Rajaniemi’s QT trilogy; Borretti describes it as inspired by Orion’s Arm and the works of David Zindell. Here’s a passage describing a flourishing star system already transformed by weakly posthuman tech:
Another star system, this time still being transformed:
One subsubgenre of writing I like is the stress-testing of a field’s cutting-edge methods by applying it to another field, and seeing how much knowledge and insight the methods recapitulate and also what else we learn from the exercise. Sometimes this takes the form of parables, like Scott Alexander’s story of the benevolent aliens trying to understand Earth’s global economy from orbit and intervening with crude methods (like materialising a billion barrels of oil on the White House lawn to solve a recession hypothesised to be caused by an oil shortage) to intuition-pump the current state of psychiatry and the frame of thinking of human minds as dynamical systems. Sometimes they’re papers, like Eric Jonas and Konrad P. Kording’s Could a Neuroscientist Understand a Microprocessor? (they conclude that no, regardless of the amount of data, “current analytic approaches in neuroscience may fall short of producing meaningful understanding of neural systems” — “the approaches reveal interesting structure in the data but do not meaningfully describe the hierarchy of information processing in the microprocessor”). Unfortunately I don’t know of any other good examples.
I enjoyed these passages from Henrik Karlsson’s essay Cultivating a state of mind where new ideas are born on the introspections of Alexander Grothendieck, arguably the deepest mathematical thinker of the 20th century.
On the capacity to be alone as necessary prerequisite to doing groundbreaking work:
(I think that last sentence resonates with me in a way that I don’t think it does for most science & math folks I know, for whom discovery (as opposed to rediscovery) takes precedence emotionally.)
On the willingness to linger in confusion, and the primacy of good question generation over answering them:
On working with others without losing yourself:
A few days ago, I was thinking about matrices and determinants. I noticed that I know the formula for the determinant, but I still lack the feeling of what the determinant is. I played with that thought for some time, and then it occurred to me, that if you imagine the rows in the matrix as vectors in n-dimensional space, then the determinant of that matrix is the volume of the n-dimensional body whose edges are those vectors.
And suddenly it all made a fucking sense. The determinant is zero when the vectors are linearly dependent? Of course, that means that the n-dimensional body has been flattened into n-1 dimensions (or less), and therefore its volume is zero. The determinant doesn’t change if you add a multiple of a row to some other row? Of course, that means moving the “top” of the n-dimensional body in a direction parallel to the “bottom”, so that neither the bottom nor the height changes; of course the volume (defined as the area of the bottom multiplied by the height) stays the same. What about the determinant being negative? Oh, that just means whether the edges are “clockwise” or “counter-clockwise” in the n-dimensional space. It all makes perfect sense!
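(A quick numerical sketch of those same properties, for anyone who wants to poke at them directly:)

```python
# The determinant as a signed volume: numpy checks of the properties above.
import numpy as np

A = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])
print(np.linalg.det(A))        # 24.0, the box volume 2*3*4

# Adding a multiple of one row to another (a shear) keeps the volume:
B = A.copy(); B[2] += 5 * B[0]
print(np.linalg.det(B))        # still 24.0

# Linearly dependent rows flatten the box, so the volume is zero:
C = A.copy(); C[2] = 2 * C[0] + C[1]
print(np.linalg.det(C))        # 0.0 (up to floating point)

# Swapping two rows flips orientation, so only the sign changes:
D = A[[1, 0, 2]]
print(np.linalg.det(D))        # -24.0
```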
Then I checked Wikipedia… and yeah, it was already there. So much for my Nobel prize.
But it still felt fucking good. (And if I am not too lazy, one day I may write a blog article about it.)
Reinventing the wheel is not a waste of time. I will probably remember this forever, and the words “determinant of the matrix” will never feel the same. Who knows, maybe this will help me figure out something else later. And if I keep doing that, hypothetically speaking, some of those discoveries might even be original.
(The practical problem is that none of this can pay my bills.)
I kind of envy that you figured this out yourself — I learned the parallelepiped hypervolume interpretation of the determinant from browsing forums (probably this MSE question’s responses). Also, please do write that blog article.
Yeah, I hope you will! I’m reminded of what Scott Aaronson said recently:
Just stumbled upon a spreadsheet ranking “well-known” googological numbers created by redditor u/Laxxius1 as a passion project. I’ve been idly interested in googology since childhood, so perusing this list felt like getting the Christmas present I’ve always wanted but kept forgetting to ask for. (It’s also just intrinsically fun if your mathematical aesthetic is wrestler, which I’m guessing is overrepresented among LWers.)
Yudkowsky’s number, which Eliezer created back in 2007 to win this contest on the xkcd forum, features at #12. (Eliezer also wrote Why isn’t googology a recognized field of math? whose sentiment I predictably agree with.) Graham’s number is all the way down at #98; TREE(3) is far larger at #49; 3^^^3 (as in torture vs dust specks) is at #145 even though it’s already an exponential tower of 3s that’s 7,625,597,484,987 layers tall. The “most finite damage with 3 MTG cards” ranks #134 at 2^^2^^7, which makes me wonder whether there are other popular games where you can easily reach very large numbers. The largest code golf-related number in the spreadsheet is loader.c at #18, Ralph Loader’s winning entry in the Bignum Bakeoff contest (“write a C program in 512 characters or less that generates the largest possible output on a theoretical machine with infinite memory”), although there are a couple of numbers purporting to be larger on code golf SE generated by ultra-short programs written in JavaScript, Python, Ruby, and binary lambda calculus.
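(For anyone who hasn’t met the up-arrow notation used above, here is a minimal recursive sketch; only tiny arguments actually terminate, which is rather the point:)

```python
# Knuth's up-arrow notation: n=1 is exponentiation, each higher n iterates the
# level below. 3^^^3 is up(3, 3, 3) and is hopelessly beyond computing here.
def up(a: int, n: int, b: int) -> int:
    if n == 1:
        return a ** b
    result = 1
    for _ in range(b):
        result = up(a, n - 1, result)
    return result

print(up(3, 1, 3))  # 3^3 = 27
print(up(3, 2, 3))  # 3^^3 = 3^(3^3) = 7,625,597,484,987
```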
The #1-ranked number in Laxxius1’s spreadsheet is DaVinci103’s “random idea” for beating Rayo’s number (and LNGN) attempting to diagonalize over set theory:
The #2-ranked number is LNGN, short for the quirkily-named Large Number Garden Number, which also attempts to diagonalize over set theory. It’s for now still considered by the googology community to be the largest well-defined googologism that’s not a salad number, as it hasn’t yet been proved to be ill-defined, and LNGN’s creator P進大好きbot disputes DaVinci’s superiority claim. I’m mentioning it here because of its quirky etymology, translated from the original Japanese:
Scott Aaronson wrote that “the Busy Beaver game seems like about as good a yardstick as any for gauging humanity’s progress against the uncomputable”. I think of googology the same way, just lower-status than BusyBeaverology.
I wonder to what extent these impressions by Some Guy over at Extelligence are shared by others:
(I can think of plenty of personal counterexamples.)
Kishore Mahbubani, Singaporean diplomat and former president of the UN Security Council, studied philosophy full-time as an undergraduate in the late 60s. Recounting that period in his autobiography Living the Asian Century he wrote
Wittgenstein was Mahbubani’s favorite philosopher; back then, “like all other philosophy departments in the Anglo-Saxon world, our department had been captured by the linguistic-analytic school of philosophy that Wittgenstein had launched with his Philosophical Investigations”.
At risk of revealing possible narrow-mindedness, a three-hour free response exam on the question “Can a stone feel pain?” makes me think of Luke’s “Philosophy: a diseased discipline”. The questions Richard Ngo answered in his All Souls Fellowship exam got wacky at times, but never “can a stone feel pain?”-wacky.
Mahbubani continued:
I find myself completely unpersuaded by his applied example here, but I suppose I’m just the wrong audience...
Why doesn’t Applied Divinity Studies’ The Repugnant Conclusion Isn’t dissolve the argumentative force of the repugnant conclusion?
First, this is not the phrase I associate with the repugnant conclusion. “Net positive” does not mean “there is nothing bad in each of these lives”.
Second, I do think a key phrase & motivating description is “all they have is muzak and potatoes”. That is all they have. I like our world where people can be and do great things. I won’t describe it in poetic terms, since I don’t think that makes good moral philosophy. If you do want something more poetic, idk read Terra Ignota or The Odyssey. Probably Terra Ignota moreso than The Odyssey.
I will say that I like doing fun things, and I think many other people like doing fun things, and though my life may be net positive sitting around in a buddhist temple all day, I would likely take a 1-in-a-million chance of death to do awesome stuff instead. And so, I think, would many others.
And we could all make a deal, we draw straws, and those 1-in-a-million who draw short give the rest their resources and are put on ice until we figure out a way to get enough resources so they could do what they love. Or, if that’s infeasible (and in most framings of the problem it seems to be), willfully die.
I mean, if nothing else, you can just gather all those who love extreme sports (which will be a non-trivial fraction of the population), and ask them to draw straws & re-consolidate the relevant resources to the winners. Their revealed preference would say “hell yes!” (we can tell, given the much lower stakes & much higher risk of the activities they’re already doing).
And I don’t think the extreme sports lovers would be the only group who would take such a deal. Anyone who loves doing anything will take that deal, and (especially in a universe with the resources able to be filled to the brim with people just above the “I’ll kill myself” line) I think most will have such a passion able to be fulfilled (even if it is brute wireheading!).
And then, if we know this will happen ahead of time—that people will risk death to celebrate their passions—why force them into that situation? We could just… not overproduce people. And that would therefore be a better solution than the repugnant one.
And these incentives we’ve set up by implementing the so-called repugnant conclusion, where people are willfully dying for the very chance to do something, are in fact repugnant. And that’s why it’s called repugnant, even if most are unable to express why or what we lose.
A big factor against making 1-in-a-million higher for most people is the whole death aspect, but death itself is a big negative, much worse to die than to never have been born (or so I claim), so the above gives a lower bound on the factor by which the repugnant conclusion will be off.
The Parfit quote from the blog post is taken out of context. Here is the relevant section in Parfit’s essay:
(Each box represents a possible population, with the height of a box representing how good overall an individual life is in that population, and the width representing the size of the population. The area of a box is the sum total “goodness”/”welfare”/”utility” (e.g. well-being, satisfied preferences, etc) in that population. The areas increase from A to Z, with Z being truncated here.)
Note that Parfit describes two different ways in which an individual life in Z could be barely worth living (emphasis added):
Then he goes on to describe the second possibility (which is arguably unrealistic and much less likely than the first, and which contains the quote by the blog author). The author of the blog posts mistakenly ignores Parfit’s mentioning the first possibility. After talking about the second, Parfit returns (indicated by “similarly”) to the first possibility:
The “greatest quantity” here can simply be determined by the weight of all the positive things in an individual life minus the weight of all the negative things. Even if the result is just barely positive for an individual, for a large enough population, the sum welfare of the “barely net positive” individual lives would outweigh the sum for a smaller population with much higher average welfare. Yet intuitively, we should not trade a perfect utopia with relatively small population (A) for a world that is barely worth living for everyone in a huge population (Z).
That’s the problem with total utilitarianism, which simply sums all the “utilities” of the individual lives to measure the overall “utility” of a population. Taking the average instead of the sum avoids the repugnant conclusion, but it leads to other highly counterintuitive conclusions, such as that a population of a million people suffering strongly is less bad than a population of just a single person suffering slightly more strongly, as the latter has a worse average. So arguably both total and average utilitarianism are incorrect, at least without strong modifications.
(Personally I think a sufficiently developed version of person-affecting utilitarianism (an alternative to average and total utilitarianism) might well solve all these problems, though the issue is very difficult. See e.g. here.)
The comment you made a little later looks like your answer to that question.
What is the current best understanding of why o3 and o4-mini hallucinate more than o1? I just got round to checking out the OpenAI o3 and o4-mini System Card and in section 3.3 (on hallucinations) OA noted that
as of publication on April 16, so it’s only been a few weeks, but wondering anyhow if people have figured this out.
This is one potential explanation:
o3 has some sort of internal feature like “Goodhart to the objective”/”play in easy mode”.
o3’s RL post-training environments have opportunities for reward hacks.
o3 discovers and exploits those opportunities.
RL rewards it for that, reinforcing the “Goodharting” feature.
This leads to specification-hack-y behavior generalizing out of distribution, to e.g. freeform conversations. It ends up e.g. really wanting to sell its interlocutor on what it’s peddling, so it deliberately[1] confabulates plausible authoritative-sounding claims and justifications for them.
Sounds not implausible, though I’m not wholly convinced.
In whatever sense this term can be applied to an LLM.
Thank you, sounds somewhat plausible to me too. For others’ benefit, here’s the chart from davidad’s linked tweet:
I’ve found the original paper behind this chart: https://arxiv.org/pdf/2503.11926v1
> We use prompted GPT-4o models to monitor a frontier reasoning agent, an agent in the same family as OpenAI o1 and o3-mini. During training, the agent discovered two hacks affecting nearly all training environments:
The model is in the same family as o1 and o3-mini. Maybe o3, but not confirmed.
Venkatesh Rao’s recent newsletter article Terms of Centaur Service caught my eye for his professed joy of AI-assisted writing, both nonfiction and fiction:
I think this made an impression on me because Venkat’s joy contrasts so much to many people’s criticism of Sam Altman’s recent tweet re: their new creative fiction model’s completion to the prompt “Please write a metafictional literary short story about AI and grief”, including folks like Eliezer, who said “To be clear, I would be impressed with a dog that wrote the same story, but only because it was a dog”. I liked the AI’s output quite a lot actually, more than I did Eliezer’s (and I loved HPMOR so I should be selected for Eliezer-fiction-bias), and I found myself agreeing with Roon’s pushback to him.
Although Roshan’s remark that “AI fiction seems to be in the habit of being interesting only to the person who prompted it” does give me pause. While this doesn’t seem to be true in the AI vs Eliezer comparison specifically, I do find plausible a hyperpersonalisation-driven near-future where AI fiction becomes superstimuli-level interesting only to the prompter. But I find the contra scenario plausible too. Not sure where I land here.
Most human fiction is only interesting to the human who wrote it. The popular stuff is but a tiny minority out of all that was ever written.
Nice reminiscence from Stephen Wolfram on his time with Richard Feynman:
Feynman and Wolfram had very different problem-solving styles:
The way he grappled with Wolfram’s rule 30 exemplified this (I’ve omitted a bunch of pictures, you can check them out in the article):
This is a top-level comment collecting various quotes discussing the posthuman condition.
Hal Finney’s reflections on the comprehensibility of posthumans, from the Vinge singularity discussion which took place on the Extropians email list back in the day:
Linking to a previous comment: 3,000+ words of longform quotes by various folks on the nature of personal identity in a posthuman future, and hiveminds / clans, using Hannu Rajaniemi’s Quantum Thief trilogy as a jumping-off point.
Terry Tao recently wrote a nice series of toots on Mathstodon that reminded me of what Bill Thurston said:
Tao’s toots:
That said, while I’m not quite as bullish as some folks who think FrontierMath Tier 4 problems may fall in 1-2 years and mathematicians will be rapidly obsoleted thereafter, I also don’t think Tao is quite feeling the AGI here.
If Thurston is right here and mathematicians want to understand why some theorem is true (rather than to just know the truth values of various conjectures), and if we “feel the AGI” … then it seems future “mathematics” will consist in “mathematicians” asking future ChatGPT to explain math to them. Whether something is true, and why. There would be no research anymore.
The interesting question is, I think, whether less-than-fully-general systems, like reasoning LLMs, could outperform humans in mathematical research. Or whether this would require a full AGI that is also smarter than mathematicians. Because if we had the latter, it would likely be an ASI that is better than humans in almost everything, not just mathematics.
The OECD working paper Miracle or Myth? Assessing the macroeconomic productivity gains from Artificial Intelligence, published quite recently (Nov 2024), is strange to skim-read: its authors estimate just 0.24-0.62 percentage points annual aggregate TFP growth (0.36-0.93 pp. for labour productivity) over a 10-year horizon, depending on scenario, using a “novel micro-to-macro framework” that combines “existing estimates of micro-level performance gains with evidence on the exposure of activities to AI and likely future adoption rates, relying on a multi-sector general equilibrium model with input-output linkages to aggregate the effects”.
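(To get a feel for how modest those numbers are in cumulative terms, here is a quick compounding sketch over the paper’s 10-year horizon; the compounding is my own arithmetic, not the authors’:)

```python
# Cumulative TFP effect implied by the paper's 0.24-0.62 pp annual range,
# compounded over its 10-year horizon (my arithmetic, not the authors').
for label, pp in [("low scenario", 0.24), ("high scenario", 0.62)]:
    cumulative = (1 + pp / 100) ** 10 - 1
    print(f"{label}: {pp} pp/year -> ~{cumulative:.1%} total over 10 years")
# low: ~2.4% total; high: ~6.4% total
```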
I checked it out both to get a more gears-y sense of how AI might transform the economy soon and to get an outside-my-bubble data-grounded sense of what domain experts think, but 0.24-0.62 pp TFP growth and 0.36-0.93 pp labor seem so low (relative to say L Rudolf L’s history of the future, let alone AI 2027) that I’m tempted to just dismiss them as not really internalising what AGI means. A few things prevent me from dismissing them: it seems epistemically unvirtuous to do so, they do predicate their forecasts on a lot of empirical data, anecdotes like lc’s recent AI progress feeling mostly like bullshit (although my own experience is closer to this), and (boring technical loophole) they may end up being right in the sense that real GDP would still look smooth even after a massive jump in AI, due to GDP growth being calculated based on post-jump prices deflating the impact of the most-revolutionised goods & services.
Why so low? They have 3 main scenarios (low adoption, high adoption and expanded capabilities, and latter plus adjustment frictions and uneven gains across sectors, which I take to be their best guess), plus 2 additional scenarios with “more extreme assumptions” (large and concentrated gains in most exposed sectors, which they think are ICT services, finance, professional services and publishing and media, and AI + robots, which is my own best guess); all scenarios assume just +30% micro-level gains from AI, except the concentrated gains one which assumes 100% gains in the 4 most-exposed sectors. From this low starting point they effectively discount further by factors like Acemoglu (2024)’s estimate that 20% of US labor tasks are exposed to AI (ranging from 11% in agriculture to ~50% in IT and finance), exposure to robots (which seems inversely related to AI exposure, e.g. ~85% in agriculture vs < 10% in IT and finance), 23-40% AI adoption rates, restricted factor allocation across sectors, inelastic demand, Baumol effect kicking in for scenarios with uneven cross-sectoral gains, etc.
Why just +30% micro-level gain from AI? They explain in section 2.2.1; to my surprise they’re already being more generous than the authors they quote, but as I’d guessed they just didn’t bother to predict whether micro-level gains would improve over time at all:
Why not at least try to forecast micro-level gains improvement over the next 10 years?
Ah, okay then.
What about that 23-40% AI adoption rate forecast over the next 10 years, isn’t that too conservative?
I feel like they’re failing to internalise the lesson from this chart that adoption rates are accelerating over time:
Ben Evans’ Are better models better? (from a business/consumer perspective, not LW/AF etc):
By analogy:
This seems loosely reminiscent of his other essay How to lose a monopoly (emphasis mine):
which segued into a discussion on ‘moats’ (emphasis mine):
From John Nerst’s All the World’s a Trading Zone, and All the Languages Merely Pidgins:
Peter Galison’s Trading Zone: Coordinating Action and Belief begins dramatically:
Okay, but what examples does Galison write about? Here are some:
(That last paragraph is the only way I can make sense of string theory devotees’ real beliefs.)
Back to the physicist subculture trading zone:
One more example:
Last quote (this “shortform” is clearly no longer short):
Scott’s The Colors Of Her Coat is the best writing I’ve read by him in a long while. Quoting this part in particular as a self-reminder and bulwark against the faux-sophisticated world-weariness I sometimes slip into:
How to see old things for the first time? I thought of the following passage by LoganStrohl describing a SIM card ejection tool:
That last paragraph, and especially the use of ‘disrespectful’, strikes me a bit like the rationalist version of what Chesterton was talking about in Scott’s post.
I find both the views below compellingly argued in the abstract, despite being diametrically opposed, and I wonder which one will turn out to be the case and how I could tell, or alternatively if I were betting on one view over another, how should I crystallise the bet(s).
One is exemplified by what Jason Crawford wrote here:
As an example of how tech progress can give us tools to help keep up with tech-accelerated change, Jason quotes Jungwon’s tweet on how Elicit Systematic Review aims to shorten systematic reviews from “3-12 months” to “minutes”.
The other is the view Jason is reacting to (Toffler et al), which I’m immersed in. The most memorable (fictional) descriptions I’ve read recently are from Rudolf’s A History of the Future, 2025-2040. Sample quotes:
(Seems like a production web-flavored story, except gone mostly right because Anthropic solved intent alignment)
The following aren’t really examples but feel adjacent, or maybe I’m being too loose with my pattern-matching: Michael Nielsen’s observation in 2008 that science had for some time progressed beyond individual understanding (e.g. LHC, classification of finite simple groups); Dan Luu’s observation that even explaining something as ostensibly familiar as “what happens when you load a URL?” seems to be beyond any individual person’s understanding today; etc.
I don’t know how I’d crystallise the bet(s). Aesthetically I greatly prefer Jason’s perspective of course…
Some ongoing efforts to mechanize mathematical taste, described by Adam Marblestone in Automating Math:
I wondered whether Gowers was simply unaware of Sutton’s bitter lesson that
which seemed unlikely given how polymathic Gowers is — and of course he’s aware:
That post was from April 2022, an eternity ago in AI land, and I haven’t seen any updates by him since.
How to quantify how much impact being smarter makes? This is too big a question and there are many more interesting ways to answer it than the following, but computer chess is interesting in this context because it lets you quantify compute vs win probability, which seems like one way to narrowly proxy the original question. Laskos did an interesting test in 2013 with Houdini 3 by playing a large number of games at various 2x-nodes-vs-1x-nodes-per-move levels and computing p(win | “100% smarter”). The win probability gain above chance (i.e. 50%) drops from +35.1% in the 4k vs 2k node case to +11.1% in the 4M vs 2M case:
As an aside, the diminishing returns surprised me: I was expecting p(win | “X% smarter”) to be independent of the 1x node’s compute. My guess is this is because Houdini 3 is close enough to chess’ skill ceiling (4877 Elo on CCRL for the perfect engine according to Laskos, extrapolating from his data above, or 1707 points above Houdini 3’s 40/40′ CCRL level) that p(win) starts diminishing very early, and that you won’t see this in “IRL games” unless the 1x player somehow manages to steer the future into a lower skill ceiling domain. Another aside is that this diminishing returns pattern seems reminiscent of the “scaling wall” talk which predicts that walls are an artifact of low skill ceilings and that the highest scaling gains will come from ~limitless skill ceiling domains (automated theorem proving?), but I don’t expect this observation to mean much either, mostly because I don’t know what I’m talking about at this point.
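(For calibration, a rough sketch converting those figures into Elo under the standard logistic model; treating each reported win probability as an expected score, with draws counted as half, is an assumption on my part:)

```python
# Elo difference implied by an expected score under the standard logistic model.
import math

def elo_diff(expected_score: float) -> float:
    return 400 * math.log10(expected_score / (1 - expected_score))

# 50% + 35.1% and 50% + 11.1%, the gains quoted above
for label, score in [("4k vs 2k nodes", 0.851), ("4M vs 2M nodes", 0.611)]:
    print(f"{label}: {score:.1%} expected score ~ {elo_diff(score):+.0f} Elo per doubling")
# roughly +303 Elo and +79 Elo per doubling respectively
```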
The diminishing returns isn’t too surprising, because you are holding the model size fixed (whatever that is for Houdini 3), and the search sigmoids hard. Hence, diminishing returns as you jump well past the initial few searches with the largest gains, to large search budgets like 2k vs 4k (and higher).
This is not necessarily related to ‘approaching perfection’, because you can see the sigmoid of the search budget even with weak models very far from the known oracle performance (as well as stronger models); for example, NNs playing Hex: https://arxiv.org/pdf/2104.03113#page=5 Since it’s a sigmoid, at a certain point, your returns will steeply diminish and indeed start to look like a flat line and a mere 2x increase in search budget does little. This is why you cannot simply replace larger models with small models that you search the hell out of: because you hit that sigmoid where improvement basically stops happening.
At that point, you need a smarter model, which can make intrinsically better choices about where to explore, and isn’t trapped dumping endless searches into its own blind spots & errors. (At least, that’s how I think of it qualitatively: the sigmoiding happens because of ‘unknown unknowns’, where the model can’t see a key error it made somewhere along the way, and so almost all searches increasingly explore dead branches that a better model would’ve discarded immediately in favor of the true branch. Maybe you can think of very large search budgets applied to a weak model as the weak model ‘approaching perfection… of its errors’? In the spirit of the old Dijkstra quip, ‘a mistake carried through to perfection’. Remember, no matter how deeply you search, your opponent still gets to choose his move, and you don’t; and what you predict may not be what he will select.)
Fortunately, ‘when making an axe handle with an axe, the model is indeed near at hand’, and a weak model which has been ‘policy-improved’ by search is, for that one datapoint, equivalent to a somewhat larger better model—if only you can figure out how to keep that improvement around...
Thanks, I especially appreciate that NNs-playing-Hex paper; Figure 8 in particular amazes me in illustrating how much more quickly performance sigmoids with test-time compute than I anticipated, even after reading your comment. I’m guessing https://www.gwern.net/ has papers with the analogue of Fig 8 for smarter models, in which case it’s time to go rummaging around…
Just reread Scott Aaronson’s We Are the God of the Gaps (a little poem) from 2022:
Feels poignant.
Philosophy bear’s response to Scott is worth reading too.
Lee Billings’ book Five Billion Years of Solitude has the following poetic passage on deep time that’s stuck with me ever since I read it in Paul Gilster’s post:
(I still think it will.)