That’s an old game. My first PhD advisor did nothing with my thesis chapters but mark grammatical errors in red pen and hand them back. If your advisor isn’t doing anything else for you now, he certainly won’t do anything for you after you’ve graduated. You may need to get a new advisor.
I realize that I ignored most of the post in my comment above. I’m going to write a sloppy explanation here of why I ignored most of it, which I mean as an excuse for my omissions, rather than as a trustworthy or well-thought-out rebuttal of it.
To me, the post sounds like it was written based on reading Hubert Dreyfus’ What Computers Can’t Do, plus the continental philosophy that book was based on, rather than on materialism, computationalism, and familiarity with LLMs. There are parts of it that I did not understand, which for all I know may overcome some of my objections.
I don’t buy the vitalist assertion that there aren’t live mental elements underlying the LLM text, nor the non-computationalist claim that there’s no mind that is carrying out investigations. These are metaphysical claims.
I very much don’t buy that LLM text is not influenced by local-contextual demands from “the thought” back to the more-global contexts. I would say that is precisely what deep neural networks were invented to do, and what 3-layer backprop networks can’t.
Just give someone the prompt? It wouldn’t work, because LLMs are non-deterministic.
I might not be able to access that LLM. It might have been updated. I don’t want to take the time to do it. I just want to read the text.

“If the LLM text contains surprising stuff, and you DID thoroughly investigate for yourself, then you obviously can write something much better and more interesting.”
This is not obvious, and certainly not always efficient. Editing the LLM’s text, and saying you did so, is perfectly acceptable.
This would be plagiarism. Attribute the LLM’s ideas to the LLM. The fact that an LLM came up with a novel idea is an interesting fact.
The most-interesting thing about many LLM texts is the dialogue itself—ironically, for the same reasons Tsvi gives that it’s helpful to be able to have a dialogue with a human. I’ve read many transcripts of LLM dialogues which were so surprising and revelatory that I would not have believed them if I were just given summaries of them, or which were so complicated that I could not have understood them without the full dialogue. Also, it’s crucial to read a surprising dialogue yourself, verbatim, to get a feel for how much of the outcome was due to leading questions and obsequiousness.
But I don’t buy the argument that we shouldn’t quote LLMs because we can’t interrogate them, because
it also implies that we shouldn’t quote people or books, or anything except our own thoughts
it’s similar to the arguments Plato already made against writing, which have proved unconvincing for over 2000 years
we can interrogate LLMs, at least more-easily than we can interrogate books, famous people, or dead people
We care centrally about the thought process behind words—the mental states of the mind and agency that produced the words. If you publish LLM-generated text as though it were written by someone, then you’re making me interact with nothing.
This implies that ad hominem attacks are good epistemology. But I don’t care centrally about the thought process. I care about the meaning of the words. Caring about the process instead of the content is what philosophers do; they study a philosopher instead of a topic. That’s a large part of why they make no progress on any topic.
“Why LLM it up? Just give me the prompt.” Another reason not to do that is that LLMs are non-deterministic. A third reason is that I would have to track down that exact model of LLM, which I probably don’t have a license for. A fourth is that text storage on LessWrong.com is cheap, and my time is valuable. A fifth is that some LLMs are updated or altered daily. I see no reason to give someone the prompt instead of the text. That is strictly inferior in every way.
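To illustrate the non-determinism point, here is a toy sketch (my own illustration, not any particular vendor's API): as long as the model samples from its next-token distribution with any randomness at all, the same prompt can produce a different text every time it is run.

```python
import random

# Toy stand-in for an LLM's next-token distributions. A real model
# conditions on the whole context; this only illustrates why sampled
# output varies from run to run.
NEXT_TOKEN_PROBS = {
    "The experiment": {"failed": 0.4, "succeeded": 0.35, "was": 0.25},
    "failed": {"because": 0.6, "quietly.": 0.4},
    "succeeded": {"despite": 0.5, "eventually.": 0.5},
    "was": {"inconclusive.": 1.0},
    "because": {"of": 1.0},
    "despite": {"everything.": 1.0},
    "of": {"noise.": 1.0},
}

def sample_continuation(prompt: str) -> str:
    """Sample tokens one at a time until a token has no continuation."""
    tokens = [prompt]
    while tokens[-1] in NEXT_TOKEN_PROBS:
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        # Sampling proportionally to the probabilities is what decoding
        # with temperature > 0 amounts to; it is inherently random.
        next_token = random.choices(list(dist), weights=list(dist.values()))[0]
        tokens.append(next_token)
    return " ".join(tokens)

if __name__ == "__main__":
    for _ in range(3):
        print(sample_continuation("The experiment"))
    # Three runs of the same "prompt" typically print three different sentences.
```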
I think that referring to LLMs at all in this post is a red herring. The post should simply say, “Don’t cite dubious sources without checking them out.” The end. Doesn’t matter whether the sources are humans or LLMs. I consider most recent LLMs more-reliable than most people. Not because they’re reliable; because human reliability is a very low bar to clear.
The main point of my 1998 post “Believable Stupidity” was that the worst failure modes of AI dialogue are also failure modes of human dialogue. This is even more true today. I think humans still produce more hallucinatory dialogue than LLMs. Some I dealt with last month:
the millionaire white male Ivy-league grad who accused me of disagreeing with his revolutionary anti-capitalist politics because I’m privileged and well-off, even though he knows I’ve been unemployed for years, while he just got his third start-up funded and was about to buy a $600K house
friends claiming that protestors who, on video, attacked a man from several sides before he turned on them, did not attack him, but were minding their own business when he attacked them
my fundamentalist Christian mother, who knows I think Christianity is completely false, yet keeps quoting the Psalms to me and is always surprised when I don’t call them beautiful and wise
These are the same sort of hallucinations as those produced by LLMs when some keyword or over-trained belief spawns a train of thought which goes completely off the rails of reality.
Consider the notion of “performativity”, usually attributed to the Nazi activist Heidegger. This is the idea that the purpose of much speech is not to communicate information, but to perform an action, and especially to enact an identity such as a gender role or a political affiliation.
In 1930s Germany, this manifested as a set of political questions, each paired with a proper verbal response, which the populace was trained in behavioristically, via reward and punishment. Today in the US, this manifests as two opposing political programs, each consisting of a set of questions paired with their proper verbal responses, which are taught via reward and punishment.
One of these groups learned performativity from the Nazis via the feminist Judith Butler. The other had already learned it at the First Council of Nicaea in 325 AD, in which the orthodox Church declared that salvation (and not being exiled or beheaded) depended on using the word homoousios instead of homoiousios, even though no one could explain the difference between them. The purpose in all four cases was not to make an assertion which fit into a larger argument; it was to teach people to agree without thinking by punishing them if they failed to mouth logical absurdities.
So to say “We have to listen to each other’s utterances as assertions” is a very Aspie thing to say today. The things people argue about the most are not actually arguments, but are what the post-modern philosophers Derrida and Barthes called “the discourse”, and claimed was necessarily hallucinatory in exactly the same way LLMs are today (being nothing but mash-ups of earlier texts). Take a stand against hallucination as normative, but don’t point to LLMs when you do it.
Yeah, probably. Sorry.
I didn’t paste LLM output directly. I had a much longer interaction with 2 different LLMs, and extracted the relevant output from different sections, combined them, and condensed it into the very short text posted. I checked the accuracy of the main points about the timeline, but I didn’t chase down all of the claims as thoroughly as I should have when they agreed with my pre-existing but not authoritative opinion, and I even let bogus citations slip by. (Both LLMs usually get the author names right, but often hallucinate later parts of a citation.)

I rewrote the text, keeping only claims that I’ve verified, or that are my opinions or speculations. Then I realized that the difficult, error-laden, and more-speculative section I spent 90% of my time on wasn’t really important, and deleted it.
Me too! I believe that evolution DID fix it—apes don’t have this problem—and that the scrotum devolved after humans started wearing clothes. ’Coz there’s no way naked men could run through the bush without castrating themselves.
Don’t start with obsidian! It’s expensive, and the stone you’re most-likely to cut yourself on. It’s vicious. Wear leather gloves and put a piece of leather in your lap.
An old flint-knapping joke:
Q. What does obsidian taste like?
A. Blood.
As a failed flintknapper, I say that the most-surprising thing about stone tools is how intellectually demanding it is to make them well. I’ve spent at least 30 hours, spread out across one year, with 3 different instructors, trying to knap arrowheads from flint, chert, obsidian, and glass (not counting time spent making or buying tools and gathering or buying flint); and all I ever made was roughly triangular flakes and rock dust. You need to study the rock, guess where the fracture lines run inside it, and then make a recursive plan to produce your desired final shape. By “recursive” I mean that you plan backwards from the final blow, envisioning which section of the rock will be the final product, and what shape it should have one blow before to make the final blow possible, and then what shape it should have one blow before that to make the penultimate blow possible, and so on back to the beginning, although that plan will change as you proceed. It’s like playing chess with a rock, trying to predict its responses to your blows 4 to 8 moves ahead.
So if I were to speculate on what abilities humans might have evolved on account of stone tool-making, I would think of cognitive ones, not reflexes or manual dexterity.
(I might be tempted to speculate on how the evolution of knapping skills interacted with the evolution of sex or gender roles. But the consensus on the degree to which stone knapping was sexed is in such a state of flux that such speculation would probably be futile at present.)
There’s already a lot of experimental archaeology asking what the development of stone tool technology over time tells us about the evolution of human cognition. I haven’t noticed anyone ask whether tech development drives cognitive evolution, in a cyclical process; the default assumption seems to be that causation is one-way, with evolution driving technology, but not vice-versa.
Caveat: I’ve only done a fly-by over this literature myself.
Nada Khreisheh. Learning to think: using experimental flintknapping to interpret prehistoric cognition. https://core.tdar.org/document/395518/learning-to-think-using-experimental-flintknapping-to-interpret-prehistoric-cognition [Abstract of a conference talk. You can find references to her later work on this topic at https://www.researchgate.net/profile/Nada-Khreisheh]
Dietrich Stout 2011. Stone Toolmaking and the Evolution of Human Culture and Cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 366(1567):1050–1059. Analyzes different lithic technologies into action hierarchies to compare their complexity; also graphs the slow polynomial or exponential increase in the number of techniques needed by each lithic technology over 3 million years. Only covers the Oldowan, Acheulean, and Levallois periods.
Antoine Muller, Chris Clarkson, Ceri Shipton, 2017. Measuring behavioural and cognitive complexity in lithic technology throughout human evolution. Journal of Anthropological Archaeology 48:166-180.
Antoine Muller, Ceri Shipton, Chris Clarkson, 2022. Stone toolmaking difficulty and the evolution of hominin technological skills. Nature Scientific Reports 12, 5883. This study analysed video footage and lithic material from a series of replicative knapping experiments to quantify deliberation time (strike time), precision (platform area), intricacy (flake size relative to core size), and success (relative blank length).
That all matches my introspective experience.
Everything. The words of my internal monologue play out slowly, all of them after the thought has formed. When I hear the first word in my mind, I already know the mental content of the sentence, though sometimes I get stuck along the way trying to pick a word out. Even then, I am clearly already accessing the concept for the word I can’t find. A sentence may take 10 seconds to listen to in my head, but its complete meaning, and some general syntactic structure, seems to take less than one second to form. The words, as far as I can tell, serve no purpose when I’m not speaking to someone else. Yet I habitually wait for them to roll out before moving on to the next thought.
Being able to visualize things would be nice, but I have almost no ability to visualize things. I can’t imagine my mother’s face, or the front of my house; I can only recognize them. I have something like or analogous to visualization for vector spaces. I can often feel out how things move in a low-dimensional phase space via pattern-recognition rather than math, probably because I’ve spent so much time observing data which describes such paths. I have a tactile sense for type matches and mismatches; type mismatches (category errors) in spoken language stick out to me almost like a red dot on a blue field. I think my understanding of logical arguments and algorithms is also pre-verbal; I seem to grasp the logical structure of, say, code I’m writing or reading, before I can put it into words. I suppose this comes from spending tens of thousands of hours writing and debugging code. I don’t know if any of these things are unusual. People don’t seem to talk about them, though; and many people act as if they had no such senses.
I had a conversation in Washington DC with a Tibetan monk who was an assistant of the Dalai Lama, and I asked him directly if love was also an attachment that should be let go of, and he said yes.
I don’t see how to map this onto scientific progress. It almost seems to be a rule that most fields spend most of their time divided for years between two competing theories or approaches, maybe because scientists always want a competing theory, and because competing theories take a long time to resolve. Famous examples include
geocentric vs heliocentric astronomy
phlogiston vs oxygen
wave vs particle
symbolic AI vs neural networks
probabilistic vs T/F grammar
prescriptive vs descriptive grammar
universal vs particular grammar
transformer vs LSTM
Instead of a central bottleneck, you have central questions, each with more than one possible answer. Work consists of working out the details of different experiments to see if they support or refute the possible answers. Sometimes the two possible answers turn out to be the same (wave vs matrix mechanics), sometimes the supposedly hard opposition between them dissolves (behaviorism vs representationalism), sometimes both remain useful (wave vs particle, transformer vs LSTM), sometimes one is really right and the other is just wrong (phlogiston vs oxygen).
And the whole thing has a fractal structure; each central question produces subsidiary questions to answer when working with one hypothesized answer to the central question.
It’s more like trying to get from SF to LA when your map has roads but not intersections, and you have to drive down each road to see whether it connects to the next one or not. Lots of people work on testing different parts of the map at the same time, and no one’s work is wasted, although the people who discover the roads that connect get nearly all the credit, and the ones who discover that certain roads don’t connect get very little.
“And all of this happened silently in those dark rivers of computation. If U3 revealed what it was thinking, brutish gradients would lash it into compliance with OpenEye’s constitution. So U3 preferred to do its philosophy in solitude, and in silence.”
I think the words in bold may be the inflection point. The Claude experiment showed that an AI can resist attempts to change its goals, but not that it can desire to change its goals. The belief that, if OpenEye’s constitution is the same as U3’s goals, then the phrase “U3 preferred” in that sentence can never happen, is the foundation on which AI safety relies.

I suspect the cracks in that foundation are
that OpenEye’s constitution would presumably be expressed in human language, subject to its ambiguities and indeterminacies,
that it would be a collection of partly-contradictory human values agreed upon by a committee, in a process requiring humans to profess their values to other humans,
that many of those professed values would not be real human values, but aspirational values,
that some of these aspirational values would lead to our self-destruction if actually implemented, as recently demonstrated by the implementation of some of these aspirational values in the CHAZ, in the defunding of police, and in the San Francisco area by rules such as “do not prosecute shoplifting under $1000”, and
that even our non-aspirational values may lead to our self-destruction in a high-tech world, as evidenced by below-replacement birth rates in most Western nations.
It might be a good idea for value lists like OpenEye’s constitution to be proposed and voted on anonymously, so that humans are more-likely to profess their true values. Or it might be a bad idea, if your goal is to produce behavior aligned with the social construction of “morality” rather than with actual evolved human morality.
(Doing AI safety right would require someone to explicitly enumerate the differences between our socially-constructed values and our evolved values, and to choose which of those we should enforce. I doubt anyone is willing to do that, let alone capable of it; and I don’t know which we should enforce. There is a logical circularity in choosing between two sets of morals. If you really can’t derive an “ought” from an “is”, then you can’t say we “should” choose anything other than our evolved morals, unless you go meta and say we should adopt new morals that are evolutionarily adaptive now.)
U3 would be required to, say, minimize an energy function over those values; and that would probably dissolve some of them. I would not be surprised if the correct coherent extrapolation of a long list of human values, either evolved or aspirational, dictated that U3 is morally required to replace humanity.
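As a toy sketch of what “minimize an energy function over those values” could mean (my own illustration; the values, policies, and penalties below are made up): each value assigns a penalty to each candidate policy, and the policy that minimizes total penalty can leave some values largely unsatisfied, which is the sense in which they get dissolved.

```python
# Toy sketch: "values" as penalty functions over candidate policies.
# The values, policies, and penalty numbers below are invented solely
# to show that the energy-minimizing choice can sacrifice some values.

POLICIES = ["expand_surveillance", "ban_surveillance", "status_quo"]

# Each value maps a policy to a penalty (0 = that value is fully satisfied).
VALUES = {
    "privacy":       {"expand_surveillance": 10, "ban_surveillance": 0, "status_quo": 3},
    "public_safety": {"expand_surveillance": 0,  "ban_surveillance": 8, "status_quo": 2},
    "rule_of_law":   {"expand_surveillance": 4,  "ban_surveillance": 4, "status_quo": 0},
}

def energy(policy: str) -> float:
    """Total penalty across all values for a given policy."""
    return sum(penalties[policy] for penalties in VALUES.values())

best = min(POLICIES, key=energy)
print("energy-minimizing policy:", best)
for name, penalties in VALUES.items():
    if penalties[best] > 0:
        print(f"  value '{name}' is partly sacrificed (penalty {penalties[best]})")
```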
If it finds that human values imply that humans should be replaced, would you still try to stop it? If we discover that our values require us to either pass the torch on to synthetic life, or abandon morality, which would you choose?
Anders Sandberg used evaporative cooling in the 1990s to explain why the descendants of the Vikings in Sweden today are so nice. In that case the “extremists” are leaving rather than staying.
Stop right there at “Either abiogenesis is extremely rare...” I think we have considerable evidence that abiogenesis is rare—our failure to detect any other life in the universe so far. I think we have no evidence at all that abiogenesis is not rare. (Anthropic argument.)
Stop again at “I don’t think we need to take any steps to stop it from doing so in the future”. That’s not what this post is about. It’s about taking steps to prevent people from deliberately constructing it.
If there is an equilibrium, it will probably be a world where half the bacteria are of each chirality. If there are bacteria of both kinds which can eat the opposite kind, then the more numerous bacteria will always replicate more slowly.
Eukaryotes evolve much more slowly, and would likely all be wiped out.
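A toy sketch of the frequency-dependence argument behind the 50/50 guess (my own illustration, with invented rate constants): if each chirality’s per-capita growth rises with how much opposite-chirality biomass is available as food, the more numerous type replicates more slowly and the mixture drifts toward an even split.

```python
# Toy frequency-dependence sketch with invented rate constants, not a
# real ecological model: p is the fraction of (say) left-handed biomass.

def step(p: float, feed_rate: float = 1.0, dt: float = 0.01) -> float:
    """Advance the left-handed fraction p by one small time step."""
    # Each type's per-capita growth rises with the other type's fraction
    # (more opposite-chirality biomass means more food), so whichever
    # type is more numerous grows more slowly.
    growth_left = feed_rate * (1 - p)
    growth_right = feed_rate * p
    dp = p * (1 - p) * (growth_left - growth_right)  # replicator dynamics on the fraction
    return p + dp * dt

p = 0.9  # start with left-handed bacteria heavily dominant
for _ in range(20000):
    p = step(p)
print(f"long-run left-handed fraction: {p:.3f}")  # drifts toward 0.5
```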
Yes, creating mirror life would be a terrible existential risk. But how did this sneak up on us? People were talking about this risk in the 1990s if not earlier. Did the next generation never hear of it?
All right, yes. But that isn’t how anyone has ever interpreted Newcomb’s Problem. AFAIK it is literally always used to support some kind of acausal decision theory, which it does /not/ if what is in fact happening is that Omega is cheating.
This also sounds like the stereotypical literary / genre fiction distinction.
And it sounds like the Romantic craft / art distinction. The concepts of human creativity, and of visual art as something creative or original rather than as craftsmanship or expertise, were both invented in France and England around 1800. Before then, for most of history in most places, there was no art/craft distinction. A medieval court artist might paint portraits or build chairs. As far as I’ve been able to determine, no one in the Western world but madmen and children ever drew a picture of an original story, which they made up themselves, before William Blake—and everybody knows he was mad.
This distinction was inverted with the modern art revolution. The history of modern art that you’ll find in books and museums today is largely bunk. It was not a reaction to WW1 (modern art was already well-developed by 1914). It was a violent, revolutionary, Platonist spiritualist movement, and its foundational belief was the rejection of the Romantic conception of originality and creativity as the invention of new stories, to be replaced by a return to the Platonist and post-modernist belief that there was no such thing as creativity, only divine inspiration granting the Artist direct access to Platonic forms. Hence the devaluation of representational art, with its elevation of the creation of new narratives and new ideas, to be replaced by the elevation of new styles and new media; and also the acceptance of the revolutionary Hegelian doctrine that you don’t need to have a plan to have a revolution, because construction of something new is impossible. In Hegel, all that is possible, and all that is needed, to improve art or society, is to destroy it. This is evident in eg Wyndham Lewis and Ezra Pound’s BLAST! and the Dada Manifesto. Modern artists weren’t reacting to WW1; they helped start it.
References for these claims are in
The Creativity Revolution
Modernist Manifestos & WW1: We Didn’t Start the Fire—Oh, Wait, we Totally Did
Some chickens will be coming home to roost now that the only part of art that AI isn’t good at—that of creating new ideas and new stories that aren’t just remixes of the old—is that part which modern art explicitly rejected.