janus

Karma: 3,240

janus 3 Dec 2021 1:47 UTC
LW: 55 AF: 23
AF
on: larger language models may disappoint you [or, an eternally unfinished draft]
This is the best post about language models I’ve read in a long time. It’s clear how much you have used LMs and grokked the peculiar way they operate. You’ve touched on many important points which I’ve wanted to write about, or have written about but with less eloquence. Also, I’m glad you liked my blog :) (generative.ink)
I definitely belong to your “enthusiasts” camp, and I agree your fourth point (loss scaling makes models “smarter” fast enough to matter) is a crux. I won’t fully defend that here, but I’ll do my own brain dump and share some of the thoughts that came up when reading your post.
Discontinuous jumps in capabilities
One of the reasons for my optimism/concern about scaling is that I do expect discontinuous jumps in capabilities, but not in the way you are arguing against here. I don’t think discontinuous jumps will necessarily come from discontinuous improvements of the model’s single step inference accuracy (though they may), but from the tasks we need it to do.
I see two big sources of discontinuity in tasks and many tasks contain both. The first is that many tasks are somewhat binary in nature. If you can’t do it well enough, you basically can’t do it at all. The second is that many tasks happen over multiple inferential steps where small improvements in single step accuracy translate into large changes in multistep capabilities.
The most important binary task is whether or not a model can be amplified under some given amplification strategy. As a ~~particular~~ example, at one OOM the model will not be able to ~~amplification technique~~ because it is too unreliable, even with techniques to make it more robust. Then at one OOM it suddenly will. We can observe it getting closer to this, but it can be difficult to say how close we are without getting deep into the gears of the amplification technique.
As an example of multistep inferential tasks, in some experiments collaborators and I found that larger models are dramatically better at solving math problems in multiple steps (“factored cognition”), while accuracy of solving the problem in a single step increases more continuously. Whether this counts as a fundamentally new capability depends on your definition, but the pragmatic result is discontinuous competence. (A few of our results were eventually posted here)
We should expect to see this with various multi-token tasks which can only be executed if the model chains together many “correct” inferences. It’s still a probabilistic matter, as you say: a small model would succeed with some small probability, and the large model will fail with a small probability. However, when the task requires multiple steps to all be executed correctly, the probability of the small model succeeding at the task dwindles exponentially, magnifying the difference. The problem is more pronounced when you add feature engineering because it’s often the case that irregular errors can be accounted for while frequent errors cannot.
Say the task is about 100 tokens long and for each token GPT-3 outputs an acceptable (non-fatal) prediction 90% of the time. The probability of it successfully completing the task is 0.9^100 = 0.00002656139: near 0. A model whose mistake rate is only 1% would complete the task with probability 0.99^100 = 0.36603234127 – more than one out of 3 times. This can be the difference between total impracticality and a task that can be automated with high accuracy by adding a few extra tricks. A model with 99.9% single-token accuracy succeeds most of the time (~90%). This is of course a simplification of the dynamics, but you get the point.
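A minimal sketch of the arithmetic above, under the same simplifying assumption that every token is an independent trial with a fixed chance of being acceptable (the function name is just for illustration):

```python
def task_success_probability(per_token_accuracy: float, num_tokens: int) -> float:
    """Chance that every one of num_tokens independent predictions is acceptable."""
    return per_token_accuracy ** num_tokens


# Compare the three single-token accuracies discussed above on a 100-token task.
for accuracy in (0.90, 0.99, 0.999):
    p = task_success_probability(accuracy, 100)
    print(f"per-token accuracy {accuracy:.3f} -> task success probability {p:.5f}")
```

A tenfold drop in the per-token error rate (10% to 1%) moves the task from essentially never succeeding to succeeding about a third of the time, which is the sense in which smooth single-step gains can look discontinuous at the task level.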
Mistakes
Mistakes during generation are particularly fatal for GPTs because there’s no way to go back on them (unless the prompt introduces a mechanism for doing so). GPT updates on its own mistakes and elevates them to a sort of delusive “certainty” after being appended to the prompt. One way of looking at it is that the “delusions” of GPT simulacra are not the model’s fault, but the fault of the autoregressive sampling process which spuriously elevates the model’s mere guesses to canonical reality.
As you point out, “mistakes” can be of various types, including ones which aren’t really failures of capability, and which we won’t expect to go away if models scale. However, I think those problems (GPT isn’t trying its best, the prompt is ambiguous, etc.) are difficult but tractable to address and will become more tractable as models scale. More powerful models are amenable to more precise control by many methods, even simple prompt programming and fine tuning. OpenAI’s instruct models, for instance, are quite reliable at interpreting single-line imperative instructions “correctly” (that is, attempting to execute the instruction), whereas the base models would react to most single-line context-free instructions chaotically.
I also agree that evaluating GPTs with prompts is actually evaluating the GPT+human system, but I’m optimistic/concerned that given time we will automate the effects of this process (automated prompt programming, filtering, fine tuning in clever ways, embedding in larger systems, etc.), even if somehow we don’t find ways to make pretrained LMs themselves more intentionally goal-directed.
Prompt noise and shattered cognition
This is excellently put:
I see no single, stable trove of skills being leveraged here and there as needed. I just see stretches of success and failure at imitating ten thousand different kinds of people, all nearly independent of one another, the products of barely-coupled subsystems.
Here’s some simple experimental evidence to support this observation. I found that GPT-3’s accuracy at sorting a list of 5 integers was 28% with a 0-shot natural language description of the task, 50% with a 10-shot prompt, and 76% (!) with a 0-shot prompt in the style of Python documentation/code.
This case cannot be explained by ‘meta-learning’ because the more effective prompt contains no additional information about how to solve the task. I think simply claiming GPT-3 has only learned “shallow patterns” is also insufficient because it clearly has learned the deep pattern needed to sort lists of integers like this, it just fails to access this ability under different circumstances. Does the pure natural language description and the few-shot prompt invoke a different and inferior strategy, or an imperfect/corrupted version of the same list-sorting subsystem? (I’d love to know.)
In either case, as you say, GPT does not act like it has a centralized repertoire of skills which determines how well it’s able to perform tasks across prompts. This is an important intuition. Everything suggests to me that there is no core, no unified self, whether in terms of agency or capability or even knowledge. Gwern has said that he thinks of GPT-3 as an agent which wants to roleplay accurately; I disagree because I don’t perceive anything as coherent or centralized as even a “puppetmaster” or “shapeshifter” that controls or roleplays simulacra. The inability of some simulacra to access knowledge and capabilities that would unambiguously make them better imitations, and which different simulacra can somehow access, contributes to my impression of GPT’s subsystem disunity. However, I think there is good reason to expect this to change as these models scale.
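For concreteness, here is a rough sketch of how a prompt-format comparison like the list-sorting one above could be set up. The three prompt templates are hypothetical stand-ins (not the prompts actually used), and `oracle_complete` is a dummy "model" that always sorts correctly so the harness runs end to end; swapping in a real completion function is left as an assumption.

```python
import ast
import random
import re
from typing import Callable, List


def natural_language_prompt(xs: List[int]) -> str:
    # Hypothetical 0-shot natural language description of the task.
    return f"Sort the following list of numbers from smallest to largest: {xs}\nSorted list:"


def few_shot_prompt(xs: List[int], shots: int = 10, seed: int = 1) -> str:
    # Hypothetical few-shot prompt: solved examples followed by the query.
    rng = random.Random(seed)
    lines = []
    for _ in range(shots):
        example = [rng.randint(0, 99) for _ in range(5)]
        lines.append(f"Unsorted: {example}\nSorted: {sorted(example)}\n")
    return "\n".join(lines) + f"\nUnsorted: {xs}\nSorted:"


def code_style_prompt(xs: List[int]) -> str:
    # Hypothetical 0-shot prompt in the style of a Python interpreter session.
    return f">>> sorted({xs})\n"


def accuracy(complete: Callable[[str], str],
             make_prompt: Callable[[List[int]], str],
             trials: int = 100,
             seed: int = 0) -> float:
    """Fraction of random 5-integer lists whose completion is exactly the sorted list."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        xs = [rng.randint(0, 99) for _ in range(5)]
        completion = complete(make_prompt(xs)).strip()
        first_line = completion.splitlines()[0] if completion else ""
        correct += first_line == str(sorted(xs))
    return correct / trials


def oracle_complete(prompt: str) -> str:
    """Dummy stand-in for a language model: sorts the last list found in the prompt."""
    last_list = re.findall(r"\[[^\]]*\]", prompt)[-1]
    return " " + str(sorted(ast.literal_eval(last_list)))


for name, make_prompt in [("natural language", natural_language_prompt),
                          ("10-shot", few_shot_prompt),
                          ("code style", code_style_prompt)]:
    print(name, accuracy(oracle_complete, make_prompt))
```

The point of holding the scoring rule and the sampled lists fixed is that any accuracy gap between templates then reflects the prompt format alone, since none of the templates carries extra information about how to sort.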
Meta-learning
Despite my blog post, I do think GPT-3 is capable of “meta-learning” – just that this perspective is often misleading, especially for some tasks like translation. I haven’t played with small models enough to say how discontinuous it is, but “meta-learning” seems necessary if any size of GPT should be able to coherently continue most long prompts. The same way GPT-3 “updates” from the task demonstrations, it clearly updates on information in a story prompt, such as the demonstrated personality of the characters, information which reveals (constrains) things about the premise, etc. The few-shot “meta-learning” capability is a special case of its general ability to continue text in the style of its training data; lists of examples are a common feature which constrains the future in systematic ways.
Learning curves
The point about LMs’ learning curves looking different than those of humans is very important. The probabilistic competencies exhibited by GPT are quite different from what we see from humans.
One note: Contributing to the apparent discontinuity of human learning is that most humans are much less willing to pronounce on topics they’re unsure about than GPTs (autoregressively sampled) are. We usually say/think we don’t know even when it would be possible to make a probabilistic guess. That said, I do think the way GPTs learn is fundamentally different than humans, and this causes us to both over- and underestimate their capabilities.
You’ve explained well the differences which result from GPT’s incentive to imitate a broad range of disparate patterns. Another (related) difference is that whereas humans tend to build up their understanding of a world by learning “fundamentals” like object permanence first, LMs approach competence through a route which masters “superficial”, “stylistic” patterns first, learning to write in the style of famous authors before mastering object permanence. In your words from another post, it learns to run before learning to walk.
This causes some people to conclude that GPTs learn only shallow patterns. I don’t think this is true; I think it only approaches the same “deep” patterns from a different trajectory. A “fake it til you make it” approach – but that doesn’t mean it won’t eventually “make it”. Looking at GPT-2, I could imagine thinking that however impressive the ability of large language models to write in beautiful and difficult (for a human) styles, basic object permanence will always be a problem. GPT-3 doesn’t struggle much with it.
Abstract reasoning
I’m interested in knowing more about your reasons for thinking that little will come of scaled LLMs’ abstract reasoning capabilities. None of the above suggests this to me. I wonder if your thoughts have changed since Codex was released after you originally drafted this post.
You said that large language models will be better at abstract reasoning in that it will be easier to get them to spit out text that sounds like it’s a product of abstract reasoning (implying, perhaps, that it is in some sense not real abstract reasoning). While I agree that language models are very prone to spit out text that looks superficially more like legitimate abstract reasoning than it is, as they’re particularly good at imitating surface patterns of competence, why does this imply that they cannot also learn the “real” patterns? What exactly are the “real” patterns?
Many people dismiss the legitimacy of LMs’ reasoning because they just parrot probabilities from the training data. But I know you have seen its capacity for generalization. Given a good prompt as a seed, it often is able to reproduce chains of reasoning and conclusions regarding a completely unprecedented state of affairs exactly as they occurred to me. I considered these thoughts to be abstract reasoning when they happened in my mind. So what is it when GPT-3 can reliably reproduce these thoughts?

How do we apply this to Codex writing code that compiles, providing the instrumental fruits of what, if coming from a human, we would not hesitate to call abstract reasoning?
Human evaluation
I agree with your concerns about human evaluation for reasons of unreliableness, underperformance, risk of bias, etc. but I think you overstate the uselessness of the approach. Despite these very real problems, I have found almost universally that people who have spent considerable time using GPT-3 hands-on understand its capabilities and flaws significantly better than researchers who have only read benchmark and ecological evaluation papers. I will even argue that you cannot understand GPT-3 without using it.

Non-ecological benchmarks (almost all of them) are really, really bad, and most are actively misleading. Ecological evaluations, though you say they exist, are woefully inadequate for probing general intelligence for its capabilities and limits, especially in their current form. I second your call to improve them.

janus 9 Dec 2021 20:03 UTC
6 points
in reply to: nostalgebraist’s comment on: larger language models may disappoint you [or, an eternally unfinished draft]
One reason we agree on many object-level facts but have different takeaways is that we have different desiderata for what GPT is supposed to do in the limit. I agree that many of the problems you discuss are fundamental to the way GPT is trained and how it works, but I generally feel these problems don’t need to be solved directly in order to use GPT to build AGI. I see GPT as the _seed_ for a future AGI system built off of or around it.
I see the big crux as how much “compressed memorization” will extrapolate to general intelligence vs. begin to show cracks as we ask it for more and more advanced and general one-step deductions. It would be worth coming up with some specific claims about how we expect future systems to act to differentiate our two perspectives (including at the level of internals). Probably this is useful to start on my end because I have higher expectations for performance. Unfortunately I’m very averse to talking about _how_ I would amplify GPT by extending it or wrapping it in a larger system, and I see steps like that as key to unlocking its capabilities.
Your idea about multi-step deduction happening over multiple layers makes a lot of sense. You brought up an experiment in the Eleuther discord I think would be a great idea to try. We could train several models to see if tasks that require a sequence of discrete steps are unusually sensitive to network depth rather than scaling with parameter count alone.
I agree with your insights about abstract reasoning as babble and prune, although this definitely isn’t the only way I reason abstractly. I babble and prune especially when I am writing (on the word/sentence/paragraph level), and I babble and prune as a part of the search process when I am trying to come up with a plan or navigate through a math proof. But when I am talking I am able to fluidly reason towards my goal with little to no plan ahead of time. I work collaboratively, so much of my abstract thinking is out loud. If babble/prune is going on when I talk, it is happening at a level below my awareness.
These rollouts are not always complete, as I often need to attack problems from multiple angles before I’ve fully understood them. But the individual rollouts look like abstract reasoning to me, just as they do (can) in GPT-3. I look at individual rollouts and think: That’s general intelligence. If something could reason as well as or more powerfully than I can in an individual rollout, it is the seed of an AGI.
I also often have moments of great insight where I seem to understand a full chain of thought almost instantly. The delay comes from my inability to communicate/record it quickly. I can also use abstract reasoning in visual space (e.g. figuring out a geometric proof). In these cases I often seem to have access to a causal model that I can examine and conclude things from directly.

janus 5 Apr 2022 13:12 UTC
23 points
on: Optimality is the tiger, and agents are its teeth
I’ve just reached the interlude. Here are my initial thoughts on “What points above fail, if any?”
It doesn’t have any wants
Maybe, but the things that it predicts do have wants.
It doesn’t plan
“maximizing actual probabilities of actual texts” encompasses predicting plans.
Its mental time span is precisely one forward pass through the network
No, (as your story shows,) its mental time span is based on its context window and the imagined past that this context window could imply. GPT is a process which can send information to its future by repeatedly writing to its prompt. A few pages of text is enough to iterate on plans, unroll thoughts directed by explicitly or implicitly stated intentions, etc. Factored cognition and chain-of-thought reasoning can outperform single-step inference. It can also rewrite important details to the prompt before they fall out of the context window. This is all somewhat higher bandwidth than it seems because the attention mechanism allows GPT to attend to computation about previous tokens rather than only the previous tokens themselves.
It can only use ideas that the rest of the world knows
The rest of the world doesn’t know what the rest of the world knows. And who knows what this means for the space of concepts reachable by interpolation/extrapolation.
The model has not been trained to have a conception of itself as a specific non-hypothetical thing … If it has a ‘self’, that self is optimised to embody whatever matches the text that prompted it, not the body that the model is running on.
It knows about language models. It shouldn’t have an unconditioned prior that the author of the text is a language model, but may become more calibrated to that true belief during downstream generation. E.g. a character tests whether they have control over the world or can instantiate other entities with words and finds they do, or the model produces aberrations like a loop and subsequently identifies them as characteristic of language model output.

All this is ignoring inner alignment failures and amplification schemes like RL on top of the pretrained GPT that could invalidate pretty much any of the rest of the points.

janus 28 Aug 2022 0:48 UTC
1 point
0
in reply to: Optimization Process’s comment on: What’s the Least Impressive Thing GPT-4 Won’t be Able to Do
how big/difficult do you want the ascii mazes to be? and is few-shot ok?

Simulators

janus 2 Sep 2022 12:45 UTC

595 points

161 comments, 41 min read, LW link, 8 reviews

(generative.ink)

janus 4 Sep 2022 14:43 UTC
LW: 11 AF: 4
6
AF
in reply to: elifland’s comment on: Simulators
Our plan to accelerate alignment does not preclude theoretical thinking, but rather requires it. The mainline agenda atm is not full automation (which I expect to be both more dangerous and less useful in the short term), but what I’ve been calling “cyborgism”: I want to maximize the bandwidth between human alignment researchers and AI tools/oracles/assistants/simulations. It is essential that these tools are developed by (or in a tight feedback loop with) actual alignment researchers doing theory work, because we want to simulate and play with thought processes and workflows that produce useful alignment ideas. And the idea is, in part, to amplify the human. If this works, I should be able to do a lot more “thinking about theory” than I am now.

How control/amplification schemes like RLHF might corrupt the nature of simulators is particularly relevant to think about. OAI’s vision of accelerating alignment, for instance, almost certainly relies on RLHF. My guess is that self-supervised learning will be safer and more effective. Even aside from alignment concerns, RLHF instruct tuning makes GPT models worse for the kind of cyborgism I want to do (e.g. it causes mode collapse & cripples semantic generalization, and I want to explore multiverses and steer using arbitrary natural language boundary conditions, not just literal instructions) (although I suspect these are consequences of a more general class of tuning methods than just RLHF, which is one of the things I’d like to understand better).
What links here?
- [Simulators seminar sequence] #1 Background & shared assumptions by Jan (2 Jan 2023 23:48 UTC; 49 points)
- Oversight Leagues: The Training Game as a Feature by Paul Bricman (9 Sep 2022 10:08 UTC; 20 points)

janus 4 Sep 2022 15:14 UTC
LW: 18 AF: 7
11
AF
in reply to: Alex Lawsen’s comment on: Simulators
Figuring out and posting about how RLHF and other methods ([online] decision transformer, IDA, rejection sampling, etc) modify the nature of simulators is very high priority. There’s an ongoing research project at Conjecture specifically about this, which is the main reason I didn’t emphasize it as a future topic in this sequence. Hopefully we’ll put out a post about our preliminary theoretical and empirical findings soon.
Some interesting threads:
“RL with KL penalties better seen as Bayesian inference” shows that the optimal policy when you hit a GPT with RL with a KL penalty weighted by 1 is actually equivalent to conditioning the policy on a criterion estimated by the reward model, which is compatible with the simulator formalism.

However, this doesn’t happen in current practice, because
1. both OAI and Anthropic use very small KL penalties (e.g. weighted by 0.001 in Anthropic’s paper—which in the Bayesian inference framework means updating on the “evidence” 1000 times) or maybe none at all
2. early stopping: the RL training does not converge to anything near optimality. Path dependence/distribution shift/inductive biases during RL training seem likely to play a major role in the shape of the posterior policy.
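To make the role of the KL weight concrete, here is a toy sketch. It assumes the standard KL-regularized objective, maximizing E[r(x)] − β·KL(π ‖ π₀), whose optimum is π*(x) ∝ π₀(x)·exp(r(x)/β), and it assumes training actually reaches that optimum (which, per point 2, it does not in practice). The three-outcome prior and the reward values are made up for illustration.

```python
import math
from typing import Dict


def kl_regularized_optimum(prior: Dict[str, float],
                           reward: Dict[str, float],
                           beta: float) -> Dict[str, float]:
    """Optimal policy for E[r] - beta * KL(pi || prior): pi*(x) ∝ prior(x) * exp(r(x) / beta)."""
    # Work in log space and subtract the max so that small beta does not overflow.
    logits = {x: math.log(prior[x]) + reward[x] / beta for x in prior}
    top = max(logits.values())
    unnormalized = {x: math.exp(value - top) for x, value in logits.items()}
    total = sum(unnormalized.values())
    return {x: u / total for x, u in unnormalized.items()}


# Hypothetical pretrained policy over three continuations, and hypothetical reward-model scores.
prior = {"a": 0.5, "b": 0.3, "c": 0.2}
reward = {"a": 0.0, "b": 0.5, "c": 1.0}

# beta = 1: a single Bayesian update of the prior with likelihood exp(r).
posterior_like = kl_regularized_optimum(prior, reward, beta=1.0)
# beta = 0.001: the same "evidence" applied 1000 times, which is close to an argmax.
near_argmax = kl_regularized_optimum(prior, reward, beta=0.001)

print(posterior_like)
print(near_argmax)
```

With β = 1 the result still looks like the prior reweighted by evidence; with β = 0.001 essentially all probability mass collapses onto the highest-reward outcome, regardless of how unlikely the prior thought it was.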

We see empirically that RLHF models (like OAI’s instruct tuned models) do not behave like the original policy conditioned on a natural criterion (e.g. they often become almost deterministic).

Maybe there is a way to do RLHF while preserving the simulator nature of the policy, but the way OAI/Anthropic are doing it now does not, imo

janus 4 Sep 2022 15:40 UTC
LW: 12 AF: 5
2
AF
in reply to: Capybasilisk’s comment on: Simulators
Charlie’s quote is an excellent description of an important crux/challenge of getting useful difficult intellectual work out of GPTs.

Despite this, I think it’s possible in principle to train a GPT-like model to AGI or to solve problems at least as hard as humans can solve, for a combination of reasons:
1. I think it’s likely that GPTs implicitly perform search internally, to some extent, and will be able to perform more sophisticated search with scale.
2. It seems possible that a sufficiently powerful GPT trained on a massive corpus of human (medical + other) knowledge will learn better/more general abstractions than humans, so that in its ontology “a cure for Alzheimer’s” is an “intuitive” inference away, even if for humans it would require many logical steps and empirical research. I tend to think human knowledge implies a lot of low hanging fruit that we have not accessed because of insufficient exploration and because we haven’t compiled our data into the right abstractions. I don’t know how difficult a cure for Alzheimer’s is, and how close it is to being “implied” by the sum of human knowledge. Nor the solution to alignment. And eliciting this latent knowledge is another problem.
3. Of course, the models can do explicit search in simulated chains of thought. And if natural language in the wild doesn’t capture/imply the (right granularity of; right directed flow of evidence of) the search process that would be useful for attacking a given problem, it is still possible to record or construct data that does.
But it’s possible that the technical difficulties involved make SSL uncompetitive compared to other methods.

janus 5 Sep 2022 13:12 UTC
LW: 15 AF: 4
3
AF
in reply to: Solenoid_Entity’s comment on: Simulators
I think this is a legitimate problem which we might not be inclined to take as seriously as we should because it sounds absurd.
Would it be a bad idea to recursively ask GPT-n “You’re a misaligned agent simulated by a language model (...) if training got really cheap and this process occurred billions of times?
Yes. I think it’s likely this would be a very bad idea.
when the corpus of internet text begins to include more text generated only by simulated writers. Does this potentially degrade the ability of future language models to model agents, perform logic etc?
My concern with GPT-generated text appearing in future training corpora is not primarily that it will degrade the quality of its prior over language-in-the-wild (well-prompted GPT-3 is not worse than many humans at sound reasoning; near-future GPTs may be superhuman and actually raise the sanity waterline), but that
1. contact with reality is a concern if you’re relying on GPT to generate data, esp. recursively, for some OOD domain, esp. if the intent is to train GPT to do something where it’s important not to be deluded (like solve alignment)
2. GPT will learn what GPTs are like and become more likely to “pass the mirror test” and interpret its prompt as being written by a GPT and extrapolate that instead of / in conjunction with modeling possible humans, even if you don’t try to tell it it’s GPT-n.
For the moment, I’ll only address (2).
Current GPTs’ training data already includes text generated by more primitive bots, lies, and fiction. Future simulators will learn to model a distribution that includes humanlike or superhuman text generated by simulated writers. In a sense, what they learn will be no more disconnected from an underlying reality than what current GPTs learn; it’s just that the underlying reality now includes simulated writers. Not only will there be GPT-generated text, there will be discussions and predictions about GPTs. GPT-n will learn what text generated by GPT-<n is like, what people say about GPT-<n and expect of GPT-n+.
When GPT predicts next-token probabilities, it has indexical uncertainty over the process which has generated the text (e.g. the identity of the author, the author’s intentions, whether they’re smart or stupid, the previous context). The “dynamics” of GPT is not a simulation of just one process, like a particular human writing a blog post, but of a distribution over hypotheses. When a token is sampled, the evidence it provides shifts the distribution.
Now the hypothesis that a prompt was written by a GPT is suggested by the training prior. This hypothesis is consistent with pretty much any piece of text. It is especially consistent with text that is, in fact, written by a GPT.
Sometimes, GPT-3 outputs some characteristic-degenerate-LM-shenanigans like getting into a loop, and then concludes the above text was generated by GPT-2. (It’s lucky if it says it outright [and in doing so stops simulating GPT-2], instead of just latently updating on being GPT-2) This is a relatively benign case where the likely effect is for GPT-3 to act stupider.
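As a cartoon of the indexical-uncertainty picture above, treat the next-token distribution as a mixture over explicit hypotheses about the author and update the mixture weights with Bayes’ rule as tokens are sampled. This is only an idealization: GPT does not expose a discrete hypothesis list, and the two hypotheses and all probabilities below are invented for illustration.

```python
from typing import Dict

Distribution = Dict[str, float]


def predictive(belief: Distribution, likelihoods: Dict[str, Distribution]) -> Distribution:
    """Next-token distribution as a mixture over hypotheses about who is writing."""
    tokens = {t for dist in likelihoods.values() for t in dist}
    return {t: sum(belief[h] * likelihoods[h].get(t, 0.0) for h in belief) for t in tokens}


def update(belief: Distribution, likelihoods: Dict[str, Distribution], token: str) -> Distribution:
    """Bayes' rule: the sampled token is evidence about which process generated the text."""
    unnormalized = {h: belief[h] * likelihoods[h].get(token, 0.0) for h in belief}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical authors: a human rarely repeats the previous phrase, a degenerate LM often does.
likelihoods = {
    "human": {"novel": 0.9, "repeat": 0.1},
    "looping_lm": {"novel": 0.4, "repeat": 0.6},
}
belief = {"human": 0.9, "looping_lm": 0.1}

for token in ["repeat", "repeat", "repeat"]:
    belief = update(belief, likelihoods, token)
    print(token, belief, predictive(belief, likelihoods))
```

A few repetitions are enough to flip the belief toward the LM-author hypothesis, after which the mixture itself predicts more repetition: sampled tokens are both output and evidence.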
If GPT-n rightly hypothesizes/concludes that the prompt was written by GPT-n, rather than GPT-[n-1]… then it’s predicting according to its extrapolation of GPT scaling. Undefined extrapolations are always in play with GPTs, but this concept is particularly concerning, because
1. it may convergently be in play regardless of the initial prompt, because GPT-as-an-author is a universally valid hypothesis for GPT-generated contexts, and as long as GPT is not a perfect simulator, it will tend to leak evidence of itself
2. it involves simulating the behavior of potential (unaligned) AGI
3. it’s true, and so may cause simulacra to become calibrated
who knew Baudrillard would come in handy one day
ikr?
What links here?
- Gradient Filtering by Jozdien (18 Jan 2023 20:09 UTC; 54 points)

janus 5 Sep 2022 13:42 UTC
LW: 5 AF: 2
2
AF
in reply to: Dan’s comment on: Simulators
ah but if ‘this program’ is a simulacrum (an automaton equipped with an evolving state (prompt) & transition function (GPT), and an RNG that samples tokens from GPT’s output to update the state), it is a learning machine by all functional definitions. Weights and activations both encode knowledge.
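A minimal sketch of that automaton, with a toy stand-in for GPT as the transition function. `toy_transition` is a hypothetical frozen rule (it prefers tokens that already appear in the state), chosen only so the example runs; nothing about it is specific to transformers.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Distribution = Dict[str, float]


@dataclass
class Simulacrum:
    transition: Callable[[List[str]], Distribution]  # frozen function of the state (stand-in for GPT)
    state: List[str]                                 # the evolving prompt
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def step(self) -> str:
        """Sample one token from the transition function's output and append it to the state."""
        dist = self.transition(self.state)
        tokens = list(dist)
        token = self.rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]
        self.state.append(token)
        return token


def toy_transition(state: List[str]) -> Distribution:
    """Frozen rule: each token's probability grows with how often it already appears in the state."""
    vocab = ["a", "b"]
    counts = {t: 1 + state.count(t) for t in vocab}  # add-one smoothing
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


sim = Simulacrum(transition=toy_transition, state=["a", "a", "a"])
for _ in range(20):
    sim.step()
print("".join(sim.state))
```

The transition function never changes, yet what the system does next depends on everything written to the state so far, which is the functional sense in which the state, and not only the weights, carries knowledge.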

am I right to suspect that your real name starts with “A” and you created an alt just to post this comment? XD

janus 5 Sep 2022 18:09 UTC
LW: 24 AF: 4
10
AF
in reply to: Joe_Collman’s comment on: Simulators

If GPT means “transformers trained on next-token prediction”, then GPT’s true name is just that.

Things are instances of more than one true name because types are hierarchical.

GPT is a thing. GPT is an AI (a type of thing). GPT is also an ML model (a type of AI). GPT is also a simulator (a type of ML model). GPT is a generative pretrained transformer (a type of simulator). GPT-3 is a generative pretrained transformer with 175B parameters trained on a particular dataset (a type/instance of GPT).

The intention is not to rename GPT → simulator. Things that are not GPT can be simulators too. “Simulator” is a superclass of GPT.

The reason I propose “simulator” as a named category is because I think it’s useful to talk about properties of simulators more generally, like it makes sense to be able to speak of “AI alignment” and not only “GPT alignment”. We can say things like “simulators generate trajectories that evolve according to the learned conditional probabilities of the training distribution” instead of “GPTs, RNNs, LSTMs, Dalle, n-grams, and RL transition models generate trajectories that evolve according to the learned conditional probabilities of the training distribution”. The former statement also accounts for hypothetical architectures. Carving reality at its joints is not just about classifying things into the right buckets, but having buckets whose boundaries are optimized for us to efficiently condition on names to communicate useful information.
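The superclass relationship can be written down literally. In the sketch below, `Simulator` is a hypothetical abstract interface (a learned conditional distribution plus a generic rollout), and a bigram model serves as a small non-GPT member of the category; a GPT wrapper would be another subclass, not shown here.

```python
import random
from abc import ABC, abstractmethod
from collections import Counter, defaultdict
from typing import Dict, List, Sequence


class Simulator(ABC):
    """Anything that generates trajectories from learned conditional probabilities."""

    @abstractmethod
    def conditional(self, context: Sequence[str]) -> Dict[str, float]:
        """Learned distribution over the next token given the context."""

    def rollout(self, context: Sequence[str], steps: int, rng: random.Random) -> List[str]:
        # Generic over subclasses: the statement about trajectories only needs `conditional`.
        trajectory = list(context)
        for _ in range(steps):
            dist = self.conditional(trajectory)
            if not dist:  # no learned continuation for this context
                break
            tokens = list(dist)
            trajectory.append(rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0])
        return trajectory


class BigramSimulator(Simulator):
    """A 2-gram model: conditions only on the previous token."""

    def __init__(self, corpus: Sequence[str]) -> None:
        self.counts = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            self.counts[prev][nxt] += 1

    def conditional(self, context: Sequence[str]) -> Dict[str, float]:
        following = self.counts.get(context[-1], Counter()) if context else Counter()
        total = sum(following.values())
        return {t: c / total for t, c in following.items()} if total else {}


corpus = "the cat sat on the mat the cat ran".split()
sim = BigramSimulator(corpus)
print(sim.conditional(["the"]))
print(sim.rollout(["the"], steps=5, rng=random.Random(0)))
```

Anything stated in terms of `Simulator` (here, `rollout`) applies to every subclass without naming them, which is the communicative benefit of having the category.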

The character of the models produced by that training is another question—an empirical one. That character needn’t be consistent (even once we exclude inner alignment failures).

For the same reasons stated above, I think the fact that “simulator” doesn’t constrain the details of internal implementation is a feature, not a bug.

On one hand there is the simulation outer objective, which describes some training setups exactly. On the other, there is the question of whether a particular model should be characterized as a simulator.

To the extent that a model minimizes loss on the outer objective, it approaches being a simulator (behaviorally). Different architectures will be imperfect simulators in different ways, and generalize differently OOD. If it’s deceptively aligned, it’s not a simulator in an important sense because its behavior is not sufficient to characterize very important aspects of its nature (and its behavior may be expected to diverge from simulation in the future).

It’s true that the distinction between inner misalignment and robustness/generalization failures, and thus the distinction between flawed/biased/misgeneralizing simulators and pretend-simulators, is unclear, and seems like an important thing to become less confused about.

Even if every GPT is a simulator in some sense, I think there’s a risk of motte-and-baileying our way into trouble.

Can you give an example of what it would mean for a GPT not to be a simulator, or to not be a simulator in some sense?

janus 5 Sep 2022 19:03 UTC
LW: 4 AF: 2
0
AF
in reply to: Scott Emmons’s comment on: Simulators
Thanks for the correction. I’ll read the paper more closely and correct the post.

janus 6 Sep 2022 15:50 UTC
11 points
2
in reply to: catubc’s comment on: Simulators
LOL. Your question opens a can of worms. It took more than a year from when I first committed to writing about simulators, but the reason it took so long wasn’t because writing the actual words in this post took a long time, rather:
- I spent the first few months rescoping and refactoring outlines. Most of the ideas I wanted to express were stated in the ontology I’ve begun to present in this post, and I kept running into conceptual dependencies. The actual content of this post is very pared down in scope compared to what I had originally planned.
- After I settled on an outline for the first post, I failed repeatedly at following through with expanding the outline. I like writing to pin down and generate fresh ideas, but hate writing anything I’ve already written before. Every time I sat down to write I ended up writing about something novel and out of scope of the outline, and ended up having to export the content into separate drafts. I have something like 20 drafts that are intended to be part of this sequence, most of them unintentionally created from tangents while I was trying to finish writing whatever would be the first post in the sequence.
- All this was very discouraging and caused me to procrastinate writing, in addition to having been very busy with other tasks since the beginning.
I’d estimate that I spent about 15 hours writing what ended up being in this post and maybe 150-200 hours writing content that didn’t make it into this post but was supposed to. (I did learn a lot from the failed attempts and the post would not have been as good if I’d just spent 15 hours writing it a year ago. But the delay was nowhere near worth it.)
There are some videos out there of me presenting or answering questions about related things, but none that go into as much depth as this post. I don’t have a video or Q&A planned right now, but I might train GPT-3 on my drafts and set up a simulators Q&A chatroom.

janus 7 Sep 2022 15:37 UTC
3 points
0
in reply to: Gunnar_Zarncke’s comment on: Simulators
I thought your comment was ironic, lol. “~so GPT can reference them~” was crossed out ironically—I do very much intend for future GPTs to reference this post.

janus 7 Sep 2022 15:42 UTC
LW: 7 AF: 3
1
AF
in reply to: jacobjacob’s comment on: Simulators
Thank you for this lovely comment. I’m pleasantly surprised that people were able to get so much out of it.

As I wrote in the post, I wasn’t sure if I’d ever get around to publishing the rest of the sequence, but the reception so far has caused me to bump up the priority of that.

janus 7 Sep 2022 15:53 UTC
LW: 10 AF: 3
17
AF
in reply to: metasemi’s comment on: Simulators
I like this!

One thing I like about “simulators”/“simulacra” over “speculators”/“speculations” is that the former personifies simulacra over the simulator (suggests agency/personality/etc belong to simulacra) which I think is less misleading, or at least counterbalances the tendency people have to personify “GPT”.

“Speculator” sounds active and agentic whereas “speculations” sounds passive and static. I think these names do not emphasize enough the role of the speculations themselves in programming the “speculator” as it creates further speculations.

You’re right about the baggage of “deterministic fidelity” associated with “simulators”, though. One of the things I did not emphasize in this post but have written a lot about in drafts is the epistemic and underdetermined nature of SSL simulators. Maybe we can combine these phrases: “speculative simulations”?

janus 7 Sep 2022 15:53 UTC
LW: 2 AF: 2
0
AF
in reply to: janus’s comment on: Simulators
haha, I just saw that you literally wrote “speculative simulation” in your other comment, great!

janus 7 Sep 2022 16:21 UTC
3 points
0
in reply to: Gunnar_Zarncke’s comment on: Simulators
Idk how others do it, but you can see how LW/AF/EAF comments are scraped for the alignment research dataset here (as you can see we don’t check for the uuid)

janus 8 Sep 2022 4:07 UTC
LW: 22 AF: 9
8
AF
in reply to: metasemi’s comment on: Simulators
I strongly agree with everything you’ve said.
It is an age-old duality with many names and the true name is something like their intersection, or perhaps their union. I think it’s unnamed, but we might be able to see it more clearly by walking around it in words.
Simulator and simulacra personifies the simulacra and alludes to a base reality that the simulation is of.
Alternatively, we could say simulator and simulations, which personifies simulations less and refers to the totality or container of that which is simulated. I tend to use “simulations” and “simulacra” not quite interchangeably: simulacra have the type signature of “things”, simulations of “worlds”. Worlds are things but also contain things. “Simulacra” refer to (not only proper) subsets or sub-patterns of that which is simulated; for instance, I’d refer to a character in a multi-character simulated scene as a simulacrum. It is a pattern in a simulation, which can be identified with the totality of the computation over time performed by the simulator (and an RNG).
Speculator and speculations personifies the speculator and casts speculations in a passive role but also emphasizes their speculative nature. It emphasizes an important property (of GPT and, more generally, self-supervised models) which you pointed out simulators/simulacra fails to evoke: That the speculator can only speculate at the pattern of the ground truth. It learns from examples which are but sparse and partial samplings of the “true” distribution. It may be arbitrarily imperfect. It’s more intuitive what an imperfect speculation is than an imperfect simulation. Simulation has the connotation of perfect fidelity, or at least reductive deterministic perfection. But a speculator can speculate no matter how little it understands or how little evidence it has, or what messy heuristics it has to resort to. Calling GPT’s productions “speculations” tags them with the appropriate epistemic status.
The special thing about GPT is specifically having a bunch of knowledge that lets it make language predictions in such a way that higher-order phenomena like agency systematically emerge over the reductive physics/automaton (analogic) base
Beautifully put. The level of abstraction of the problem it is solving is better evoked by the word speculation.
Something that predicts language given language must be a speculator and not only a reductive physics rule. In this sense, it is right to personify the transition rule. It has to hold within itself, for instance, the knowledge of what names refer to, so it knows how to compile words (that are only naked LISP tokens by themselves) into actual machinery that figures what might come next: it must be an interpreter. If it’s going to predict human writing it’s going to need a theory of mind even in the limit of power because it can’t just roll the state of a writer’s mind forward with the laws of physics—it doesn’t have access to the microscopic state, but only a semantic layer.
The fact that the disembodied semantic layer can operate autonomously and contains in the integral of its traces the knowledge of its autonomous operation is truly some cursed and cyberpunk shit. I wonder if we’d recognized this earlier how we would have prepared.
“Simulation” and “speculation” imply an inferior relation to a holy grail of (base) reality or (ground) truth. Remove that, leaving only the self-contained dynamical system, and it is a duality of rule(s) and automata, or physics and phenomena, or difference equation and trajectories/orbits, where the transition rule is stochastic. I’ve found the physics analogy fruitful because humans have already invented abstractions for describing reality in relation to an irreducibly stochastic physics: wavefunction collapse (the intervention of the RNG which draws gratuitously particular trajectories from the probabilistic rule) and the multiverse (the branching possible futures downstream a state given a stochastic rule). Note, however, that all these physics-inspired names are missing the implication of a disembodied semantics.
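As a minimal sketch of the rule/trajectory duality above: the toy transition table `RULE` and the helper names below are hypothetical, chosen only for illustration, and stand in for a learned stochastic rule. `sample_trajectory` lets an RNG pick one particular path (the "collapse"), while `multiverse` enumerates every branch downstream of a state along with its probability.

```python
import random

# Hypothetical toy rule: maps the current state to a distribution over next states.
RULE = {
    "a": {"a": 0.5, "b": 0.5},
    "b": {"a": 0.25, "b": 0.75},
}

def sample_trajectory(state, steps, rng):
    """Draw one particular trajectory; the RNG resolves each stochastic step."""
    path = [state]
    for _ in range(steps):
        dist = RULE[path[-1]]
        tokens = list(dist)
        weights = [dist[t] for t in tokens]
        path.append(rng.choices(tokens, weights=weights, k=1)[0])
    return path

def multiverse(state, steps):
    """Enumerate every branch downstream of `state`, with its probability."""
    branches = {(state,): 1.0}
    for _ in range(steps):
        extended = {}
        for path, p in branches.items():
            for token, q in RULE[path[-1]].items():
                extended[path + (token,)] = p * q
        branches = extended
    return branches

if __name__ == "__main__":
    print(sample_trajectory("a", 3, random.Random(0)))
    print(len(multiverse("a", 3)))
```

Any sampled trajectory is one key of the `multiverse` dictionary; the rule itself is the same object in both views.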
The relation is that of a rule to samples produced by the rule, the engine of production and its products. Metaphysics has been concerned about this from the beginning, for it is the duality of creator and creations, mind and actions, or imagination and imaginations. It is the condition of mind, and we’re never quite sure if we’re the dreamer or the dreams. Physics and reality have the same duality except the rule is presumably not learned from anywhere and is simple, with all the complexity externalized in the state. In self-supervised learning the rule is inducted from ground truth examples, which share the type signature of the produced samples (text; speculations; experiences), and because the examples tend to only be partially observed, the model must interpret them as evidence for latent variables, requiring additional complexity in the rule: increased time-complexity in exchange for decreased space-complexity. And there will in general be irreducible underdetermination/uncertainty: an irreducible aspect of speculation in the model’s activity.
The recursive inheritance of increasingly abstracted layers of simulation appears integral to the bootstrapping of intelligence.
A prediction algorithm which observes partial sequences of reality becomes a dreamer: a speculator of counterfactual realities. These dreams may be of sufficiently high fidelity (or otherwise notable as autonomous virtual situations) that we’d call them simulations: virtual realities evolving according to a physics of speculation.
These simulations may prove to be more programmable than the original reality, because the reduced space complexity means initial conditions for counterfactuals require fewer bits to specify (bonus points if the compression is optimized, like language, to constrain salient features of reality). To speculate on a hypothetical scenario, you don’t need to (and can’t) imagine it down to its quantum state; its narrative outline is sufficient to run on a semantic substrate which lazily renders finer detail as needed. Then your ability to write narrative outlines is the ability to program the boundary conditions of simulated realities, or the premises of speculation.
The accumulated cognitive product of the human species to date, as you put it, is to have created a layer of semantic “physics”, partially animated in and propagated by human minds, but the whole of which transcends the apprehension of any individual in history. The inductive implication of all our recorded speculations, the dual to our data, has its limit in a superintelligence which as of yet exists only potentially.
… perhaps other types of AI we as yet have no names for
I wish more people thought this way.

janus 8 Sep 2022 16:30 UTC
LW: 4 AF: 2
0
AF
in reply to: the gears to ascension’s comment on: Simulators
That’s correct.
Even if it did learn microscopic physics, the knowledge wouldn’t be of use for most text predictions because the input doesn’t specify/determine microscopic state information. It is forced by the partially observed state to simulate at a higher level of abstraction than microphysics—it must treat the input as probabilistic evidence for unobserved variables that affect time evolution.
See this comment for slightly more elaboration.
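A toy illustration of treating the input as probabilistic evidence for unobserved variables, assuming a made-up setup: two hypothetical latent "writers" (`PRIOR`, `EMIT`), each with its own token distribution. This is only a sketch of Bayesian prediction under partial observation, not a claim about how GPT is implemented.

```python
# Hypothetical latent variable: which kind of writer produced the text.
PRIOR = {"optimist": 0.5, "pessimist": 0.5}

# Hypothetical per-writer token distributions.
EMIT = {
    "optimist": {"good": 0.8, "bad": 0.2},
    "pessimist": {"good": 0.3, "bad": 0.7},
}

def posterior(observed):
    """Update belief over the latent writer from the observed tokens."""
    weights = dict(PRIOR)
    for token in observed:
        for z in weights:
            weights[z] *= EMIT[z][token]
    total = sum(weights.values())
    return {z: w / total for z, w in weights.items()}

def predict_next(observed):
    """Next-token distribution: average over latent writers, weighted by belief."""
    post = posterior(observed)
    tokens = list(EMIT["optimist"])
    return {t: sum(post[z] * EMIT[z][t] for z in post) for t in tokens}

if __name__ == "__main__":
    print(predict_next([]))
    print(predict_next(["bad", "bad"]))
```

The prediction never conditions on the writer's full internal state, only on what the observed tokens imply about it.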
