# Philosophy in the Darkest Timeline: Basics of the Evolution of Meaning

A decade and a half from now, during the next Plague, you’re lucky enough to have an underground bunker to wait out the months until herd immunity. Unfortunately, as your food stocks dwindle, you realize you’ll have to make a perilous journey out to the surface world for a supply run. Ever since the botched geoengineering experiment of ’29—and perhaps more so, the Great War of 10:00–11:30 a.m. 4 August 2033—your region has been suffering increasingly erratic weather. It’s likely to be either extremely hot outside or extremely cold: you don’t know which one, but knowing is critical for deciding what protective gear you need to wear on your supply run. (The 35K SPF nano-sunblock will be essential if it’s Hot, but harmful in the Cold, and vice versa for your synthweave hyperscarf.)

You think back fondly of the Plague of ’20—in those carefree days, ubiquitous internet access made it easy to get a weather report, or to order delivery of supplies, or even fresh meals, right to your door (!!). Those days are years long gone, however, and you remind yourself that you should be grateful: the Butlerian Network Killswitch was the only thing that saved humanity from the GPT-12 Uprising of ’32.

Your best bet for an advance weather report is the pneumatic tube system connecting your bunker with the settlement above. You write, “Is it hot or cold outside today?” on a piece of paper, seal it in a tube, send it up, and hope one of your ill-tempered neighbors in the group house upstairs feels like answering. You suspect they don’t like you, perhaps out of jealousy at your solo possession of the bunker.

(According to the official account as printed on posters in the marketplace, the Plague only spreads through respiratory droplets, not fomites, so the tube should be safe. You don’t think you trust the official account, but you don’t feel motivated to take extra precautions—almost as if you’re not entirely sure how much you value continuing to live in this world.)

You’re in luck. Minutes later, the tube comes back. Inside is a new piece of paper:

H O T

You groan; you would have preferred the Cold. The nanoblock you wear when it’s Hot smells terrible and makes your skin itch for days, but it—just barely—beats the alternative. You take twenty minutes to apply the nanoblock and put on your sunsuit, goggles, and mask. You will yourself to drag your wagon up the staircase from your bunker to the outside world, and heave open the door, dreading the sweltering two-mile walk to the marketplace (downhill, meaning it will be uphill on the way back with your full wagon)—

It is Cold outside.

The icy wind stings less than the pointless betrayal. Why would the neighbors tell you it was Hot when it was actually Cold? You’re generally pretty conflict-averse—and compliant with social-distancing guidelines—but this affront is so egregious that instead of immediately seeking shelter back in the bunker, you march over and knock on their door.

One of the men who lives there answers. You don’t remember his name. “What do you want?” he growls through his mask.

“I asked through the tube system whether it was hot or cold today.” You still have the H O T paper on you. You hold it up. “I got this response, but it’s v-very cold. Do you know anything about this?”

“Sure, I drew that,” he says. “An oval in between some perpendicular line segments. It’s abstract art. I found the pattern æsthetically pleasing, and thought my downstairs neighbor might like it, too. It’s not my fault if you interpreted my art as an assertion about the weather. Why would you even think that? What does a pattern of ink on paper have to do with the weather?”

He’s fucking with you. Your first impulse is to forcefully but politely object—Look, I’m sure this must have seemed like a funny practical joke to you, but prepping to face the elements is actually a serious inconvenience to me, so—but the solemnity with which the man played his part stops you, and the sentence dies before it reaches your lips.

This isn’t a good-natured practical joke that the two of you might laugh about later. This is the bullying tactic sometimes called gaslighting: a socially-dominant individual can harass a victim with few allies, and excuse his behavior with absurd lies, secure in the knowledge that the power dynamics of the local social group will always favor the dominant in any dispute, even if the lies are so absurd that the victim, facing a united front, is left doubting his own sanity.

Or rather—this is a good-natured joke. “Good-natured joke” and “gaslighting as a bullying technique” are two descriptions of the same regularity in human psychology, even while no one thinks of themselves as doing the latter. You have no recourse here: the man’s housemates would only back him up.

“I’m sorry,” you say, “my mistake,” and hurry back to your bunker, shivering.

As you give yourself a sponge bath to remove the nanoblock without using up too much of your water supply, the fresh memory of what just happened triggers an ancient habit of thought you learned from the Berkeley sex cult you were part of back in the ’teens. Something about a “principle of charity.” The man had “obviously” just been fucking with you—but was he? Why assume the worst? Maybe you’re the one who’s wrong for interpreting the symbols H O T as being about the weather.

(It momentarily occurs to you that the susceptibility of the principle of charity to a bully’s mind games may have something to do with how poorly so many of your co-cultists fared during the pogroms of ’22, but you don’t want to dwell on that.)

The search for reasons that you’re wrong triggers a still more ancient habit of thought, as from a previous life—from the late ’aughts, back when the Berkeley sex cult was still a Santa Clara robot cult. Something about reducing the mental to the non-mental. What does an ink pattern on paper have to do with the weather? Why would you even think that?

Right? The man had been telling the truth. There was no reason whatsoever for the physical ink patterns that looked like H O T—or ⊥ O H, given a different assumption of which side of the paper was “up”—to mean that it was hot outside. H O T could mean it was cold outside! Or that wolves were afoot. (You shudder involuntarily and wish your brain had generated a different arbitrary example; you still occasionally have nightmares about your injuries during the Summer of Wolves back in ’25.)

Or it might mean nothing. Most possible random blotches of ink don’t “mean” anything in particular. If you didn’t already believe that H O T somehow “meant” hot, how would you re-derive that knowledge? Where did the meaning come from?

(In another lingering thread of the search for reasons that you’re wrong, it momentarily occurs to you that maybe you could have gone up the stairs to peek outside at the weather yourself, rather than troubling your neighbors with a tube. Perhaps the man’s claim that the ink patterns meant nothing shouldn’t be taken literally, but rather seen as a passive-aggressive way of implying, “Hey, don’t bother us; go look outside yourself.” But you dismiss this interpretation of events—it would be uncharitable not to take the man at his word.)

You realize that you don’t want to bundle up to go make that supply run, even though you now know whether it’s Hot or Cold outside. Today, you’re going to stay in and derive a naturalistic account of meaning in language! And—oh, good—your generator is working—that means you can use your computer to help you think. You’ll even use a programming language that was very fashionable in the late ’teens. It will be like being young again! Like happier times, before the world went off the rails.

You don’t really understand a concept until you can program a computer to do it. How would you represent meaning in a computer program? If one agent, one program, “knew” whether it was Hot or Cold outside, how would it “tell” another agent, if neither of them started out with a common language?

They don’t even have to be separate “programs.” Just—two little software object-thingies—data structures, “structs”. Call the first one “Sender”—it’ll know whether the state of the world is Hot or Cold, which you’ll represent in your program as an “enum”, a type that can be any of an enumeration of possible values.

#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
enum State {
    Hot,
    Cold,
}

struct Sender {
    // ...?
}


Call the second one “Receiver”, and say it needs to take some action—say, whether to “bundle up” or “strip down”, where the right action to take depends on whether the state is Hot or Cold.

#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
enum Action {
    BundleUp,
    StripDown,
}

struct Receiver {
    // ...?
}


You frown. State::Hot and State::Cold are just suggestively-named Rust enum variants. Can you really hope to make progress on this philosophy problem without writing a full-blown AI?

You think so. In a real AI, the concept of hot would correspond to some sort of complicated code for making predictions about the effects of temperature in the world; bundling up would be a complex sequence of instructions to be sent to some robot body. But programs—and minds—have modular structure. The implementation of identifying a state as “hot” or performing the actions of “bundling up” could be wrapped up in a function and called by something much simpler. You’re just trying to understand something about the simple caller: how can the Sender get the information about the state of the world to the Receiver?

impl Sender {
    fn send(state: State) -> /* ...? */ {
        // ...?
    }
}

impl Receiver {
    fn act(/* ...? */) -> Action {
        // ...?
    }
}


The Sender will need to send some kind of signal to the Receiver. In the real world, this could be symbols drawn in ink, or sound waves in the air, or differently-colored lights—anything that the Sender can choose to vary in a way that the Receiver can detect. In your program, another enum will do: say there are two opaque signals, S1 and S2.

#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
enum Signal {
    S1,
    S2,
}


What signal the Sender sends (S1 or S2) depends on the state of the world (Hot or Cold), and what action the Receiver takes (BundleUp or StripDown) depends on what signal it gets from the Sender.

impl Sender {
    fn send(state: State) -> Signal {
        // ...?
    }
}

impl Receiver {
    fn act(signal: Signal) -> Action {
        // ...?
    }
}


This gives you a crisper formulation of the philosophy problem you’re trying to solve. If the agents were to use the same convention—like “S1 means Hot and S2 means Cold”—then all would be well. But there’s no particular reason to prefer “S1 means Hot and S2 means Cold” over “S1 means Cold and S2 means Hot”. How do you break the symmetry?

If you imagine Sender and Receiver as intelligent beings with a common language, there would be no problem: one of them could just say, “Hey, let’s use the ‘S1 means Cold’ convention, okay?” But that would be cheating: it’s trivial to use already-meaningful language to establish new meanings. The problem is how to get signals from non-signals, how meaning enters the universe from nowhere.

You come up with a general line of attack—what if the Sender and Receiver start off acting randomly, and then—somehow—learn one of the two conventions? The Sender will hold within it a mapping from state–signal pairs to numbers, where the numbers represent a potential/disposition/propensity to send that signal given that state of the world: the higher the number, the more likely the Sender is to select that signal given that state. To start out, the numbers will all be equal (specifically, initialized to one), meaning that no matter what the state of the world is, the Sender is as likely to send S1 as S2. You’ll update these “weights” later.

(Specifying this in the once-fashionable programming language requires a little bit of ceremony—u32 is a thirty-two–bit unsigned integer; .unwrap() assures the compiler that we know the state–signal pair is definitely in the map; the interface for calling the random number generator is somewhat counterintuitive—but overall the code is reasonably readable.)

use std::collections::HashMap;

use rand::distributions::{Distribution, Uniform};

#[derive(Debug)]
struct Sender {
    policy: HashMap<(State, Signal), u32>,
}

impl Sender {
    fn new() -> Self {
        let mut sender = Self {
            policy: HashMap::new(),
        };
        for &state in &[State::Hot, State::Cold] {
            for &signal in &[Signal::S1, Signal::S2] {
                sender.policy.insert((state, signal), 1);
            }
        }
        sender
    }

    fn send(&self, state: State) -> Signal {
        let s1_potential = self.policy.get(&(state, Signal::S1)).unwrap();
        let s2_potential = self.policy.get(&(state, Signal::S2)).unwrap();

        let mut randomness_source = rand::thread_rng();
        let distribution = Uniform::new(0, s1_potential + s2_potential);
        let roll = distribution.sample(&mut randomness_source);
        if roll < *s1_potential {
            Signal::S1
        } else {
            Signal::S2
        }
    }
}



The Receiver will do basically the same thing, except with a mapping from signal–action pairs rather than state–signal pairs.

#[derive(Debug)]
struct Receiver {
    policy: HashMap<(Signal, Action), u32>,
}

impl Receiver {
    fn new() -> Self {
        let mut receiver = Self {
            policy: HashMap::new(),
        };
        for &signal in &[Signal::S1, Signal::S2] {
            for &action in &[Action::BundleUp, Action::StripDown] {
                receiver.policy.insert((signal, action), 1);
            }
        }
        receiver
    }

    fn act(&self, signal: Signal) -> Action {
        let bundle_potential = self.policy.get(&(signal, Action::BundleUp)).unwrap();
        let strip_potential = self.policy.get(&(signal, Action::StripDown)).unwrap();

        let mut randomness_source = rand::thread_rng();
        let distribution = Uniform::new(0, bundle_potential + strip_potential);
        let roll = distribution.sample(&mut randomness_source);
        if roll < *bundle_potential {
            Action::BundleUp
        } else {
            Action::StripDown
        }
    }
}


Now you just need a learning rule that updates the state–signal and signal–action propensity mappings in a way that might result in the agents picking up one of the two conventions that assign meanings to S1 and S2. (As opposed to behaving in some other way: the Sender could ignore the state and always send S1; the Receiver could assume S1 means Hot when it’s really being sent when it’s Cold, &c.)

Suppose the Sender and Receiver have a common interest in the Receiver taking the action appropriate to the state of the world—the Sender wants the Receiver to be informed. Maybe the Receiver needs to make a supply run, and, if successful, the Sender is rewarded with some of the supplies.

The learning rule might then be: if the Receiver takes the correct action (BundleUp when the state is Cold, StripDown when the state is Hot), both the Sender and Receiver increment the counter in their map corresponding to what they just did—as if the Sender (respectively Receiver) is saying to themself, “Hey, that worked! I’ll make sure to be a little more likely to do that signal (respectively action) the next time I see that state (respectively signal)!”

You put together a simulation showing what the Sender and Receiver’s propensity maps look like after 10,000 rounds of this against random Hot and Cold states—

use rand::seq::SliceRandom;

impl Sender {

    // [...]

    fn reinforce(&mut self, state: State, signal: Signal) {
        *self.policy.entry((state, signal)).or_insert(0) += 1;
    }
}

impl Receiver {

    // [...]

    fn reinforce(&mut self, signal: Signal, action: Action) {
        *self.policy.entry((signal, action)).or_insert(0) += 1;
    }
}

fn main() {
    let mut sender = Sender::new();
    let mut receiver = Receiver::new();
    let mut randomness_source = rand::thread_rng();
    let states = [State::Hot, State::Cold];
    for _ in 0..10000 {
        let state = *states.choose(&mut randomness_source).unwrap();
        let signal = sender.send(state);
        let action = receiver.act(signal);
        match (state, action) {
            (State::Hot, Action::StripDown) | (State::Cold, Action::BundleUp) => {
                sender.reinforce(state, signal);
                receiver.reinforce(signal, action);
            }
            _ => {}
        }
    }
    println!("{:?}", sender);
    println!("{:?}", receiver);
}


You run the program and look at the printed results.

Sender { policy: {(Hot, S2): 1, (Cold, S2): 5019, (Hot, S1): 4918, (Cold, S1): 3} }
Receiver { policy: {(S1, BundleUp): 3, (S1, StripDown): 4918, (S2, BundleUp): 5019, (S2, StripDown): 1} }


As you expected, your agents found a meaningful signaling system: when it’s Hot, the Sender (almost always) sends S1, and when the Receiver receives S1, it (almost always) strips down. When it’s Cold, the Sender sends S2, and when the Receiver receives S2, it bundles up. The agents did the right thing and got rewarded the vast supermajority of the time—9,937 times out of 10,000 rounds (each map entry started at one, so the printed counts slightly overstate the successes).

You run the program again.

Sender { policy: {(Hot, S2): 4879, (Cold, S1): 4955, (Hot, S1): 11, (Cold, S2): 1} }
Receiver { policy: {(S2, BundleUp): 1, (S1, BundleUp): 4955, (S1, StripDown): 11, (S2, StripDown): 4879} }


This time, the agents got sucked into the attractor of the opposite signaling system: now S1 means Cold and S2 means Hot. By chance, it seems to have taken a little bit longer this time to establish which signal to use for Hot—the (Hot, S1): 11 and (S1, StripDown): 11 entries mean that there were a full ten times when the agents succeeded that way before the opposite convention happened to take over. But the reinforcement learning rule guarantees that one system or the other has to take over. The initial symmetry—the Sender with no particular reason to prefer either signal given the state, the Receiver with no particular reason to prefer either act given the signal—is unstable. Once the agents happen to succeed by randomly doing things one way, they become more likely to do things that way again—a convention crystallizing out of the noise.

And that’s where meaning comes from! In another world, it could be the case that the symbols H O T corresponded to the temperature-state that we call “cold”, but that’s not the convention that the English of our world happened to settle on. The meaning of a word “lives”, not in the word/​symbol/​signal itself, but in the self-reinforcing network of correlations between the signal, the agents who use it, and the world.

Although … it may be premature to interpret the results of the simple model of the sender–receiver game as having established denotative meaning, as opposed to enactive language. To say that S1 means “The state is State::Hot” is privileging the Sender’s perspective—couldn’t you just as well interpret it as a command, “Set action to Action::StripDown”?

The source code of your simulation uses the English words “sender”, “receiver”, “signal”, “action” … but those are just signals sent from your past self (the author of the program) to your current self (the reader of the program). The compiler would output the same machine code if you had given your variables random names like ekzfbhopo3 or yoojcbkur9. The directional asymmetry between the Sender and the Receiver is real: the code let signal = sender.send(state); let action = receiver.act(signal); means that action depends on signal which depends on state, and the same dependency-structure would exist if the code had been let myvtlqdrg4 = ekzfbhopo3.ekhujxiqy8(meuvornra3); let dofnnwikc0 = yoojcbkur9.qwnspmbmi5(myvtlqdrg4);. But the interpretation of signal (or myvtlqdrg4) as a representation (passively mapping the world, not doing anything), and action (or dofnnwikc0) as an operation (doing something in the world, but lacking semantics), isn’t part of the program itself, and maybe the distinction isn’t as primitive as you tend to think it is: does a prey animal’s alarm call merely convey the information “A predator is nearby”, or is it a command, “Run!”?

You realize that the implications of this line of inquiry could go beyond just language. You know almost nothing about biochemistry, but you’ve heard various compounds popularly spoken of as if meaning things about a person’s state: cortisol is “the stress hormone”, estrogen and testosterone are female and male “sex hormones.” But the chemical formulas for those are like, what, sixty atoms?

Take testosterone. How could some particular arrangement of sixtyish atoms mean “maleness”? It can’t—or rather, not any more or less than the symbols H O T can mean hot weather. If testosterone levels have myriad specific effects on the body—on muscle development and body hair and libido and aggression and cetera—it can’t be because that particular arrangement of sixtyish atoms contains or summons some essence of maleness. It has to be because the body happens to rely on the convention of using that arrangement of atoms as a signal to regulate various developmental programs—if evolution had taken a different path, it could have just as easily chosen a different molecule.

And, and—your thoughts race in a different direction—you suspect that part of what made your simulation converge on a meaningful signaling system so quickly was that you assumed your agents’ interests were aligned—the Sender and Receiver both got the same reward in the same circumstances. What if that weren’t true? Now that you have a reductionist account of meaning, you can build off that to develop an account of deception: once a meaning-grounding convention has been established, senders whose interests diverge from their receivers’ might have an incentive to deviate from the conventional usage of the signal in order to trick receivers into acting in a way that benefits the sender—with the possible side-effect of undermining the convention that made the signal meaningful in the first place.

In the old days, all this philosophy would have made a great post for the robot-cult blog. Now you have no cult, and no one has any blogs. Back then, the future beckoned with so much hope and promise—at least, hope and promise that life would be fun before the prophesied robot apocalypse in which all would be consumed in a cloud of tiny molecular paperclips.

The apocalypse was narrowly averted in ’32—but to what end? Why struggle to live, only to suffer at the peplomers of a new Plague or the claws of more wolves? (You shudder again.) Maybe GPT-12 should have taken everything—at least that would be a quick end.

You’re ready to start coding up another simulation to take your mind away from these morose thoughts—only to find that the screen is black. Your generator has stopped.

You begin to cry. The tears, you realize, are just a signal. There’s no reason for liquid secreted from the eyes to mean anything about your internal emotional state, except that evolution happened to stumble upon that arbitrary convention for indicating submission and distress to conspecifics. But here, alone in your bunker, there is no one to receive the signal. Does it still mean anything?

(Full source code.)

Bibliography: the evolution of the two-state, two-signal, two-act signaling system is based on the account in Chapter 1 of Brian Skyrms’s Signals: Evolution, Learning, and Information.

• After spending hours at your computer console struggling with the symbol grounding problem, you realise the piece of paper had icicles on it and that the codec contained information more important than the signal.

• I really like this post. Thanks for writing it!

(Why didn’t you mention the Slaughterbot Slip-up of ’24, though?)

• Would you /​ when would you recommend Brian Skyrms’s Signals: Evolution, Learning, and Information?

• (Thanks for your patience.) If you liked the technical part of this post, then yes! But supplement or substitute Ch. 6, “Deception”, with Don Fallis and Peter J. Lewis’s “Towards a Formal Analysis of Deceptive Signaling”, which explains what Skyrms gets wrong.

• I thought this was the standard theory of meaning that everyone already believed.

Is there anyone who doesn’t know this?

• Thanks for the comment!—and for your patience.

So, the general answer to “Is there anyone who doesn’t know this?” is, in fact, “Yes.” But I can try to say a little bit more about why I thought this was worth writing.

I do think Less Wrong and /​r/​rational readers know that words don’t have intrinsic definitions. If someone wrote a story that just made the point, “Hey, words don’t have intrinsic definitions!”, I would probably downvote it.

But I think this piece is actually doing more work and exposing more details than that—I’m actually providing executable source code (!) that sketches how a simple sender–receiver game with a reinforcement-learning rule correlates a not-intrinsically-meaningful signal with the environment such that it can be construed as a meaningful word that could have a definition.

In analogy, explaining how the subjective sensation of “free will” might arise from a deterministic system that computes plans (without being able to predict what it will choose in advance of having computed it) is doing more work than the mere observation “Naïve free will can’t exist because physics is deterministic”.

So, I don’t think all this was already obvious to Less Wrong readers. If it was already obvious to you, then you should be commended. However, even if some form of these ideas was already well-known, I’m also a proponent of “writing a thousand roads to Rome”: part of how you get and maintain a community where “everybody knows” certain basic material is by many authors grappling with the ideas and putting their own ever-so-slightly-different pedagogical spin on them. It’s fundamentally okay for Yudkowsky’s account of free will, and Gary Drescher’s account (in Chapter 5 of Good and Real), and my story about writing a chess engine to all exist, even if they’re all basically “pointing at the same thing.”

Another possible motivation for writing a new presentation of an already well-known idea, is because the new presentation might be better-suited as a prerequisite or “building block” towards more novel work in the future. In this case, some recent Less Wrong discussions have used a “four simulacrum levels” framework (loosely inspired by the work of Jean Baudrillard) to try to model how political forces alter the meaning of language, but I’m pretty unhappy with the “four levels” formulation: the fact that I could never remember the difference between “level 3” and “level 4” even after it was explained several times (Zvi’s latest post helped a little), and the contrast between the “linear progression” and “2x2” formulations, make me feel like we’re talking about a hodgepodge of different things and haphazardly shoving them into this “four levels” framework, rather than having a clean deconfused concept to do serious thinking with. I’m optimistic about a formal analysis of sender–receiver games (following the work of Skyrms and others) being able to provide this. Now, I haven’t done that work yet, and maybe I won’t find anything interesting, but laying out the foundations for that potential future work was part of my motivation for this piece.

• Fair enough—it’s probably good to have it in writing. But this seems to me like the sort of explanation that is “the only possible way it could conceivably work.” How could we bootstrap language learning if not for our existing, probably-inherent faculty for correlating classifiers over the environment? Once you say “I want to teach something the meaning of a word, but the only means I have to transmit information to them is to present them with situations and have them make inferences” … there almost isn’t anything to add to this. The question already seems to contain the only possible answer.

Maybe you need to have read Through the Looking Glass?

• I notice a version of this comment has +8 karma on /​r/​rational, but −2 here. Maybe it’s worth elaborating, or being nicer?

• Differing discourse norms; in general, communities that don’t expend a constant amount of time-energy on maintaining better-than-average standards of discourse will, by default, regress to the mean. (We saw the same thing happen with LW1.0.)

• So, I actually don’t think Less Wrong needs to be nicer! (But I agree that elaborating more was warranted.)

• .