# Jokes Thread

This is a thread for rationality-related or LW-related jokes and humor. Please post jokes (new or old) in the comments.

------------------------------------

Q: Why are Chromebooks good Bayesians?

A: Because they frequently update!

------------------------------------

A super-intelligent AI walks out of a box...

------------------------------------

Q: Why did the psychopathic utilitarian push a fat man in front of a trolley?

A: Just for fun.

How many rationalists does it take to change a lightbulb? Just one. They’ll take any excuse to change something.

How many effective altruists does it take to screw in a lightbulb? Actually, it’s far more efficient if you convince someone *else* to screw it in.

How many Giving What We Can members does it take to change a lightbulb? Fifteen have pledged to change it later, but we’ll have to wait until they finish grad school.

How many MIRI researchers does it take to screw in a lightbulb? The problem is that there are multiple ways to parse that, and while it might naively seem like the ambiguity is harmless, it would actually be disastrous if *any* number of MIRI researchers tried to screw inside of a lightbulb.

How many CFAR instructors does it take to change a lightbulb? By the time they’re done, the lightbulb should be able to change itself.

How many Leverage Research employees does it take to screw in a lightbulb? I don’t know, but we have a team working to figure that out.

How many GiveWell employees does it take to change a lightbulb? Not many. I don’t recall the exact number; there’s a writeup somewhere on their site, if you care to check.

How many cryonicists does it take to change a lightbulb? Two; one to change the lightbulb, and one to preserve the old one, just in case.

How many neoreactionaries does it take to screw in a lightbulb? We’d be better off returning to the dark.

How many neoreactionaries does it take to screw in a lightbulb? Mu. We should all be using oil lamps instead, as oil lamps have been around for thousands of years, lightbulbs only a hundred. Also, oil lamps won’t be affected by an EMP or solar flare. Reliable indoor lighting in general is a major factor in the increase of social degeneracy like nightclubs and premarital sex, and biological disorders like insomnia and depression. Lightbulbs are a cause and effect of social technology being outpaced by material conditions, and their place in society should be thoroughly reexamined, preferably via hundreds of blog posts and a few books. (Tangentially, blacks are five times more likely than whites to hate the smell of kerosene. How *interesting*.)

Alternatively, if you are already thoroughly pwned and/or gnoned, the answer is one, at a rate of $50 per lightbulb.

Edit: $45 if you or one of your friends has other electric work that could also be done. $40 if you are willing to take lessons later on how to fix your own damn house. $35 if you’re willing to move to Idaho. $30 if you give a good reason to only charge $30 a bulb.

The effective altruist comment just got me interested in effective altruism. I’ve seen the term thrown about, but I never bothered to look it up. Extrapolating from just the joke, I may be an effective altruist. Thanks for getting me interested in something I should have checked ages ago and for reminding me to look things up as I don’t know them instead of just assuming I got the “gist of the passage.”

Awesome. PM me if you want to talk more about effective altruism. (I’m currently staffing the EA Summit, so I may not reply swiftly.)

Yet another instance of comedy saving the world.

I was pessimistic that this thread would yield anything worthwhile, but am gratified to be proven wrong.

I literally burst out laughing at the MIRI one.

Congratulations. You win this thread.

Moral Philosopher: How would you characterize irrational behavior?

Economist: When someone acts counter to their preferences.

Moral Philosopher: Oh, that’s what we call virtue.

This seems a bit more like an Ayn Rand joke than a Less Wrong joke.

Three logicians walk into a bar. Bartender asks “Do you all want a drink?”. The first says “I don’t know”, the second says “I don’t know”, and the third says “yes”.

A brony, a fanfic writer and an AI researcher enters a bar, he orders a drink.

“Yields a joke when preceded by its quotation” yields a joke when preceded by its quotation.

“However, yields an even better joke (due to an extra meta level) when preceded by its quotation and a comma”, however, yields an even better joke (due to an extra meta level) when preceded by its quotation and a comma.

“Is an even better joke than the previous joke when preceded by its quotation” is actually much funnier when followed by something completely different.

Q: What’s quining?

A: “Is an example, when preceded by its quotation” is an example, when preceded by its quotation.

“Kind of misses the point of the joke” kind of misses the point of the joke.

“when quined, makes quite a joke” when quined, makes quite a joke

Reminds me of A Self-Referential Story.

If I want something, it’s Rational. If you want something, it’s a cognitive bias.

If they want something, the world is mad and people are crazy.

More succinctly: I am rational, you are biased, they are mind-killed.

None of these quite fit the “irregular verbs” pattern that Russell and others made famous; in those all three words should have overlapping denotations and merely greatly differ in connotations. Maybe “I use heuristics, you are biased, they are mind-killed”, but there the “to use”/”to be” distinction still ruins it.

I have heuristics, you have biases, they have killed-minds.

Three rationalists walk into a bar.

The first one walks up to the bar, and orders a beer.

The second one orders a cider.

The third one says “Obviously you’ve never heard of Aumann’s agreement theorem.”

An exponentially large number of Boltzmann Brains experience the illusion of walking into bars, and order a combination of every drink imaginable.

An attractive woman goes into a bar, and enters into a drinking contest with Nick Bostrom. After repeatedly passing out she wakes up the next day with a hangover and a winning lottery ticket.

Three neoreactionaries walk into a bar.

“Oh, how I hate these modern sluts,” says the first one, watching some girls in miniskirts on the dancefloor. “We should return to the 1950s, when people acted respectably.”

“Pfft, you call yourself reactionary?” replies the second. “I idolise 11th century Austria, where people acted respectably and there were no ethnic minorities.”

“Ahh, but I am even more reactionary than either of you,” boasts the third. “I long for classical Greece and Rome, the birthplace of western civilisation, where bisexuality was normal and people used to feast until they vomited!”

I don’t get the Bostrom one

I dunno whether it’s that funny, but it’s the sleeping beauty problem in anthropics, where you can alter subjective probabilities (e.g. of winning the lottery) by waking people up and giving them an amnesia-inducing drug. Only in this case, sleeping beauty is drunk.

Of course, explained like this it definitely isn’t funny

Where did the topic of neoreactionaries come up? (also your joke doesn’t use a form of the word ‘degeneracy,’ minus ten points.)

Q: Why did the AI researcher die?

A: They were giving a live AI demo and while handing out papers, said “Damn, there are never enough paperclips—I wish I would never run out”

Woody Allen

All syllogisms have three parts

Therefore, this is not a syllogism

I came up with that.

Cool! Before or after 1987?

Was the joke in that book? I’m pretty sure I’ve never read it, and I remember coming up with the joke.

Early 80s, I think. “All syllogisms” was one of my first mass-produced button slogans—the business was started in 1977, but I took some years to start mass producing slogans.

My printing records say that I did 3 print runs in 1988, but that plausibly means that I had been selling the button for a while because I don’t think I was doing 3 print runs at a time.

I thought I read the joke in Culture Made Stupid, but I can’t find it now and am probably mistaken.

An AI robot and a human are hunting. The human is bitten by a snake, and is no longer breathing. The AI quickly calls 911. It yells “My hunting partner was bitten by a poisonous snake and I think they’re dead!” The operator says “Calm down. First, make sure he’s dead.” A gunshot is heard. “Okay, now they’re definitely dead.”

Less of a joke than a pithy little truism, but I came up with it:

Notability is not ability.

Not an actual joke, but every time I reread Ayn Rand’s dictum “check your premises,” I can hear in the distance Eliezer Yudkowsky discreetly coughing and muttering, “check your priors.”

Both of those authors are known to use English in nonstandard ways for sake of an argument, so I’m actually now wondering if those two are as synonymous as they look.

Eliezer’s version obviously includes probabilities. I don’t know if Rand used any probabilistic premises, but on my very limited knowledge I would guess she didn’t.

Not as I recall, although I haven’t read Ayn Rand in something like fifteen years. Her schtick was more wild extrapolations of non-probabilistic logic.

Pretty much. I’ve actually gotten in a debate with a Randian on Facebook about what constitutes evidence. He doesn’t seem to like Bayes’ Theorem very much—he’s busy talking about how we shouldn’t refer to something as possible unless we have physical evidence of its possibility, because of epistemology.

That’s contrary to my experience of epistemology. It’s just a word, define it however you want, but in both epistemic logic and pragmatics-stripped conventional usage, *possibility* is nothing more than a lack of disproof.

“I lack all conviction,” he thought. “Guess I’m the best!”

I hate to be “that guy”, but could you explain this one? I’m not sure I get it. Is it making fun of LW’s “politics is the mindkiller”/”keep your identity small” mindset?

https://en.wikipedia.org/wiki/The_Second_Coming_%28poem%29

A cryonicist’s response to someone who has kept you waiting:

“That’s okay. I have forever.”

There was a young man from Peru Whose limericks stopped at line two.

There once was a man from Verdun.

And of course, there’s the unfortunate case of the man named Nero...

Someone was going to tell me a Rationality joke, but the memetic hazard drove them to insanity.

This isn’t exactly rationalist, but it correlates...

A man with Asperger’s walks into a pub. He walks up to the bar, and says “I don’t get it, what’s the joke?”

Is this being downvoted due to being perceived as offensive, or because it’s not funny? I certainly did not intend it to be offensive; in fact, I first saw it when reading a joke thread on an Asperger’s forum.

I haven’t downvoted it.

On the other hand, laughing at how people with Asperger’s sometimes aren’t socially skilled hasn’t much value.

It’s supposed to be funny due to metahumour. I certainly agree that simply laughing at people with poor social skills is neither witty nor productive.

Can’t speak for others, but anti-jokes are thoroughly explored already.

Since others might be interested, here is the link to the only article I could find on lesswrong about antijokes.

A Bayesian apparently is someone who after a single throw of a coin will believe that it is biased. Based on either outcome.

Also, why do ‘Bayes’, ‘base’ and ‘bias’ sound similar?

Heck, I had to stop and take a pen and paper to figure that out. Turns out, you were wrong. (I expected that, but I wasn’t sure how specifically.)

As a simple example, imagine that my prior belief is that 0.1 of coins always provide head, 0.1 of coins always provide tails, and 0.8 of coins are fair. So, my prior belief is that 0.2 of coins are biased.

I throw a coin and it’s… let’s say… head. What are the posterior probabilities? Multiplying the prior probabilities with the likelihood of this outcome, we get 0.1 × 1, 0.8 × 0.5, and 0.1 × 0. Multiplied and normalized, it is 0.2 for the heads-only coin, and 0.8 for the fair coin. -- My posterior belief remains 0.2 for biased coin, only in this case I know how specifically it is biased.
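For anyone who would rather not reach for pen and paper, here is a minimal Python sketch of the update described above. The function name `update` and the dictionary labels are made up for illustration; the numbers are the 0.1 / 0.8 / 0.1 prior from the example, kept as exact fractions.

```python
from fractions import Fraction as F

def update(prior, likelihood):
    """Return the normalized posterior for a dict of hypothesis -> prior probability."""
    unnormalized = {h: p * likelihood[h] for h, p in prior.items()}
    total = sum(unnormalized.values())
    return {h: u / total for h, u in unnormalized.items()}

# Prior over three kinds of coin, as in the example above.
prior = {"heads-only": F(1, 10), "fair": F(8, 10), "tails-only": F(1, 10)}
# Probability of observing heads under each hypothesis.
p_heads = {"heads-only": F(1), "fair": F(1, 2), "tails-only": F(0)}

posterior = update(prior, p_heads)
biased_before = 1 - prior["fair"]
biased_after = 1 - posterior["fair"]

print(posterior)
print(biased_before, biased_after)
```

The probability of "some kind of biased coin" is the same before and after the throw; only the split between the two biased hypotheses moves.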

The same will be true for any *symmetrical* prior belief. For example, if I believe that 0.000001 of coins always provide head, 0.000001 of coins always provide tails, 0.0001 of coins provide head in 80% of cases, 0.0001 of coins provide tails in 80% of cases, and the rest are fair coins… again, after one throw my posterior probability of “a biased coin” will remain exactly the same, only the proportions of specific biases will change.

On the other hand, if my prior belief is asymmetrical… let’s say I believe that 0.1 of coins always provide head, and 0.9 of coins are fair (and there are no always-tails coins)… then yes, a single throw that comes up head *will* increase my belief that the coin was biased. (Because the outcome of tails would have decreased it.)

(Technically, a Bayesian superintelligence would probably believe that all coins are asymmetrical. I mean, they have *different pictures* on their sides, that can influence the probabilities of the outcomes a little bit. But such a superintelligence would have believed that the coin was biased even *before* the first throw.)

Now there’s a way to get people interested in learning probability.

Not so fast.

Not quite. In your example 0.2 of coins are not *biased*, they are *predetermined* in that they always provide the same outcome no matter what.

Let’s try a bit different example: the prior is that 10% of coins are biased towards heads (their probabilities are 60% heads, 40% tails), 10% are biased towards tails (60% tails, 40% heads), and 80% are fair.

After one throw (let’s say it turned out to be heads) your posterior for the fair coin did not change, but your posterior for the heads-biased coin went up and for the tails-biased coin went down. Your expectation for the next throw is now skewed towards heads.

My expectation of “this coin is biased” did not change, but “my expectation of the next result of this coin” changed.

In other words, I changed my expectation that the next flip will be heads, but I didn’t change my expectation that from the next 1000 flips approximately 500 will be heads.

Connotationally: If I believe that biased coins are very rare, then my expectation that the next flip will be heads increases only a little. More precisely, if the ratio of biased coins is *p*, my expectation for the next flip increases at most by approximately *p*. The update based on one coin flip does not contradict common sense, it is as small as the biased coins are rare; and as large as they are frequent.

In this particular example, no, it did not. However if you switch to continuous probabilities (and think not in terms of binary is-biased/is-not-biased but rather in terms of the probability of the true mean not being 0.5 plus-minus epsilon) your estimate of the character of the coin will change.

Also

and

-- these two statements contradict each other.

Using my simplest example, because it’s simplest to calculate:

Prior:

0.8 fair coin, 0.1 heads-only coin, 0.1 tails-only coin

probability “next is head” = 0.5

probability “next 1000 flips are approximately 500:500” ~ 0.8

Posterior:

0.8 fair coin, 0.2 heads-only coin

probability “next is head” = 0.6 (increased)

probability “next 1000 flips are approximately 500:500” ~ 0.8 (didn’t change)
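These four numbers can be checked with a short Python sketch. Reading “approximately 500:500” as “between 450 and 550 heads” is an assumption made here for illustration, as are the helper names; the beliefs are the prior and posterior listed above.

```python
from fractions import Fraction as F
from math import comb

N = 1000
LOW, HIGH = 450, 550  # assumed meaning of "approximately 500:500"

def p_roughly_even(p):
    """Exact binomial probability of LOW..HIGH heads in N flips of a coin with P(heads) = p."""
    return sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(LOW, HIGH + 1))

# Each entry is (P(heads) for that kind of coin, probability of holding that kind of coin).
prior = [(F(1), F(1, 10)), (F(1, 2), F(8, 10)), (F(0), F(1, 10))]
posterior = [(F(1), F(2, 10)), (F(1, 2), F(8, 10))]

def p_next_heads(beliefs):
    return sum(w * p for p, w in beliefs)

def p_even_split(beliefs):
    return sum(w * p_roughly_even(p) for p, w in beliefs)

print(p_next_heads(prior), p_next_heads(posterior))
print(float(p_even_split(prior)), float(p_even_split(posterior)))
```

Under that reading, the chance of the next flip being heads moves from 1/2 to 3/5, while the chance of a roughly even split over 1000 flips stays just under 0.8, because only the fair coin can produce such a split.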

Um.

Probability of a head = 0.5 necessarily means that the expected number of heads in 1000 tosses is 500.

Probability of a head = 0.6 necessarily means that the expected number of heads in 1000 tosses is 600.

Are you playing with two different meanings of the word “expected” here?

If I roll a 6-sided die, the expected value is 3½.

But I don’t really *expect to see* 3½ as an outcome of the roll. I expect to see either 1, or 2, or 3, or 4, or 5, or 6. But certainly not 3½.

If my model says that 0.2 coins are heads-only and 0.8 coins are fair, in 1000 flips I *expect to see* either 1000 heads (probability 0.2) or about 500 heads (probability 0.8). But I don’t *expect to see* about 600 heads. Yet, the *expected value* of the number of heads in 1000 flips is 600.

No, I’m just using the word in the statistical-standard sense of “expected value”.

You can only multiply out P(next result is heads) * ( number of tosses) to get the expected number of heads if you believe those tosses are independent trials. The case of a biased coin toss explicitly violates this assumption.

But the tosses *are* independent trials, even for the biased coin. I think you mean the P(heads) is not 0.6, it’s either 0.5 or 1, you just don’t know which one it is.

Which means that P(heads on toss after next|heads on next toss) != P(heads on toss after next|tails on next toss). Independence of A and B means that P(A|B) = P(A).

As long as you’re using the same coin, P(heads on toss after next|heads on next toss) == P(heads on toss after next|tails on next toss).

You’re confusing the probability of coin toss outcome with your knowledge about it.

Consider a RNG which generates *independent* samples from a normal distribution centered on some value mu that is unknown to you. As you see more samples you get a better idea of what mu is and your expectations about what numbers you are going to see next change. But these samples do not become dependent just because your knowledge of mu changes.

Please actually do your math here.

We have a coin that is heads-only with probability 20%, and fair with probability 80%. We’ve already conducted exactly one flip of this coin, which came out heads (causing our update from the prior of 10/80/10 to 20/80/0), but no further flips yet.

For simplicity, event A will be “heads on next toss” (toss number 2), and B will be “heads on toss after next” (toss number 3).

P(A) = 0.2 × 1 + 0.8 × 0.5 = 0.6

P(B) = 0.2 × 1 + 0.8 × 0.5 = 0.6

P(A & B) = 0.2 × 1 × 1 + 0.8 × 0.5 × 0.5 = 0.4

Note that this is not the same as P(A) × P(B), which is 0.6 × 0.6 = 0.36.

The definition of independence is that A and B are independent iff P(A & B) = P(A) × P(B). These events are not independent.
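The arithmetic above can be reproduced with a few lines of Python. This is only a sketch under the stated 20/80 beliefs; the variable names are made up. It also computes P(B|A), which may help separate the two readings in this subthread: given *which* coin it is, the tosses are independent, but averaged over the uncertainty about the coin they are not.

```python
from fractions import Fraction as F

# Beliefs after the first heads:
# (P(heads) for that kind of coin, probability of holding that kind of coin).
beliefs = [(F(1), F(1, 5)), (F(1, 2), F(4, 5))]

p_a = sum(w * p for p, w in beliefs)            # heads on toss 2
p_b = sum(w * p for p, w in beliefs)            # heads on toss 3
p_a_and_b = sum(w * p * p for p, w in beliefs)  # heads on both tosses
p_b_given_a = p_a_and_b / p_a                   # heads on toss 3, given heads on toss 2

print(p_a, p_b, p_a_and_b, p_a * p_b, p_b_given_a)
```

Inside each term of the sum the two tosses multiply as independent events (`p * p`); the dependence only appears after mixing over the two kinds of coin.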

Turning the math crank without understanding what you are doing is worse than useless.

Our issue is about how to understand probability, not which numbers come out of the chute.

I don’t think so. None of the available potential coin-states would generate an expected value of 600 heads.

p = 0.6 → 600 expected heads is the many-trials (where each trial is 1000 flips) expected value given the prior and the result of the first flip, but this is different from the expectation of *this* trial, which is bimodally distributed at [1000]x0.2 and [central limit around 500]x0.8.

No. If the distribution is symmetrical, then the probability density at .5 will be unchanged after a single coin toss.

No they don’t. He was saying that his estimate of the probability that the coin is unbiased (or approximately unbiased) does not change, but that the probability that the coin is weighted towards heads increased at the expense of the probability that the coin is weighted towards tails (or vice-versa, depending on the outcome of the first toss), which is correct.

In the continuous-distribution world the probability density at exactly 0.5 is infinitesimally small. And the probability density at 0.5 plus-minus epsilon will change.

Yes, they do. We’re talking about expected values of coin tosses now, not about the probabilities of the coin being biased.

(army1987 already addressed density vs mass.) No, for any x, the probability density at 0.5+x goes up by the same amount that the probability density at 0.5-x goes down (assuming a symmetrical prior), so for any x, the probability mass in [0.5-x, 0.5+x] will remain exactly the same.

Ok, instead of 1000 flips, think about the next 2 flips. The probability that exactly 1 of them lands heads does not change. This does not contradict the claim that the probability of the next flip being heads increases, because the probability of the next two flips both being heads increases while the probability of the next two flips both being tails decreases by the same amount (assuming you just saw the coin land heads).

You don’t even need to explicitly use Bayes’s theorem and do the math to see this (though you can). It all follows from symmetry and conservation of expected evidence. By symmetry, the change in probability of some event which is symmetric with respect to heads/tails must change by the same amount whether the result of the first flip is heads or tails, and by conservation of expected evidence, those changes must add to 0. Therefore those changes are 0.
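For those who do want to see the math, here is a small Python sketch of the two-flip version. It uses the symmetric 10% / 80% / 10% prior with 60/40 biases from earlier in the thread; the helper names are invented for illustration.

```python
from fractions import Fraction as F

def update_on_heads(beliefs):
    """Bayes update of (P(heads), weight) pairs after observing one heads."""
    total = sum(w * p for p, w in beliefs)
    return [(p, w * p / total) for p, w in beliefs]

def two_flip_probs(beliefs):
    """Return (P(two heads), P(exactly one heads), P(two tails)) for the next two flips."""
    both_heads = sum(w * p * p for p, w in beliefs)
    both_tails = sum(w * (1 - p) * (1 - p) for p, w in beliefs)
    return both_heads, 1 - both_heads - both_tails, both_tails

# Symmetric prior: 10% heads-biased (60/40), 80% fair, 10% tails-biased (40/60).
prior = [(F(3, 5), F(1, 10)), (F(1, 2), F(8, 10)), (F(2, 5), F(1, 10))]
posterior = update_on_heads(prior)

before = two_flip_probs(prior)
after = two_flip_probs(posterior)
print(before)
print(after)
```

The middle entry (exactly one heads) is identical before and after; the gain in “two heads” is exactly the loss in “two tails”.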

I don’t think that is true. Imagine that your probability density is a normal distribution. You update in such a way that the mean changes, 0.5 is no longer the peak. This means that your probability density is no longer symmetrical around 0.5 (even if you started with a symmetrical prior) *and* the probability density line is not a 45 degree straight line, with the result that the density at 0.5+x changes by a different amount than at 0.5-x.

That is correct. Your probability distribution is no longer symmetrical after the first flip, which means that on the *second* flip, the symmetry argument I made above no longer holds, and you get information about whether the coin is biased or approximately fair. That doesn’t matter for the first flip though. Did you read the last paragraph in my previous comment? If so, was any part of it unclear?

That does not follow from anything you wrote before it (the 45 degree straight line part is particularly irrelevant).

Hm. Interesting how what looks like a trivially simple situation can become so confusing. Let me try to walk through my reasoning and see what’s going on...

We have a coin and we would like to know whether it’s fair. For convenience let’s define heads as 1 and tails as 0, one consequence of that is that we can think of the coin as a bitstring generator. What does it mean for a coin to be fair? It means that expected value of the coin’s bitstring is 0.5. That’s the same thing as saying that the mean of the sample bitstring converges to 0.5.

Can we know for certain that the coin is fair on the basis of examining its bitstring? No, we can not. Therefore we need to introduce the concept of *acceptable* certainty, that is, the threshold beyond which we think that the chance of the coin being fair is high enough (that’s the same concept as the p-value). In frequentist statistics we will just run an exact binomial test, but Bayes makes things a bit more complicated.

Luckily, Gelman in *Bayesian Data Analysis* looks exactly at this case (2nd ed., pp.33-34). Assuming a uniform prior on [0,1] the posterior distribution for theta (which in our case is the probability of the coin coming up heads or generating a 1) is

p( th | y ) is proportional to (th ^ y) * ((1 - th)^(n - y))

where *y* is the number of heads and *n* is the number of trials.

After the first flip y=1, n=1 and so p( th | 1) is proportional to ( th )

Aha, this is interesting. Our prior was uniform so the density was just a straight horizontal line. After the first toss the line is still straight but is now sloping up with the minimum at zero and the maximum at 1.

So the expected value of the mean of our bitstring used to be 0.5 but is now greater than 0.5. And that is why I argued that the very first toss changes your expectations: your expected bitstring mean (= expected probability of the coin coming up heads) is now *no longer 0.5* and so you don’t think that the coin is fair (because the fair coin’s expected mean is 0.5).

But that’s only one way of looking at it and now I see the error of my ways. After the first toss our probability density is still a straight line and it pivoted around the 0.5 point. This means that the probability mass in some neighborhood of [0.5-x, 0.5+x] did not change and so the probability of the coin being fair remains the same. The change in the expected value is because we think that if the coin is biased, it’s more likely to be biased towards heads than towards tails.

And yet this works because we started with a uniform prior, a straight density line. What if we start with a different, “curvier” prior? After the first toss the probability density should still pivot around the 0.5 point but because it’s not a straight line the probability mass in [0.5-x, 0.5+x] will not necessarily remain the same. Hmm… I don’t have time right now to play with it, but it requires some further thought.

Yes.

Provided the prior is symmetrical, the probability mass in [0.5-x, 0.5+x] will remain the same after the first toss by the argument I sketched above, even though the probability density will not be a straight line. On subsequent tosses, of course, that will no longer be true. If you have flipped more heads than tails, then your probability distribution will be skewed, so flipping heads again will decrease the probability of the coin being fair, while flipping tails will increase the probability of the coin being fair. If you have flipped the same (nonzero) number of heads as tails so far, then your probability distribution will be different than it was when you started, but it will still be symmetrical, so the next flip does not change the probability of the coin being fair.
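Since the “curvier” prior was left as an exercise above, here is a rough numerical sketch in Python. The prior shape th × (1 - th) (a Beta(2, 2) curve), the window x = 0.1, and all function names are assumptions chosen for illustration; the integration is a plain midpoint rule, not anything from Gelman.

```python
def mass_near_half(density, x, steps=20000):
    """Share of probability in [0.5 - x, 0.5 + x] for an unnormalized density on [0, 1]."""
    def integrate(lo, hi):
        width = (hi - lo) / steps
        return sum(density(lo + (i + 0.5) * width) for i in range(steps)) * width
    return integrate(0.5 - x, 0.5 + x) / integrate(0.0, 1.0)

def prior(th):
    return th * (1 - th)        # symmetric, curved prior (assumed Beta(2, 2) shape)

def after_heads(th):
    return th * prior(th)       # prior times the likelihood of one heads

def after_two_heads(th):
    return th * th * prior(th)  # prior times the likelihood of two heads

x = 0.1
m0 = mass_near_half(prior, x)
m1 = mass_near_half(after_heads, x)
m2 = mass_near_half(after_two_heads, x)
print(m0, m1, m2)
```

With these choices the mass near 0.5 is the same before and after the first heads, and drops after a second heads, which matches the argument above.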

That’s not what a probability *density* is. You’re thinking of a probability mass.

Yes, you are right.

I didn’t realize you were serious, given that this is a joke thread.

Here’s the easy way to solve this:

By conservation of expected evidence, if one outcome is evidence for the coin being biased, then the other outcome is evidence against it.

They might believe that it’s biased either way if they have a low prior probability of the coin being fair. For example, if they use a beta distribution for the prior, they only assign an infinitesimal probability to a fair coin. But since they’re not finding evidence that it’s biased, you can’t say the belief is based on the outcome of the toss.

I suppose there is a sense in which your statement is true. If I’m given a coin which is badly made, but in a way that I don’t understand, then the first toss is fair. I have no idea if it will land on heads or tails. Once I toss it, I have some idea of in which way it’s unfair, so the next toss is not fair.

That’s not usually what people mean when they talk about a fair coin, though.

“I wonder what is the probability of random sc2 player being into math and cognitive biases”

“It’s probably more one-sided than a Möbius strip”