I’m a 3rd year PhD student at Columbia. My academic interests lie in mechanism design and algorithms related to the acquisition of knowledge. I write a blog on stuff I’m interested in (such as math, philosophy, puzzles, statistics, and elections): https://ericneyman.wordpress.com/

# Eric Neyman

To elaborate on my feelings about the truck:

If it is meant as an attack on Paul, then it feels pretty bad/norm-violating to me. I don’t know what general principle I endorse that makes it not okay: maybe something like “don’t attack people in a really public and flashy way unless they’re super high-profile or hold an important public office”? If you’d like I can poke at the feeling more. Seems like some people in the Twitter thread (Alex Lawsen, Neel Nanda) share the feeling.

If I’m wrong and it’s not an attack, I still think they should have gotten Paul’s consent, and I think the fact that it might be interpreted as an attack (by people seeing the truck) is also relevant.

(Obviously, I think the events “this is at least partially an attack on Paul” and “at least one of the authors of this post is connected to Control AI” are positively correlated, since this post is an attack on Paul. My probabilities are roughly 85% and 97%*, respectively.)

*For a broad-ish definition of “connected to”

I don’t particularly see a reason to dox the people behind the truck, though I am not totally sure. My bar against doxxing is pretty high, though I do care about people being held accountable for large scale actions they take.

That’s fair. I think that it would be better for the world if Control AI were not anonymous, and I judge the group negatively for being anonymous. On the other hand, I don’t think I endorse them being doxxed. So perhaps my request to Connor and Gabriel is: please share what connection you have to Control AI, if any, and share what more information you have permission to share.

*(Conflict of interest note: I work at ARC, Paul Christiano’s org. Paul did not ask me to write this comment. I first heard about the truck (below) from him, though I later ran into it independently online.)*

There is an anonymous group of people called Control AI, whose goal is to convince people to be against responsible scaling policies because they insufficiently constrain AI labs’ actions. See their Twitter account and website (~~also anonymous~~ *Edit: now identifies Andrea Miotti of Conjecture as the director*). (I first ran into Control AI via this tweet, which uses color-distorting visual effects to portray Anthropic CEO Dario Amodei in an unflattering light, in a way that’s reminiscent of political attack ads.)

Control AI has rented a truck that had been circling London’s Parliament Square. The truck plays a video of “Dr. Paul Christiano (Made ChatGPT Possible; Government AI adviser)” saying that there’s a 10-20% chance of an AI takeover and an overall 50% chance of doom, and of Sam Altman saying that the “bad case” of AGI is “lights out for all of us”. The back of the truck says “Responsible Scaling: No checks, No limits, No control”. The video of Paul seems to me to be an attack on Paul (but see Twitter discussion here).

I currently strongly believe that the authors of this post are either in part responsible for Control AI, or at least have been working with or in contact with Control AI. That’s because of the focus on RSPs and because both Connor Leahy and Gabriel Alfour have retweeted Control AI (which has a relatively small following).

Connor/Gabriel: if you are connected with Control AI, I think it’s important to make this clear, for a few reasons. First, if you’re trying to drive policy change, people should know who you are, at minimum so they can engage with you. Second, I think this is *particularly* true if the policy campaign involves attacks on people who disagree with you. And third, because I think it’s useful context for understanding this post.

Could you clarify if you have any connection (even informal) with Control AI? If you are affiliated with them, could you describe how you’re affiliated and who else is involved?

EDIT: This Guardian article confirms that Connor is (among others) responsible for Control AI.

Social graces are not only about polite lies but about social decision procedures on maintaining game theoretic equilibria to maintain cooperation favoring payoff structures.

This sounds interesting. For the sake of concreteness, could you give a couple of central examples of this?

# How much do you believe your results?

There were 14 -- but they did so well that it’s unlikely to have been by chance: the p-value is 0.0002 (i.e. the probability of IQ >150 people having gotten such a large percentile conditioned on their true skill levels being distributed like the entire population is only 0.02%).

# [Crosspost] ACX 2022 Prediction Contest Results

# Solving for the optimal work-life balance with geometric rationality

Hi! I just wanted to mention that I *really* appreciate this sequence. I’ve been having lots of related thoughts, and it’s great to see a solid theoretical grounding for them. I find the notion that bargaining can happen across lots of different domains (different people or subagents, different states of the world, maybe different epistemic states) particularly useful. And this particular post presents the only argument for rejecting a VNM axiom I’ve ever found compelling. I think there’s a decent chance that this sequence will become really foundational to my thinking.

Note that this is just the arithmetic mean of the probability distributions, which is indeed what you want if you believe that P is right with probability 50% and Q is right with probability 50%, and I agree that this is what Scott does.

At the same time, I wonder—is there some sort of frame on the problem that makes logarithmic pooling sensible? Perhaps (inspired by the earlier post on Nash bargaining) something like a “bargain” between the two hypotheses, where a hypothesis’ “utility” for an outcome is the probability that the hypothesis assigns to it.

The aggregation method you suggest is called logarithmic pooling. Another way to phrase it is: take the geometric mean of the odds given by the probability distribution (or the arithmetic mean of the log-odds). There’s a natural way to associate every proper scoring rule (for eliciting probability distributions) with an aggregation method, and logarithmic pooling is the aggregation method that gets associated with the log scoring rule (which Scott wrote about in an earlier post). (Here’s a paper I wrote about this connection: https://arxiv.org/pdf/2102.07081.pdf)
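To make the contrast between the arithmetic mean of distributions and logarithmic pooling concrete, here is a minimal Python sketch. The function names, the example distributions `P` and `Q`, and the equal weights are illustrative assumptions, not something taken from the post or the linked paper.

```python
import math

def linear_pool(dists, weights):
    """Weighted arithmetic mean of the probabilities, outcome by outcome."""
    n = len(dists[0])
    return [sum(w * d[i] for d, w in zip(dists, weights)) for i in range(n)]

def log_pool(dists, weights):
    """Weighted geometric mean of the probabilities, renormalized to sum to 1.

    Assumes at least one outcome gets nonzero probability from every
    distribution; otherwise the normalizer would be zero.
    """
    n = len(dists[0])
    unnorm = [math.prod(d[i] ** w for d, w in zip(dists, weights)) for i in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def odds(p):
    """Odds in favor of an event that has probability p."""
    return p / (1 - p)

# Hypothetical example: two forecasts over a binary event.
P = [0.1, 0.9]
Q = [0.5, 0.5]
equal_weights = [0.5, 0.5]

linear_result = linear_pool([P, Q], equal_weights)
log_result = log_pool([P, Q], equal_weights)

# For a binary event, the odds of the log-pooled forecast equal the
# geometric mean of the individual odds.
geometric_mean_odds = math.sqrt(odds(P[0]) * odds(Q[0]))
```

In this example the linear pool puts the event at 0.3, while the logarithmic pool puts it at 0.25, whose odds (1:3) are the geometric mean of 1:9 and 1:1.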

I’m also excited to see where this sequence goes!

Thanks for the post! Quick question about your last equation: if each h is a distribution over a coarser partition of W (rather than W), then how are we drawing w from h for the inner geometric expectation?

How much should you shift things by? The geometric argmax will depend on the additive constant.
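A small numerical sketch of why the shift matters, with made-up lotteries chosen only for illustration: the option with the higher geometric expectation can change when the same constant is added to every outcome.

```python
import math

def geometric_expectation(outcomes, probs):
    """exp(E[log X]); requires every outcome to be strictly positive."""
    return math.exp(sum(p * math.log(x) for x, p in zip(outcomes, probs)))

# Hypothetical lotteries: A is a 50/50 gamble between 1 and 9, B is a sure 4.
LOTTERY_A = ([1.0, 9.0], [0.5, 0.5])
LOTTERY_B = ([4.0], [1.0])

def geometric_argmax(shift):
    """Return which lottery has the larger geometric expectation after
    adding `shift` to every outcome."""
    g_a = geometric_expectation([x + shift for x in LOTTERY_A[0]], LOTTERY_A[1])
    g_b = geometric_expectation([x + shift for x in LOTTERY_B[0]], LOTTERY_B[1])
    return "A" if g_a > g_b else "B"

choice_unshifted = geometric_argmax(0.0)    # compares sqrt(1 * 9) with 4
choice_shifted = geometric_argmax(100.0)    # compares sqrt(101 * 109) with 104
```

With no shift the sure thing wins (3 versus 4); after adding 100 to every outcome the gamble wins (about 104.9 versus 104), so the answer depends on the constant chosen.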

Thanks for the post! I’ve been having thoughts in this general direction and found this post helpful. I’m somewhat drawn to geometric rationality because it gives more intuitive answers in thought experiments involving low probabilities of extreme outcomes, such as Pascal’s mugging. I also agree with your claim that “humans are evolved to be naturally inclined towards geometric rationality over arithmetic rationality.”

On the other hand, it seems like geometric rationality only makes sense in the context of natural features that cannot take on negative values. Most of the things I might want to maximize (e.g. utility) can be negative. Do you have thoughts on the extent to which we can salvage geometric rationality from this problem?

I wonder if the effect is stronger for people who don’t have younger siblings. Maybe for people with younger siblings, part of the effect kicks in when they have a younger sibling (but they’re generally too young to notice this), so the effect of becoming a parent is smaller.

“My probability is 30%, and I’m 50% sure that the butterfly probability is between 20% and 40%” carries useful information, for example. It tells people how confident I am in my probability.

I often talk about the “true probability” of something (e.g. AGI by 2040). When asked what I mean, I generally say something like “the probability I would have if I had perfect knowledge and unlimited computation”, but that isn’t quite right, because if I had *truly* perfect knowledge and unlimited computation I would be able to resolve the probability to either 0 or 1. Perfect knowledge and computation *within reason*, I guess? But that’s kind of hand-wavy. What I’ve actually been meaning is the butterfly probability, and I’m glad this concept/post now exists for me to reference!

More generally I’d say it’s useful to make intuitive concepts more precise, even if it’s hard to actually use the definition, in the same way that I’m glad logical induction has been formalized despite being intractable. Also I’d say that this is an interesting concept, regardless of whether it’s useful :)

The Bayesian persuasion framework requires that the set of possible world states be defined in advance—and then the question becomes, given certain utility functions for the expert and decision-maker, what information about the world state should the expert commit to revealing?

I think that Bayesian persuasion might not be the right framework here, because we get to choose the AI’s reward function. Assume (as Bayesian persuasion does) that you’ve defined all possible world states.^{[1]} Do you want to get the AI to reveal *all* the information (i.e. which particular world state we’re in) rather than a convenient subset (that it has precommitted to)? That seems straightforward: just penalize it really heavily if it refuses to tell you the world state.

I think the much bigger challenge is getting the AI to tell you the world state *truthfully*, but note that this is outside the scope of Bayesian persuasion, which assumes that the expert is constrained to the truth (and is deciding which parts of the truth they should commit to revealing).

- ^{[1]} “World states” here need not mean the precise description of the world, atom by atom. If you only care about answering a particular question (“How much will Apple stock go up next week?”) then you could define the set of world states to correspond to relevant considerations (e.g. the ordered tuple of random variables (how many iPhones Apple sold last quarter, how much time people are spending on their Macs, …)). Even so, I expect defining the set of possible world states to be practically impossible in most cases.

# [Question] Three questions about mesa-optimizers

For personal reasons it made sense for me to calculate the percentage of Londoners who will have COVID this Thursday, the 16th. The number I got was much higher than I intuitively expected: 10%. Please point out any errors you see!

Among specimens collected in London 5 days ago, about 8000 were positive. This is relative to 4000 before the recent rise in cases, suggesting about 4000 are Omicron. Source

Omicron cases double every 2.5 days in the UK. Source

So among specimens collected Monday, we’d expect ~16k Omicron cases. Among specimens collected Thursday the 16th that should be ~35k.

As a ballpark, we might guess that about half of cases are caught, so that’s ~70k.

The typical time period between someone catching COVID and getting tested is 5 days. So the number of Londoners who will *catch* COVID on Thursday is ~280k, since they’ll typically get tested 5 days (two doublings) after that. That’s about 3% of the population of London.

Omicron grows by a factor of ~1.3 per day, so (3/1.3)% will catch COVID on Wednesday, and so on.

**The total percentage of Londoners who will *have* COVID on Thursday is thus ~10%** (summing the appropriate geometric series).
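The arithmetic above can be reproduced with a short Python sketch. Two inputs are assumptions that the comment does not state: a London population of about 9 million, and the number of days a person counts as "having" COVID (the `days_infected` parameter), which determines how many terms of the geometric series are summed. The sketch uses unrounded intermediate values, so its numbers differ slightly from the rounded ones in the text.

```python
DOUBLING_TIME_DAYS = 2.5
LONDON_POPULATION = 9_000_000  # assumption, not given in the comment

def grow(cases, days, doubling_time=DOUBLING_TIME_DAYS):
    """Exponential growth: cases double every `doubling_time` days."""
    return cases * 2 ** (days / doubling_time)

omicron_specimens_5_days_ago = 4_000
specimens_monday = grow(omicron_specimens_5_days_ago, 5)    # two doublings
specimens_thursday = grow(specimens_monday, 3)               # three more days

# If only half of cases are caught, true cases tested Thursday are double.
true_cases_tested_thursday = specimens_thursday / 0.5

# People who catch COVID on Thursday get tested about 5 days later.
new_infections_thursday = grow(true_cases_tested_thursday, 5)
share_catching_thursday = new_infections_thursday / LONDON_POPULATION

daily_growth = 2 ** (1 / DOUBLING_TIME_DAYS)  # roughly 1.3 per day

def share_infected_on_thursday(days_infected):
    """Sum the shares who caught COVID on Thursday, Wednesday, and so on,
    going back `days_infected` days (an assumed infection window)."""
    return sum(share_catching_thursday / daily_growth ** k
               for k in range(days_infected))

estimate_five_day_window = share_infected_on_thursday(5)
```

Under these assumptions a five-day window gives roughly 10%; a longer window pushes the estimate higher, since the full infinite series would sum to a bit over 13%.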

Thoughts?

(Note: I work with Paul at ARC theory. These views are my own and Paul did not ask me to write this comment.)

I think the following norm of civil discourse is super important: *do not accuse someone of acting in bad faith, unless you have really strong evidence.* An accusation of bad faith makes it basically impossible to proceed with discussion and seek truth together, because if you’re treating someone’s words as a calculated move in furtherance of their personal agenda, then you can’t take those words at face value.

I believe that this post violates this norm pretty egregiously. It begins by saying that hiding your beliefs “is lying”. I’m pretty confident that the sort of belief-hiding being discussed in the post is *not* something most people would label “lying” (see Ryan’s comment), and it *definitely* isn’t a central example of lying. (And so in effect it labels a particular behavior “lying” in an attempt to associate it with behaviors generally considered worse.)

The post then confidently asserts that Paul Christiano hides his beliefs in order to promote RSPs. This post presents very little evidence that this is what’s going on, and Paul’s account seems consistent with the facts (and I believe him).

So in effect, it accuses Paul and others of lying, cowardice, and bad faith on what I consider to be very little evidence.

Edited to add: What should the authors have done instead? I think they should have engaged in a public dialogue with one or more of the people they call out / believe to be acting dishonestly. The first line of the dialogue should maybe have been: “I believe you have been hiding your beliefs, for [reasons]. I think this is really bad, for [reasons]. I’d like to hear your perspective.”