Former physicist, current worry-about-AI-ist.
Previously at AI Impacts
Richard Korzekwa
Have you experienced the Bughouse Effect? What was it like?
I’ve been on both sides of this, but as soon as you started describing the dynamic, the first case that came to mind was when I was on the receiving end. It was a 2v2 Starcraft game with strangers. I don’t recall very much about the match, except that I responded to an attack with exactly the wrong kind of unit (Starcraft has a very rock-paper-scissors-ish structure, and I responded to rock with scissors, basically). I think plausibly this cost us the game, but I suspect we were doomed anyway. My teammate absolutely lost it. He even kept “yelling” at me via messages after the game.
He of course had a point, and I don’t blame him for being frustrated, but the main thing I remember thinking and saying to him was something like “I’m a random guy on the Internet and I’m bad at this game and I don’t understand how any of this is so surprising to you that it would make you this mad?” I didn’t really get the thing where he got super mad, when he had no reasonable expectation that his teammate would be competent, given the complete lack of matchmaking at the time. Maybe I really was an outlier of incompetence for him? He was having a bad day?
For me to feel any large amount of Bughouse Effect, I need to start out with the expectation that my allies are not bozos. If I assume from the beginning that they are (or are likely to be) bozos, then I have an entirely different orientation toward the thing we’re doing together, and it’s much harder for me to get mad about it. But, for a long time, this had the side effect that I was way more likely to get mad at people I knew and respected than at random strangers. By now I’ve mostly learned to assume that my allies will do dumb things, no matter how competent I thought they were. This mostly solves the problem, but it does leave me less excited in general about doing things that require competent allies.
When I was hiking the Pacific Crest Trail last year, I noticed that some gear stores, especially in towns with access to the Sierra, had little cans of oxygen for climbing (you can even buy them on Amazon! https://www.amazon.com/oxygen-hiking/s?k=oxygen+for+hiking). When I saw them, the part of my brain that generates sentiments like “FUCK IT LET’S SEE HOW HARD WE CAN REALLY GO” told me to buy one, but I decided I didn’t want to carry the can around after what would probably be a pretty brief experiment.
Every time someone on LW claims that honesty is incompatible with social grace, I have a strong desire to post a comment or reply consisting only of the string ‘skill issue’ (or, if I’m feeling verbose, ‘sounds like a skill issue’). This would be an honest report of what I think about the other person’s claim, but it would not be kind or helpful or likely to result in a productive discussion. So I don’t do it.
Sometimes when this happens, I ask myself if there’s an actually-good comment that conveys the same thing, and there are two things I notice:
I’m not skilled enough at writing to flesh out what I mean in a more productive way without significant time cost.
I feel noticeably less inclined than usual to engage in disagreement with people who make this claim. This is both because I’m not skilled enough at internet fights to come out of them without feeling like I got wrecked, and because, as you put it, “Some skills are hard to appreciate unless you have some prerequisite amount of the skill yourself”. So I anticipate an uphill battle.
Anyway, I think the actually-good response is this post, so thank you.
For what it’s worth, the sentiment I recall at the time among Americans was not that (almost) everyone everywhere thought it was terrible, just that the official diplomatic stance from (almost) every government was that it was terrible (and also that those governments had better say it’s terrible or at least get out of the way while the US responds). I think I remember being under the impression that almost everyone in Europe thought it was obviously bad. To be fair, I didn’t think much at the time about what, e.g., the typical person in China or Brazil or Nigeria thought about it. Also, that was a long time ago, so probably some revision in my memory.
One way of thinking about offsetting is using it to price in the negative effects of the thing you want to do. Personally, I find it confusing to navigate tradeoffs between dollars, animal welfare, uncertain health costs, cravings for foods I can’t eat, and fewer options when getting food. The convenient thing about offsets is I can reduce the decision to “Is the burger worth $x to me?”, where $x = price of burger + price of offset.
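To make that concrete, here’s a toy version with made-up numbers (the prices below are hypothetical, not real estimates of anything):

```python
# A toy version of the offset framing, with made-up prices (nothing here is a
# real estimate of what an offset actually costs).
burger_price = 8.00   # hypothetical menu price, in dollars
offset_price = 1.50   # hypothetical price of offsetting this burger's harms
all_in_price = burger_price + offset_price

# The multi-way tradeoff collapses into one question: is the burger worth this much to me?
worth_to_me = 10.00   # hypothetical willingness to pay
print(f"all-in price: ${all_in_price:.2f}; buy it: {worth_to_me >= all_in_price}")
```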
A common response to this is “Well, if you thought it was worth it to pay $y to eliminate t hours of cow suffering, then you should just do that anyway, regardless of whether you buy the burger”. I think that’s a good point, but I don’t feel like it helps me navigate the confusing-to-me tradeoff between like five different not-intuitively-commensurable considerations.
Not to mention that of all of the hunter gatherer tribes ever studied, there has never been a single vegetarian group discovered. Not. A. Single. One.
Of the ~200 studied, ~75% of them got over 50% of their calories from animals. Only 15% of them got over 50% of their calories from non-animal sources.
Do you have a source for this? I’m asking more out of curiosity than doubt, but in general, I think it would be cool to have more links for some of the claims. And thanks for all of the links that are already there!
It is sometimes good to avoid coming across as really weird or culturally out of touch, and ads can give you some signal on what’s normal and culturally relevant right now. If you’re picking up drinks for a 4th of July party, Bud Light will be very culturally on-brand, Corona would be fine, but a bit less on-brand, and mulled wine would be kinda weird. And I think you can pick this sort of thing up from advertising.
Also, it might be helpful to know roughly what group membership you or other people might be signalling by using a particular product. For example, I drive a Subaru. Subaru has, for a long time, marketed to (what appears to me to be) people who are a bit younger, vote Democrat, and spend time in the mountains. This is in contrast to, say, Ram trucks, which are marketed to (what looks to me like) people who vote Republican. If I’m in a context where people who don’t know me very well see my car, I am now aware that they might be biased toward thinking I vote Democrat or spend time outdoors. (FWIW, I did a low-effort search for which states have the strongest Subaru sales and it is indeed states with mountains and states with people who vote Democrat).
Recently I’ve been wondering what this dynamic does to the yes-men. If someone is strongly incentivized to agree with whatever nonsense their boss is excited about that week, and then goes on Twitter or national TV to repeat that nonsense, it can’t be good for their ability to see the world accurately.
Sometimes what makes a crime “harder to catch” is the risk of false positives. If you don’t consider someone to have “been caught” unless your confidence that they did the crime is very high, then, so long as you’re calibrated, your false positive rate is very low. But holding off on punishment in cases where you do not have very high confidence might mean that, for most instances where someone commits the crime, they are not punished.
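As a toy illustration of that tradeoff (all numbers made up, and the “confidence” here is just a noisy score rather than a calibrated probability, but it shows the shape of the thing): with a high punishment threshold, very few punishments land on the innocent, but most of the people who actually did it walk.

```python
import random

random.seed(0)

def confidence_that_guilty(guilty: bool) -> float:
    # Hypothetical noisy evidence: guilty people tend to produce higher scores,
    # but there's overlap with the innocent. Not a calibrated probability.
    return min(1.0, max(0.0, random.gauss(0.7 if guilty else 0.3, 0.2)))

cases = [random.random() < 0.5 for _ in range(100_000)]   # True = actually guilty
THRESHOLD = 0.95                                          # only punish above this

results = [(guilty, confidence_that_guilty(guilty) >= THRESHOLD) for guilty in cases]

punished = [guilty for guilty, p in results if p]
innocent_punished = sum(1 for g in punished if not g)
guilty_unpunished = sum(1 for guilty, p in results if guilty and not p)
guilty_total = sum(cases)

print(f"punishments that hit an innocent person: {innocent_punished / len(punished):.1%}")
print(f"guilty people who go unpunished:         {guilty_unpunished / guilty_total:.1%}")
```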
If you want someone to compress and communicate their views on the future, whether they anticipate everyone will be dead within a few decades because of AI seems like a pretty important thing to know. And it’s natural to find your way from that to asking for a probability. But I think that shortcut isn’t actually helpful, and it’s more productive to just ask something like “Do you anticipate that, because of AI, everyone will be dead within the next few decades?”. Someone can still give a probability if they want, but it’s more natural to give a less precise answer like “probably not” or a conditional answer like “I dunno, depends on whether <thing happens>” or to sidestep the framing, like “well, I don’t think we’re literally going to die, but...”.
He says, under the section titled “So what options do I have if I disagree with this decision?”:
But beyond [leaving LW, trying to get him fired, etc], there is no higher appeals process. At some point I will declare that the decision is made, and stands, and I don’t have time to argue it further, and this is where I stand on the decision this post is about.
Yeah, seems like it fails mainly on premise 1, though I think that depends on whether you accept the meaning of “could not have done otherwise” implied by premises 2 and 3. But if you accept a meaning that makes premise 1 true (or, at least, less obviously false), then the argument is no longer valid.
This seems closely related to an argument I vaguely remember from a philosophy class:
1. A person is not morally culpable of something if they could not have done otherwise
2. If determinism is true, there is only one thing a person could do
3. If there is only one thing a person could do, they could not have done otherwise
4. If determinism is true, whatever someone does, they are not morally culpable
Seems reasonable.
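For what it’s worth, the argument’s shape is valid; the interesting question is whether you buy premise 1 under the reading of “could not have done otherwise” that premises 2 and 3 supply. Here’s a rough formalization of that shape (my own sketch; the predicate names are placeholders, not anything from the original argument):

```lean
-- Rough sketch of the argument's shape (my own formalization; predicate names
-- are placeholders).
variable (Agent : Type) (Determinism : Prop)
variable (Culpable CouldDoOtherwise OnlyOneOption : Agent → Prop)

example
    (p1 : ∀ a, ¬ CouldDoOtherwise a → ¬ Culpable a)      -- premise 1
    (p2 : Determinism → ∀ a, OnlyOneOption a)            -- premise 2
    (p3 : ∀ a, OnlyOneOption a → ¬ CouldDoOtherwise a)   -- premise 3
    : Determinism → ∀ a, ¬ Culpable a :=                 -- conclusion (4)
  fun hD a => p1 a (p3 a (p2 hD a))
```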
Possibly I’m behind on the state of things, but I wouldn’t put too much trust in a model’s self-report on how things like routing work.
Of course many ways of making a room more fun are idiosyncratic to a particular theme, concept, or space.
I think fun is often idiosyncratic to particular people as well, and this is one reason why fun design is not more common, at least for spaces shared by lots of people. For me, at least, ‘fun’ spaces are higher variance than more conventional spaces. Many do indeed seem fun, but sometimes my response is “this is unusual and clearly made for someone who isn’t me”.
But maybe this is mostly a skill issue. The Epic campus looks consistently fun to me, for example.
AI Impacts looked into this question, and IMO “typically within 10 years, often within just a few years” is a reasonable characterization. https://wiki.aiimpacts.org/speed_of_ai_transition/range_of_human_performance/the_range_of_human_intelligence
I also have data for a few other technologies (not just AI) doing things that humans do, which I can dig up if anyone’s curious. They’re typically much slower to cross the range of human performance, but so was most progress prior to AI, so I dunno what you want to infer from that.
And like, this is why it’s normal epistemics to ignore the blurbs on the backs of books when evaluating their quality, no matter how prestigious the list of blurbers! Like that’s what I’ve always done, that’s what I imagine you’ve always done, and that’s what we’d of course be doing if this wasn’t a MIRI-published book.
If I see a book and I can’t figure out how seriously I should take it, I will look at the blurbs.
Good blurbs from serious, discerning, recognizable people are not on every book, even books from big publishers with strong sales. I realize this is N=2, so update (or not) accordingly, but the first book I could think of that I knew had good sales but isn’t actually good is The Population Bomb. I didn’t find blurbs for that (I didn’t look all that hard, though, and the book is pretty old, so maybe not a good check for today’s publishing ecosystem anyway). The second book that came to mind was The Body Keeps the Score. The blurbs for that seem to be from a couple of respectable-looking psychiatrists I’ve never heard of.
Another victory for trend extrapolation!
I initially wanted to nominate this because I somewhat regularly say things like “I think the problem with that line of thinking is that you’re not handling your model uncertainty in the right way, and I’m not good at explaining it, but Richard Ngo has a post that I think explains it well.” Instead of leaving it at that, I’ll try to give an outline of why I found it so helpful. I didn’t put much thought into how to organize this review, it’s centered very much around my particular difficulties, and I’m still confused about some of this, but hopefully it gets across some of what I got out of it.
This post helped me make sense of a cluster of frustrations I’ve had around my thinking and others’ thinking, especially in domains where things are complex and uncertain. The allure of cutting the world up into clear, distinct, and exhaustive possibilities is strong, but doing so doesn’t always lead to clearer thinking. To give a few examples where I’ve seen this lead people astray (choosing not particularly charitable or typical examples, for simplicity):
The origins of covid-19 are zoonotic or a lab leak
AI research will or will not be automated by 2027
AI progress after time t will be super-exponential or it won’t
All of these could be clearly one or the other. And when pressed, sometimes people will admit “Okay, yes, I suppose it could be <3rd thing> or <4th thing>”. But I get the sense that they’re not leaving much room for “5th thing which I failed to consider, and which is much more like one of these things than the others, but importantly different from all of them”.
Previously, I’d misread this as a habit of overconfidence. For example, people would be talking about whether A or B is true and my thinking would come out like this:
roughly 15%: A is clearly true
roughly 5%: B is clearly true
roughly 80%: some secret third thing, maybe pretty similar to A or B
These small credences mainly came from A and B looking to me like overly-specific possibilities. I would have a bunch of guesses about how the world works, which claims about it are true, etc. And this cashes out as a few specific models, outcomes, and their credences, along with a large pile of residual uncertainty. So when Alice assigns 75% to A, this seems weirdly overconfident to me.
Usually Alice has some sensible model in which A is either true or not true. For example, many reasonable versions of “AI progress can be adequately represented by <metric>, which will increase as <function> of <inputs>” will yield AI progress that is either definitely super-exponential or definitely not. And Alice might even have credences over variations on <metric>, <function>, and <inputs>, as well as a few different flavors of super-exponential functions. She uses this to assign some probability to super-exponential growth and the rest to exponential or sub-exponential growth. I would see this and think “great, this looks rigorous, and it seems useful to think about this in this way”, so we’d argue about it or I’d make my own model or whatever. And this is often productive. But in the end I would still think “okay, great, to the extent the world is like that model (or family of models), it tells us something useful”, while having limited confidence that the model matched reality. So it would still only tug lightly on my overall thinking, I would still think Alice was overconfident, and I would feel mildly disappointed I hadn’t been able to make a better model that I actually believed in.
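To gesture at the kind of model I mean (a made-up toy, not Alice’s actual model): pick a parametric family for the metric, put credences over the parameters, and a probability of “super-exponential” falls right out, conditional on the family being a decent description of reality.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.1, 10, 200)    # time

def sample_trajectory():
    # Hypothetical model family: metric(t) = exp(a * t**b), with credences over
    # a and b expressed as uniform distributions (all numbers made up).
    a = rng.uniform(0.1, 0.5)
    b = rng.uniform(0.7, 1.5)    # b > 1 means super-exponential growth
    return np.exp(a * t ** b)

def looks_superexponential(values):
    # Super-exponential iff the growth rate of log(metric) increases over time.
    rates = np.diff(np.log(values)) / np.diff(t)
    return rates[-1] > rates[0]

p_super = np.mean([looks_superexponential(sample_trajectory()) for _ in range(10_000)])
print(f"P(super-exponential | this model family) ≈ {p_super:.2f}")
```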
Part of why this is so disappointing is that it sure feels like I ought to be able to carve up possibilities in a way that allows me to use the rules of probability without having to assign a big lump of probability mass to “I dunno”. First, because figuring out which things could be true is the first step in Bayesian reasoning, plus there’s a sense in which Bayesian reasoning is obviously correct! Second, because I’ve seen smart people I respect cut possibility space up into neat, tidy pieces, apply probabilistic reasoning, and gain a clearer view of the world. And third, because Alice would ask me “Well, if it’s not A or B, what else could it be?” and I would say something like “I don’t know, possibly something like C” where I didn’t think C was particularly likely but I didn’t have any specific dominant alternatives. This kind of challenge can make it feel to me or look to Alice like I’m just reluctant to accept that either A or B will happen.
This post was helpful to me in part because it helped me notice that this dynamic is sometimes the result of others adhering strongly to proposition-based reasoning in contexts where it’s not appropriate, or me thinking mainly in terms of models, then trying to force this into a propositional framework. For example, I might think something like:
Then I would try to operationalize this within a Bayesian framework with propositions like:
Then I’d try to assign probabilities to these propositions, plus some amount to “I’m thinking about this wrong” or “Reality does some other thing”, and figure out how to update them on evidence.
I think it’s fine to attempt this, but in contexts where I almost always put a majority of my credence on “I dunno, some combination of these or some other thing that’s totally different”, I don’t think it’s all that fruitful, and it’s more useful to just do the modeling, report the results, think hard about how to figure out what’s true, and seek evidence where it can be found. As evidence comes in, I can hopefully see which parts of my models are right or wrong, or come up with better models. (This is, by the way, how everything went all the time in my past life as an experimental physicist.)
I don’t have takes on many of the good parts of the post, like what the best approach is to formalizing fuzzy truth values. But I do think that a shift toward more model-based reasoning is good in many domains, especially around topics like AI safety and forecasting, where people often arrive at (I claim) overconfident conclusions.